WO2024027583A1 - Image processing method and apparatus, electronic device and readable storage medium - Google Patents

Image processing method and apparatus, electronic device and readable storage medium

Info

Publication number
WO2024027583A1
Authority
WO
WIPO (PCT)
Prior art keywords
map
brightness
text content
difference
brightness map
Prior art date
Application number
PCT/CN2023/109785
Other languages
English (en)
Chinese (zh)
Inventor
朱泽基
Original Assignee
维沃移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 维沃移动通信有限公司
Publication of WO2024027583A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00: Image enhancement or restoration
    • G06T 5/90: Dynamic range modification of images or parts thereof
    • G06T 5/92: Dynamic range modification of images or parts thereof based on global image properties
    • G06T 7/00: Image analysis
    • G06T 7/0002: Inspection of images, e.g. flaw detection
    • G06T 7/90: Determination of colour characteristics
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10004: Still image; Photographic image
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30176: Document

Definitions

  • This application belongs to the field of artificial intelligence technology, and specifically relates to an image processing method and apparatus, an electronic device, and a readable storage medium.
  • A captured image may contain text content, such as product information. However, the text content in the image may not be clear enough due to factors such as the shooting environment and shooting angle; therefore, the user needs to process the image to make the text content in the image clearer.
  • The purpose of the embodiments of the present application is to provide an image processing method that can solve the prior-art problem that image quality is reduced because the properties of the image itself are changed when the clarity of the text content in the image is improved.
  • In a first aspect, embodiments of the present application provide an image processing method. The method includes: obtaining a first brightness map corresponding to a first image and a first color difference map corresponding to the first image; obtaining first feature information of the text content in the first brightness map, and adjusting the clarity of the text content in the first brightness map according to the first feature information to obtain an adjusted first brightness map; and generating a second image according to the first color difference map and the adjusted first brightness map.
  • In a second aspect, embodiments of the present application provide an image processing device, which includes: an acquisition module configured to acquire a first brightness map corresponding to a first image and a first color difference map corresponding to the first image; an adjustment module configured to obtain first feature information of the text content in the first brightness map and adjust the clarity of the text content in the first brightness map according to the first feature information to obtain an adjusted first brightness map; and a generation module configured to generate a second image based on the first color difference map and the adjusted first brightness map.
  • In a third aspect, embodiments of the present application provide an electronic device. The electronic device includes a processor and a memory, and the memory stores programs or instructions that can be run on the processor. When the programs or instructions are executed by the processor, the steps of the method described in the first aspect are implemented.
  • In a fourth aspect, embodiments of the present application provide a readable storage medium. Programs or instructions are stored on the readable storage medium, and when the programs or instructions are executed by a processor, the steps of the method described in the first aspect are implemented.
  • In a fifth aspect, embodiments of the present application provide a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement the method described in the first aspect.
  • In a sixth aspect, embodiments of the present application provide a computer program product. The program product is stored in a storage medium and is executed by at least one processor to implement the method described in the first aspect.
  • In a seventh aspect, embodiments of the present application provide an electronic device configured to perform the method described in the first aspect.
  • In the embodiments of the present application, the first brightness map and the first color difference map corresponding to the first image are obtained separately. The first feature information of the text content is extracted from the first brightness map, and the clarity of the text content is adjusted according to the first feature information so that the text content becomes clear. A second image is then generated based on the first color difference map and the adjusted first brightness map. Because the second image is generated from the original color difference of the first image, it retains the color of the original image. It can be seen that, based on the embodiments of the present application, only the brightness of the image is processed and the color difference of the image is not processed; the clarity of the text content in the image is ensured without changing the color of the image, thereby ensuring high image quality.
  • Figure 1 is a flow chart of an image processing method according to an embodiment of the present application.
  • Figures 2 to 4 are schematic diagrams of models according to embodiments of the present application.
  • Figures 5 to 8 are schematic diagrams illustrating the image processing method according to embodiments of the present application.
  • Figure 9 is a block diagram of an image processing device according to an embodiment of the present application.
  • Figure 10 is the first schematic diagram of the hardware structure of an electronic device according to an embodiment of the present application.
  • Figure 11 is the second schematic diagram of the hardware structure of an electronic device according to an embodiment of the present application.
  • The terms "first", "second", and the like in the description and claims of this application are used to distinguish similar objects and are not used to describe a specific order or sequence. It should be understood that terms so used are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in sequences other than those illustrated or described herein. Objects distinguished by "first" and "second" are usually of one type, and the number of objects is not limited; for example, the first object may be one object or multiple objects.
  • The term "and/or" in the description and claims means at least one of the connected objects, and the character "/" generally indicates that the related objects are in an "or" relationship.
  • The execution subject of the image processing method provided by the embodiments of the present application may be the image processing device provided by the embodiments of the present application, or an electronic device integrating the image processing device, where the image processing device may be implemented in hardware or software.
  • Figure 1 shows a flow chart of an image processing method according to an embodiment of the present application. The method is described by taking its application to an electronic device as an example, and includes the following steps:
  • Step 110: Obtain the first brightness map corresponding to the first image and the first color difference map corresponding to the first image.
  • For example, the electronic device captures an image of a target scene as the first image, where the target scene includes text content. For instance, text content is displayed on a large screen and the user photographs the large screen; or the content of a paper product manual is text, and the user photographs the manual. The text content in the first image may have clarity problems, such as blurred edges, color changes, and noise. In this step, based on the captured first image, corresponding image processing is performed to obtain the first brightness map and the first color difference map of the first image.
  • The first brightness map reflects the brightness information of each pixel of the first image, and the first color difference map reflects the color difference information of each pixel of the first image.
  • For example, the data format of the first image is the red (Red, R), green (Green, G), blue (Blue, B) format, where R, G, and B represent the R, G, and B color channel values of a pixel, respectively. A sketch of one way to derive the two maps from the RGB data is given below.
  • Step 120: Obtain the first feature information of the text content in the first brightness map, and adjust the clarity of the text content in the first brightness map based on the first feature information to obtain the adjusted first brightness map.
  • In this step, the first feature information of the text content is obtained based on the first brightness map. Optionally, the first feature information includes stroke information of the text content, outline information of the text content, overall information of the text content, and so on. Optionally, the first feature information is a feature map.
  • Based on the first feature information, the clarity of the text content is adjusted to improve the clarity of the text content.
  • Optionally, this embodiment adjusts the clarity of the text content toward the clarity the text content would have under ideal conditions, for example, the clarity of text content in images captured by professional equipment under ideal circumstances.
  • Step 130: Generate a second image based on the first color difference map and the adjusted first brightness map.
  • By combining the first color difference map with the adjusted first brightness map, a second image can be obtained. The second image has the same color difference information as the first image, that is, the same color; the difference is that the text content in the second image is clearer than the text content in the first image.
  • In the embodiments of the present application, the first brightness map and the first color difference map corresponding to the first image are obtained separately. The first feature information of the text content is extracted from the first brightness map, and the clarity of the text content is adjusted according to the first feature information so that the text content becomes clear. A second image is then generated based on the first color difference map and the adjusted first brightness map. Because the second image is generated from the original color difference of the first image, it retains the color of the original image. It can be seen that, based on the embodiments of the present application, only the brightness of the image is processed and the color difference of the image is not processed; the clarity of the text content in the image is ensured without changing the color of the image, thereby ensuring high image quality.
  • Optionally, step 120 includes:
  • Sub-step A1: Split the first brightness map into N1 sub-brightness maps, where N1 is a positive integer.
  • In this step, the first brightness map is cropped into N1 sub-brightness maps according to a preset order.
  • For example, the image size is expressed as image width (W) × image height (H); a typical captured image is 3000×4000, and each sub-brightness map is 512×512, a size that is well suited to capturing text content. The greater the number of cropped sub-brightness maps, the more refined the processing, but more computing resources are occupied and computing efficiency is limited.
  • Sub-step A2: Obtain the feature information of the text content in the N1 sub-brightness maps.
  • Sub-step A3: According to the feature information of the text content in the N1 sub-brightness maps, correspondingly adjust the clarity of the text content in the N1 sub-brightness maps.
  • For each sub-brightness map, the clarity of its text content is adjusted based on the extracted feature information, improving the clarity of the text content in every sub-brightness map.
  • Sub-step A4: Synthesize the adjusted first brightness map from the adjusted N1 sub-brightness maps.
  • In this step, the first brightness map is reassembled according to the preset order used for cropping.
  • In this embodiment, a large picture is split into multiple small pictures so that the clarity of the text content in each small picture can be improved. This adjustment scheme is more fine-grained, ensuring the clarity of the text content in the entire picture after adjustment, and it also greatly reduces the occupation of memory and computing resources. A tiling sketch is given below.
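  • A minimal sketch of sub-steps A1 to A4, assuming non-overlapping 512×512 tiles, edge padding at the borders, and a row-major preset order (none of these details are fixed by the application); enhance_tile stands in for the clarity-adjustment model and is hypothetical:

    import numpy as np

    TILE = 512  # sub-brightness map size suggested in the description

    def split_adjust_merge(y: np.ndarray, enhance_tile) -> np.ndarray:
        """Split brightness map y (HxW) into N1 tiles, adjust each tile,
        and reassemble them in the same preset (row-major) order."""
        h, w = y.shape
        ph, pw = (-h) % TILE, (-w) % TILE              # pad to a multiple of TILE
        padded = np.pad(y, ((0, ph), (0, pw)), mode="edge")
        out = np.empty_like(padded)
        for r in range(0, padded.shape[0], TILE):      # top to bottom
            for c in range(0, padded.shape[1], TILE):  # left to right
                out[r:r + TILE, c:c + TILE] = enhance_tile(padded[r:r + TILE, c:c + TILE])
        return out[:h, :w]                             # crop the padding away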
  • Optionally, step 120 includes:
  • Sub-step B1: Based on N2 pieces of size information, adjust the size of the first brightness map N2 times to obtain N2 first brightness maps, where N2 is a positive integer.
  • In this embodiment, the size of the first brightness map is adjusted so that feature information can be obtained from first brightness maps of different sizes; the feature information is then fused, and the fused feature information is used as the first feature information, ensuring that the obtained feature information is more comprehensive.
  • Taking a sub-brightness map as the processing object, a text feature extraction module extracts the feature information of the text content; the structure of the text feature extraction module is shown in Figure 2.
  • a "3x3" convolutional layer plus activation operation (indicated by the arrow in the figure)
  • This will not enlarge the receptive field too much and allow the network to focus on the strokes of the text.
  • the general text size is more than twelve pixels.
  • the "3x3" convolution operation can calculate the adjacent "3x3" pixel area during calculation.
  • the text can be better extracted through the four-layer 3x3 convolution operation in Figure 2.
  • Content characteristic information is used to represent the feature map corresponding to the input sub-luminance map
  • the feature map 202 is used to represent the feature map corresponding to the output feature information.
  • The input and output feature maps of the text feature extraction module have the same size, [W, H, C], where W and H represent the width and height of the feature map (the values of W and H are equal here), and C represents the number of channels, usually 8, 16, or 32.
  • For example, the values of W corresponding to the adjusted sizes are 512, 256, and 128; feature maps of these sizes can better capture the text content. The feature map can be scaled to different sizes through different types of convolution operations, yielding feature maps of different sizes. A sketch of the extraction module is given below.
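  • A minimal sketch of the text feature extraction module, assuming PyTorch and ReLU activations (the application shows four 3×3 convolutions with activations but names neither the activation function nor the framework):

    import torch
    import torch.nn as nn

    class TextFeatureExtractor(nn.Module):
        """Four 3x3 conv + activation layers; input and output both have
        size [W, H, C], keeping the receptive field small so the network
        can focus on text strokes."""
        def __init__(self, channels: int = 16):  # C is usually 8, 16, or 32
            super().__init__()
            layers = []
            for _ in range(4):
                layers += [nn.Conv2d(channels, channels, kernel_size=3, padding=1),
                           nn.ReLU(inplace=True)]  # activation choice is assumed
            self.body = nn.Sequential(*layers)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.body(x)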
  • Sub-step B2: Obtain the feature information of the text content in the N2 first brightness maps.
  • In this step, multiple corresponding text feature extraction modules can be used, each processing feature maps of a different size. For example, the feature map sizes processed by the three text feature extraction modules 302 in Figure 3 are [W, H, C], [W/2, H/2, C], and [W/4, H/4, C].
  • Sub-step B3: Fuse the feature information of the text content in the N2 first brightness maps to obtain the first feature information.
  • In this step, the multiple output feature maps are first upsampled to the original size; upsampling is performed through interpolation, which requires little computation and is efficient.
  • As shown in Figure 3, a feature concatenation (concat) operation is first performed on the two upsampled feature maps 303 (originally of sizes [256, 256, C] and [128, 128, C]) to obtain a feature map 304 of size [512, 512, 2C]; the feature map 304 is then fused with the output feature map of size [512, 512, C] through another concat operation, so that feature information of the text content at different scales is effectively preserved. The final output is a feature map 305 of size [512, 512, 3C].
  • In this embodiment, the feature information of the text content is extracted from feature maps of multiple sizes, which handles both the details of the text strokes and the overall text content well and is beneficial to adjusting the clarity of the text content. A fusion sketch is given below.
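  • A minimal sketch of sub-steps B1 to B3, reusing the TextFeatureExtractor sketch above and assuming bilinear interpolation for resizing (the description fixes the three scales and the concat fusion, but not the interpolation mode):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiSizeFusion(nn.Module):
        """Extract text features at scales 1, 1/2, and 1/4, upsample the
        smaller maps back via interpolation, and concatenate the three
        [512, 512, C] maps into one [512, 512, 3C] feature map."""
        def __init__(self, channels: int = 16):
            super().__init__()
            self.extractors = nn.ModuleList(TextFeatureExtractor(channels)
                                            for _ in range(3))

        def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: [B, C, 512, 512]
            size = x.shape[-2:]
            feats = []
            for i, extractor in enumerate(self.extractors):
                scaled = x if i == 0 else F.interpolate(
                    x, scale_factor=1 / 2 ** i, mode="bilinear", align_corners=False)
                f = extractor(scaled)
                if i > 0:  # upsample back to the original size
                    f = F.interpolate(f, size=size, mode="bilinear", align_corners=False)
                feats.append(f)
            return torch.cat(feats, dim=1)  # channel concat -> 3C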
  • Optionally, the multi-size feature map fusion steps provided in the previous embodiment can be repeated, connecting multiple fusion modules in series to increase the depth of the network and improve the expressive ability of the model.
  • In Figure 4, the sub-brightness map 401 corresponds to the state before the sharpness adjustment, and the sub-brightness map 402 corresponds to the state after the adjustment.
  • The feature map output by each multi-size feature fusion module has size [512, 512, 3C] and is then transformed to size [512, 512, C]; the length and width of the feature map remain 512, consistent with the size of the input feature map.
  • The output feature map contains both text details and overall text features. Because the network depth is increased, the final output feature map has a larger receptive field, and since its size is consistent with the input, it can also take global brightness, contrast, and other information into account; its size is [512, 512, C].
  • The concat operation is also used in this application to build a jump link between the input and the output, which helps the model converge quickly. The specific operation is to splice two feature maps of size [512, 512, C] directly along the third (channel) dimension into a thicker feature map of size [512, 512, 2C]. For this, the input image first passes through a convolutional layer so that its feature size becomes [512, 512, C].
  • In this embodiment, on the one hand, repeated feature extraction over multi-size feature maps extracts more detailed features of the text content; on the other hand, combining these with the feature information of the original input feature map makes the resulting feature information of the text content more comprehensive. It can be seen that this embodiment exploits the characteristics of text images to extract detailed text features and reconstruct text details, which is beneficial to improving the clarity of the text. A sketch of the stacked network is given below.
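  • A minimal sketch of stacking fusion modules in series with the input-to-output jump link, assuming a 1×1 convolution for the [512, 512, 3C] to [512, 512, C] transform and a single-channel brightness input (neither detail is specified in the application):

    class DocSRBackbone(nn.Module):
        """Stack multi-size fusion modules and keep a jump link to the input."""
        def __init__(self, channels: int = 16, depth: int = 3):
            super().__init__()
            self.head = nn.Conv2d(1, channels, 3, padding=1)   # brightness -> [512,512,C]
            self.blocks = nn.ModuleList(MultiSizeFusion(channels) for _ in range(depth))
            self.reduces = nn.ModuleList(nn.Conv2d(3 * channels, channels, 1)
                                         for _ in range(depth))  # 3C -> C (assumed)
            self.tail = nn.Conv2d(2 * channels, 1, 3, padding=1)

        def forward(self, y: torch.Tensor) -> torch.Tensor:   # y: [B, 1, 512, 512]
            x0 = self.head(y)
            x = x0
            for block, reduce in zip(self.blocks, self.reduces):
                x = reduce(block(x))                           # [512,512,3C] -> [512,512,C]
            x = torch.cat([x, x0], dim=1)                      # jump link: [512,512,2C]
            return self.tail(x)                                # adjusted brightness map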
  • Optionally, step 120 includes:
  • Sub-step C1: Obtain the second feature information of the text content in the first brightness map.
  • Sub-step C2: According to the second feature information, adjust the clarity of the text content in the first brightness map to obtain a second brightness map.
  • Optionally, this embodiment may use a Document Super Definition network (DocSR) for image processing to output the second image, and a model corresponding to this network can be trained.
  • During training, the model outputs a prediction map p1, which is obtained by adjusting the clarity of the text content based on the second feature information; the prediction map p1 corresponds to a sub-brightness map.
  • Sub-step C3: Obtain a third brightness map corresponding to the first brightness map, where the first brightness map corresponds pixel by pixel to the third brightness map, and the clarity of the text content in the third brightness map meets the first preset condition.
  • In this step, the same scene is shot to obtain the corresponding third brightness map from the captured image. For example, professional equipment with better optical properties, such as an SLR camera, can be used; such equipment is generally larger and has better optical imaging performance, so the clarity of the text content can reach the first preset condition. Optionally, the first preset condition is the clarity of the text content in an image captured under ideal shooting conditions.
  • Sub-step C4: Obtain the difference information between the second brightness map and the third brightness map.
  • Optionally, the difference values over all areas of the brightness maps are used as the difference information in this step.
  • Sub-step C5: When the difference information satisfies the second preset condition, determine the acquired second feature information as the first feature information.
  • Optionally, the second preset condition is that the difference information reaches a minimum value.
  • Each sub-brightness map of a brightness map is used as a training sample to complete the model training, and the number of small images can be increased to ensure the diversity of samples. After training, the second feature information output by the model is accurate enough to be used as the first feature information, which is then used in actual image processing. In other words, the model is trained so that, in subsequent usage scenarios, the clarity of the text content in the adjusted first brightness map output by the model is close to the clarity of the text content in the ideal image.
  • Optionally, sub-step C4 includes:
  • Sub-step D1: Obtain the first absolute values of the differences between corresponding pixel values in the second brightness map and the third brightness map, obtain the corresponding first mean value from the first absolute values, and use the first mean value as the first difference.
  • For example, L1 = sum(|p1 − y2|) / cnt, where L1 represents the first difference between p1 and y2, p1 and y2 represent the second brightness map (prediction) and the third brightness map (ideal image) respectively, and cnt represents the number of pixels of p1 or y2.
  • The first difference acts on the entire image p1, removing noise across the whole image and improving overall detail. A sketch is given below.
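  • A minimal sketch of sub-step D1, assuming PyTorch tensors p1 (predicted second brightness map) and y2 (ideal third brightness map) of the same shape:

    import torch

    def first_difference(p1: torch.Tensor, y2: torch.Tensor) -> torch.Tensor:
        """L1: mean absolute pixel difference over the whole brightness map
        (the sum of |p1 - y2| divided by the pixel count cnt)."""
        return (p1 - y2).abs().mean()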
  • Sub-step D2: Obtain the first stroke gradient map corresponding to the second brightness map and the second stroke gradient map corresponding to the third brightness map; obtain the second absolute values of the differences between corresponding pixel values in the first and second stroke gradient maps, obtain the corresponding second mean value from the second absolute values, and use the second mean value as the second difference.
  • To compute a stroke gradient map, values are padded in the four directions (up, down, left, right) of the brightness map, and crops of the padded map are taken with corresponding offsets; the left- and right-shifted crops are subtracted from each other, the up- and down-shifted crops are subtracted from each other, and the two results are superimposed to obtain the stroke gradient map.
  • For example, L2 = sum(|grad1 − grad2|) / cnt, where L2 represents the second difference between p1 and y2, grad1 and grad2 represent the stroke gradient maps of p1 and y2 respectively, and cnt represents the number of pixels of p1 or y2.
  • Because the background area of a document image is relatively flat while the text area has a relatively obvious gradient, shift subtraction finds the edge areas of the text, and the gradient difference between the predicted image and the ideal image is then calculated. A sketch is given below.
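  • A minimal sketch of the stroke gradient map and sub-step D2, assuming one-pixel shifts, replicate padding, and that "superimposing" means summing the absolute horizontal and vertical differences (the application fixes none of these details):

    import torch.nn.functional as F

    def stroke_gradient(y: torch.Tensor) -> torch.Tensor:
        """Shift-subtract horizontally and vertically, then superimpose;
        y has shape [B, 1, H, W]."""
        p = F.pad(y, (1, 1, 1, 1), mode="replicate")   # pad left/right/up/down
        dx = p[..., 1:-1, 2:] - p[..., 1:-1, :-2]      # right crop - left crop
        dy = p[..., 2:, 1:-1] - p[..., :-2, 1:-1]      # down crop - up crop
        return dx.abs() + dy.abs()

    def second_difference(p1: torch.Tensor, y2: torch.Tensor) -> torch.Tensor:
        """L2: mean absolute difference between the two stroke gradient maps."""
        return (stroke_gradient(p1) - stroke_gradient(y2)).abs().mean()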
  • Sub-step D3: According to the second stroke gradient map, expand the first area corresponding to the text content into a second area; identify the third area corresponding to the second area in the second brightness map; obtain the third absolute values of the differences between corresponding pixel values in the second area and the third area; obtain the corresponding third mean value from the third absolute values; and use the third mean value as the third difference.
  • Based on the second stroke gradient map, the first area where the text content is located can be found, and the first area is then expanded to obtain the second area.
  • For example, L3 = sum over the second area of |p1 − y2| / mask_cnt, where L3 represents the third difference between p1 and y2, and mask_cnt represents the number of pixels in the second area of y2.
  • In this way, the difference between the predicted image and the ideal image is calculated over the text area, constraining the convergence of the entire model to focus on the text area, which in turn helps improve the clarity of the text. A sketch is given below.
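  • A minimal sketch of sub-step D3, assuming the text area is obtained by thresholding the ideal image's stroke gradient map and that the expansion is a max-pooling dilation (the application specifies neither the threshold nor the expansion method; thresh and ksize are hypothetical parameters):

    def third_difference(p1: torch.Tensor, y2: torch.Tensor,
                         thresh: float = 0.05, ksize: int = 5) -> torch.Tensor:
        """L3: mean absolute difference restricted to the dilated text area."""
        grad2 = stroke_gradient(y2)                    # second stroke gradient map
        first_area = (grad2 > thresh).float()          # area where strokes lie
        # expand the first area into the second area (dilation via max pooling)
        mask = F.max_pool2d(first_area, ksize, stride=1, padding=ksize // 2)
        mask_cnt = mask.sum().clamp(min=1.0)           # pixels in the second area
        return ((p1 - y2).abs() * mask).sum() / mask_cnt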
  • Sub-step D4: Perform weighting processing on the first difference, the second difference, and the third difference to obtain the difference information.
  • For example, sum_loss = a·L1 + b·L2 + c·L3, where sum_loss represents the difference information between p1 and y2, and a, b, and c are the weights of the respective differences; the values of a, b, and c are chosen so that L1, L2, and L3 contribute on the same order of magnitude.
  • During training, sum_loss is backpropagated and recalculated in a loop, so that the model is trained in the direction that minimizes sum_loss; after one hundred epochs of iteration, the output p1 and the ideal y2 are quite similar.
  • The trained model can effectively enhance the details of text strokes to improve the clarity of the text, thereby improving image quality. A training-loop sketch is given below.
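  • A minimal sketch of sub-step D4 and the training loop, reusing the sketches above and assuming an Adam optimizer, unit weights, and a hypothetical data loader yielding (captured, ideal) sub-brightness map pairs (none of these specifics are stated in the application):

    def difference_info(p1, y2, a=1.0, b=1.0, c=1.0):
        """sum_loss = a*L1 + b*L2 + c*L3; the weights should keep the three
        terms on the same order of magnitude (values here are placeholders)."""
        return (a * first_difference(p1, y2)
                + b * second_difference(p1, y2)
                + c * third_difference(p1, y2))

    model = DocSRBackbone()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    for epoch in range(100):                  # "one hundred epoch iterations"
        for y1, y2 in loader:                 # y1: captured sub-brightness map,
            p1 = model(y1)                    # y2: ideal (e.g., SLR) counterpart
            loss = difference_info(p1, y2)
            optimizer.zero_grad()
            loss.backward()                   # backpropagate sum_loss
            optimizer.step()                  # move toward minimum sum_loss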
  • In summary, this application can improve the clarity of captured text content in text-capturing scenes without affecting the color information of the entire image. This application comprehensively considers the following image characteristics: the strokes of text are independent of each other; the blur of text is usually reflected in the unclear strokes of an individual character; an individual character occupies a small proportion of the whole picture; and non-text areas tend to be relatively flat and uniform. A suitable network model is designed accordingly to process the captured images.
  • In the image processing method provided by the embodiments of the present application, the execution subject may be an image processing device. In the embodiments of the present application, an image processing device executing the image processing method is taken as an example to describe the image processing device provided by the embodiments of the present application.
  • Figure 9 shows a block diagram of an image processing device according to another embodiment of the present application.
  • the device includes:
  • The acquisition module 10 is used to acquire the first brightness map corresponding to the first image and the first color difference map corresponding to the first image;
  • The adjustment module 20 is used to obtain the first feature information of the text content in the first brightness map, and adjust the clarity of the text content in the first brightness map according to the first feature information to obtain the adjusted first brightness map;
  • The generation module 30 is configured to generate a second image based on the first color difference map and the adjusted first brightness map.
  • In the embodiments of the present application, the first brightness map and the first color difference map corresponding to the first image are obtained separately. The first feature information of the text content is extracted from the first brightness map, and the clarity of the text content is adjusted according to the first feature information so that the text content becomes clear. A second image is then generated based on the first color difference map and the adjusted first brightness map. Because the second image is generated from the original color difference of the first image, it retains the color of the original image. It can be seen that, based on the embodiments of the present application, only the brightness of the image is processed and the color difference of the image is not processed; the clarity of the text content in the image is ensured without changing the color of the image, thereby ensuring high image quality.
  • Optionally, the adjustment module 20 includes:
  • The splitting unit, used to split the first brightness map into N1 sub-brightness maps, where N1 is a positive integer;
  • The first acquisition unit, used to acquire the feature information of the text content in the N1 sub-brightness maps;
  • The first adjustment unit, used to correspondingly adjust the clarity of the text content in the N1 sub-brightness maps based on the feature information of the text content in the N1 sub-brightness maps;
  • The synthesis unit, used to synthesize the adjusted first brightness map from the adjusted N1 sub-brightness maps.
  • Optionally, the adjustment module 20 includes:
  • The second adjustment unit, used to adjust the size of the first brightness map N2 times based on N2 pieces of size information to obtain N2 first brightness maps, where N2 is a positive integer;
  • The second acquisition unit, used to acquire the feature information of the text content in the N2 first brightness maps;
  • The fusion unit, used to fuse the feature information of the text content in the N2 first brightness maps to obtain the first feature information.
  • Optionally, the adjustment module 20 includes:
  • The third acquisition unit, used to acquire the second feature information of the text content in the first brightness map;
  • The third adjustment unit, used to adjust the clarity of the text content in the first brightness map according to the second feature information to obtain the second brightness map;
  • The fourth acquisition unit, used to acquire a third brightness map corresponding to the first brightness map, where the first brightness map corresponds pixel by pixel to the third brightness map and the clarity of the text content in the third brightness map meets the first preset condition;
  • The fifth acquisition unit, used to acquire the difference information between the second brightness map and the third brightness map;
  • The determination unit, configured to determine the acquired second feature information as the first feature information when the difference information satisfies the second preset condition.
  • Optionally, the fifth acquisition unit includes:
  • The first acquisition subunit, used to obtain the first absolute values of the differences between corresponding pixel values in the second brightness map and the third brightness map, obtain the corresponding first mean value from the first absolute values, and use the first mean value as the first difference;
  • The second acquisition subunit, used to obtain the first stroke gradient map corresponding to the second brightness map and the second stroke gradient map corresponding to the third brightness map, obtain the second absolute values of the differences between corresponding pixel values in the two stroke gradient maps, obtain the corresponding second mean value from the second absolute values, and use the second mean value as the second difference;
  • The third acquisition subunit, used to expand the first area corresponding to the text content into a second area according to the second stroke gradient map, identify the third area corresponding to the second area in the second brightness map, obtain the third absolute values of the differences between corresponding pixel values in the second area and the third area, obtain the corresponding third mean value from the third absolute values, and use the third mean value as the third difference;
  • The weighting subunit, used to weight the first difference, the second difference, and the third difference to obtain the difference information.
  • The image processing device in the embodiments of the present application may be an electronic device or a component in an electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal, or a device other than a terminal.
  • For example, the electronic device can be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted electronic device, a mobile Internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA), and can also be a server, network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, or a self-service machine; the embodiments of this application are not specifically limited.
  • The image processing device in the embodiments of the present application may be a device with an operating system. The operating system can be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of this application.
  • The image processing device provided by the embodiments of the present application can implement each process implemented by the above method embodiments; to avoid duplication, details are not described again here.
  • Optionally, as shown in Figure 10, this embodiment of the present application also provides an electronic device 100, including a processor 101, a memory 102, and programs or instructions stored in the memory 102 and executable on the processor 101. When the programs or instructions are executed by the processor 101, each step of any of the above image processing method embodiments is implemented with the same technical effects; to avoid duplication, details are not described here.
  • The electronic devices in the embodiments of the present application include the above-mentioned mobile electronic devices and non-mobile electronic devices.
  • Figure 11 is a schematic diagram of the hardware structure of an electronic device that implements an embodiment of the present application.
  • The electronic device 1000 includes but is not limited to: a radio frequency unit 1001, a network module 1002, an audio output unit 1003, an input unit 1004, a sensor 1005, a display unit 1006, a user input unit 1007, an interface unit 1008, a memory 1009, a processor 1010, and other components.
  • Those skilled in the art can understand that the electronic device 1000 may also include a power supply (such as a battery) that supplies power to the various components; the power supply may be logically connected to the processor 1010 through a power management system, so that charging, discharging, power consumption management, and other functions are managed through the power management system.
  • The structure of the electronic device shown in Figure 11 does not constitute a limitation on the electronic device; the electronic device may include more or fewer components than shown in the figure, combine certain components, or arrange components differently, which will not be described again here.
  • The processor 1010 is used to: obtain the first brightness map corresponding to the first image and the first color difference map corresponding to the first image; obtain the first feature information of the text content in the first brightness map, and adjust the clarity of the text content in the first brightness map according to the first feature information to obtain the adjusted first brightness map; and generate a second image according to the first color difference map and the adjusted first brightness map.
  • In the embodiments of the present application, the first brightness map and the first color difference map corresponding to the first image are obtained separately. The first feature information of the text content is extracted from the first brightness map, and the clarity of the text content is adjusted according to the first feature information so that the text content becomes clear. A second image is then generated based on the first color difference map and the adjusted first brightness map. Because the second image is generated from the original color difference of the first image, it retains the color of the original image. It can be seen that, based on the embodiments of the present application, only the brightness of the image is processed and the color difference of the image is not processed; the clarity of the text content in the image is ensured without changing the color of the image, thereby ensuring high image quality.
  • Optionally, the processor 1010 is also configured to: split the first brightness map into N1 sub-brightness maps, where N1 is a positive integer; obtain the feature information of the text content in the N1 sub-brightness maps; correspondingly adjust the clarity of the text content in the N1 sub-brightness maps according to that feature information; and synthesize the adjusted first brightness map from the adjusted N1 sub-brightness maps.
  • Optionally, the processor 1010 is also configured to: adjust the size of the first brightness map N2 times based on N2 pieces of size information to obtain N2 first brightness maps, where N2 is a positive integer; obtain the feature information of the text content in the N2 first brightness maps; and fuse the feature information of the text content in the N2 first brightness maps to obtain the first feature information.
  • Optionally, the processor 1010 is also configured to: obtain second feature information of the text content in the first brightness map; adjust the clarity of the text content in the first brightness map according to the second feature information to obtain a second brightness map; obtain a third brightness map corresponding to the first brightness map, where the first brightness map corresponds pixel by pixel to the third brightness map and the clarity of the text content in the third brightness map satisfies the first preset condition; obtain the difference information between the second brightness map and the third brightness map; and when the difference information satisfies the second preset condition, determine the acquired second feature information as the first feature information.
  • Optionally, the processor 1010 is further configured to: obtain the first absolute values of the differences between corresponding pixel values in the second brightness map and the third brightness map, obtain the corresponding first mean value from the first absolute values, and use the first mean value as the first difference; obtain the first stroke gradient map corresponding to the second brightness map and the second stroke gradient map corresponding to the third brightness map, obtain the second absolute values of the differences between corresponding pixel values in the two stroke gradient maps, obtain the corresponding second mean value from the second absolute values, and use the second mean value as the second difference; according to the second stroke gradient map, expand the first area corresponding to the text content into a second area, identify the third area corresponding to the second area in the second brightness map, obtain the third absolute values of the differences between corresponding pixel values in the second area and the third area, obtain the corresponding third mean value from the third absolute values, and use the third mean value as the third difference; and weight the first difference, the second difference, and the third difference to obtain the difference information.
  • In summary, this application can improve the clarity of captured text content in text-capturing scenes without affecting the color information of the entire image. This application comprehensively considers the following image characteristics: the strokes of text are independent of each other; the blur of text is usually reflected in the unclear strokes of an individual character; an individual character occupies a small proportion of the whole picture; and non-text areas tend to be relatively flat and uniform. A suitable network model is designed accordingly to process the captured images.
  • It should be understood that the input unit 1004 may include a graphics processing unit (GPU) 10041 and a microphone 10042; the graphics processor 10041 processes the image data of still pictures or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode.
  • The display unit 1006 may include a display panel 10061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like.
  • The user input unit 1007 includes at least one of a touch panel 10071 and other input devices 10072. The touch panel 10071, also known as a touch screen, may include two parts: a touch detection device and a touch controller. Other input devices 10072 may include but are not limited to physical keyboards, function keys (such as volume control keys and switch keys), trackballs, mice, and joysticks, which will not be described again here.
  • The memory 1009 may be used to store software programs as well as various data, including but not limited to application programs and operating systems.
  • Optionally, the memory 1009 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, wherein the first storage area may store the operating system and the application programs or instructions required for at least one function (such as a sound playback function or an image playback function).
  • The memory 1009 may include volatile memory or non-volatile memory, or both. The non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory can be random access memory (RAM), static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synch-link dynamic random access memory (SLDRAM), or direct Rambus random access memory (DRRAM).
  • The processor 1010 may include one or more processing units. Optionally, the processor 1010 integrates an application processor and a modem processor, where the application processor mainly handles operations related to the operating system, user interface, and application programs, and the modem processor (such as a baseband processor) mainly handles wireless communication signals. It can be understood that the modem processor may not be integrated into the processor 1010.
  • Embodiments of the present application also provide a readable storage medium. Programs or instructions are stored on the readable storage medium, and when the programs or instructions are executed by a processor, each process of the above image processing method embodiments is implemented with the same technical effects; to avoid repetition, details are not repeated here.
  • The processor is the processor in the electronic device described in the above embodiments. The readable storage medium includes computer-readable storage media, such as computer read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
  • An embodiment of the present application further provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement each process of the above image processing method embodiments with the same technical effects; to avoid duplication, details are not described again here.
  • It should be understood that the chip mentioned in the embodiments of this application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip, etc.
  • Embodiments of the present application provide a computer program product. The program product is stored in a storage medium, and the program product is executed by at least one processor to implement each process of the above image processing method embodiments with the same technical effects; to avoid repetition, details are not repeated here.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Image Processing (AREA)

Abstract

The present application belongs to the technical field of artificial intelligence. Disclosed are an image processing method and apparatus, an electronic device, and a readable storage medium. The method comprises: acquiring a first brightness map corresponding to a first image and a first color difference map corresponding to the first image; acquiring first feature information of text content in the first brightness map, and adjusting the clarity of the text content in the first brightness map according to the first feature information to obtain an adjusted first brightness map; and generating a second image according to the first color difference map and the adjusted first brightness map.
PCT/CN2023/109785 2022-08-03 2023-07-28 Image processing method and apparatus, electronic device and readable storage medium WO2024027583A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210927005.3 2022-08-03
CN202210927005.3A CN115294055A (zh) 2022-08-03 2022-08-03 Image processing method and apparatus, electronic device and readable storage medium

Publications (1)

Publication Number Publication Date
WO2024027583A1 (fr)

Family

ID=83825809

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/109785 WO2024027583A1 (fr) 2022-08-03 2023-07-28 Image processing method and apparatus, electronic device and readable storage medium

Country Status (2)

Country Link
CN (1) CN115294055A (fr)
WO (1) WO2024027583A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115294055A (zh) * 2022-08-03 2022-11-04 维沃移动通信有限公司 Image processing method and apparatus, electronic device and readable storage medium
CN117152022A (zh) * 2023-10-25 2023-12-01 荣耀终端有限公司 Image processing method and electronic device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102289668A (zh) * 2011-09-07 2011-12-21 谭洪舟 Adaptive binarization method for text images based on pixel neighborhood features
US20130322746A1 (en) * 2012-05-31 2013-12-05 Apple Inc. Systems and methods for ycc image processing
CN109035175A (zh) * 2018-08-22 2018-12-18 深圳市联合视觉创新科技有限公司 Face image enhancement method based on color correction and pulse-coupled neural networks
CN109712097A (zh) * 2019-01-04 2019-05-03 Oppo广东移动通信有限公司 Image processing method and apparatus, storage medium, and electronic device
CN112102204A (zh) * 2020-09-27 2020-12-18 苏州科达科技股份有限公司 Image enhancement method and apparatus, and electronic device
CN112330574A (zh) * 2020-11-30 2021-02-05 深圳市慧鲤科技有限公司 Portrait restoration method and apparatus, electronic device, and computer storage medium
CN115294055A (zh) * 2022-08-03 2022-11-04 维沃移动通信有限公司 Image processing method and apparatus, electronic device and readable storage medium


Also Published As

Publication number Publication date
CN115294055A (zh) 2022-11-04

Similar Documents

Publication Publication Date Title
WO2024027583A1 (fr) Procédé et appareil de traitement d'images, et dispositif électronique ainsi que support de stockage lisible
US20200258197A1 (en) Method for generating high-resolution picture, computer device, and storage medium
WO2020192483A1 (fr) Procédé et dispositif d'affichage d'image
WO2021208600A1 (fr) Procédé de traitement d'image, dispositif intelligent, et support de stockage lisible par ordinateur
CN112529784B (zh) 图像畸变校正方法及装置
WO2021189733A1 (fr) Procédé et appareil de traitement d'image, dispositif électronique et support de stockage
KR20200014842A (ko) 이미지 조명 방법, 장치, 전자 기기 및 저장 매체
WO2023151511A1 (fr) Procédé et appareil d'apprentissage de modèle, procédé et appareil d'élimination de moiré d'image, et dispositif électronique
WO2012068902A1 (fr) Procédé et système d'amélioration de la netteté d'une image
WO2022161260A1 (fr) Procédé et appareil de focalisation, dispositif électronique et support
KR20210110679A (ko) 이미지 융합 프로세싱 모듈
US20240046538A1 (en) Method for generating face shape adjustment image, model training method, apparatus and device
WO2018113224A1 (fr) Procédé et dispositif de réduction d'image
JP2022544665A (ja) 画像処理方法、機器、非一時的コンピュータ可読媒体
WO2022179087A1 (fr) Procédé et appareil de traitement de vidéo
US20230353864A1 (en) Photographing method and apparatus for intelligent framing recommendation
CN112927241A (zh) 图片截取和缩略图生成方法、系统、设备及储存介质
CN114785957A (zh) 拍摄方法及其装置
CN114390206A (zh) 拍摄方法、装置和电子设备
CN110009563B (zh) 图像处理方法和装置、电子设备及存储介质
US9807315B1 (en) Lookup table interpolation in a film emulation camera system
CN116453131B (zh) 文档图像矫正方法、电子设备及存储介质
CN115883986A (zh) 图像处理方法及其装置
CN115272128A (zh) 图像处理方法及其装置
CN116261043A (zh) 对焦距离确定方法、装置、电子设备和可读存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23849291

Country of ref document: EP

Kind code of ref document: A1