WO2022021695A1 - Image processing method, and method and apparatus for generating instructions for image processing - Google Patents

Image processing method, and method and apparatus for generating instructions for image processing

Info

Publication number
WO2022021695A1
WO2022021695A1 PCT/CN2020/131057 CN2020131057W WO2022021695A1 WO 2022021695 A1 WO2022021695 A1 WO 2022021695A1 CN 2020131057 W CN2020131057 W CN 2020131057W WO 2022021695 A1 WO2022021695 A1 WO 2022021695A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
size
split
roi
splitting
Prior art date
Application number
PCT/CN2020/131057
Other languages
English (en)
French (fr)
Inventor
申俊志
王振江
李建军
Original Assignee
地平线(上海)人工智能技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 地平线(上海)人工智能技术有限公司
Priority to EP20947620.9A (EP4020316A4)
Priority to JP2022520212A (JP7369288B2)
Priority to US17/764,409 (US20220351329A1)
Publication of WO2022021695A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/2163 Partitioning the feature space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74 Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/32 Normalisation of the pattern dimensions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20016 Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20021 Dividing image into blocks, subimages or windows
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Definitions

  • the present disclosure relates to the technical field of image processing, and in particular to an image processing method, and a method and apparatus for generating instructions for image processing.
  • a region of interest (ROI) image for a specific task is generated; the ROI image can then be scaled by the image scaling module to obtain a scaled image.
  • the scaled image is input to the image processing model for processing.
  • the image scaling module does not perform scaling processing for some ROI images, but only outputs an error prompt, which will cause the subsequent processes involved in the visual image processing technology to fail to execute normally.
  • Embodiments of the present disclosure provide an image processing method, a method and apparatus for generating an instruction for image processing.
  • an image processing method including:
  • the plurality of scaled image blocks are sequentially input to the image processing model.
  • a method for generating an instruction for image processing including:
  • in the case that the hardware output size of the image scaling module does not match the first image size supported by the image processing model, determining, based on the hardware output size and the first image size, splitting mode information for a template image having the first image size;
  • the splitting mode information is used to split the template image into multiple image blocks, and the image size of each of the multiple image blocks matches the hardware output size;
  • an instruction for image processing is generated, and the instruction for image processing is used to execute the above-mentioned image processing method.
  • an image processing apparatus including:
  • a splitting module configured to, when a region of interest (ROI) image is obtained, split to obtain a plurality of image blocks based on the first image size supported by the image processing model, the first split data, and the obtained ROI image;
  • each of the multiple image sizes obtained by splitting the first image size based on the first split data matches the hardware output size of the image scaling module;
  • an image scaling module configured to perform image scaling on each of the multiple image blocks obtained by the splitting module, to obtain multiple scaled image blocks;
  • the multiple image sizes of the multiple scaled image blocks are consistent, in one-to-one correspondence, with the multiple image sizes obtained by splitting the first image size based on the first split data;
  • an input module configured to sequentially input the plurality of scaled image blocks obtained by the image scaling module to the image processing model.
  • an apparatus for generating an instruction for image processing including:
  • a determining module configured to determine, in the case that the hardware output size of the image scaling module does not match the first image size supported by the image processing model, splitting mode information for a template image having the first image size, based on the hardware output size and the first image size;
  • the splitting mode information is used to split the template image into multiple image blocks, and the image size of each of the multiple image blocks matches the hardware output size;
  • an acquisition module configured to acquire first split data based on the split mode information determined by the determination module
  • a generating module configured to generate an instruction for image processing based on the first split data obtained by the acquiring module, where the instruction for image processing is used to execute the above image processing method.
  • a computer-readable storage medium stores a computer program, and the computer program is used for executing the above image processing method, or for executing the above method for generating instructions for image processing.
  • an electronic device comprising:
  • a memory for storing the processor-executable instructions
  • the processor is configured to read the executable instructions from the memory and execute the instructions to implement the above-mentioned image processing method, or to execute the above-mentioned method for generating instructions for image processing.
  • when the ROI image is obtained, a plurality of image blocks can be obtained by splitting based on the first image size supported by the image processing model, the first split data, and the obtained ROI image.
  • image scaling may then be performed on each of the plurality of image blocks to obtain a plurality of scaled image blocks.
  • each scaled image block matches the hardware output size, so each scaled image block can be output normally from the image scaling module, and the multiple scaled image blocks can then be input to the image processing model in sequence.
  • FIG. 1 is a general schematic diagram of an embodiment of the present disclosure.
  • FIG. 2 is a schematic flowchart of an image processing method provided by an exemplary embodiment of the present disclosure.
  • FIG. 3 is a schematic diagram of a disassembly of a running phase in an embodiment of the present disclosure.
  • FIG. 4 is a schematic flowchart of an image processing method provided by another exemplary embodiment of the present disclosure.
  • FIG. 5A is a working flowchart of a system for implementing a visual image processing technology in the related art.
  • FIG. 5B is a work flow diagram of a system for implementing visual image processing techniques in an embodiment of the present disclosure.
  • FIG. 6 is a schematic flowchart of a method for generating an instruction for image processing provided by an exemplary embodiment of the present disclosure.
  • FIG. 7 is a schematic flowchart of a method for generating an instruction for image processing provided by another exemplary embodiment of the present disclosure.
  • FIG. 8 is another general schematic diagram of an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram of an image processing apparatus provided by an exemplary embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of an image processing apparatus provided by another exemplary embodiment of the present disclosure.
  • FIG. 11 is a schematic structural diagram of an apparatus for generating an instruction for image processing provided by an exemplary embodiment of the present disclosure.
  • FIG. 12 is a schematic structural diagram of an apparatus for generating an instruction for image processing provided by another exemplary embodiment of the present disclosure.
  • FIG. 13 is a structural diagram of an electronic device provided by an exemplary embodiment of the present disclosure.
  • Embodiments of the present disclosure can be applied to electronic devices such as terminal devices, computer systems, servers, etc., which can operate with numerous other general-purpose or special-purpose computing system environments or configurations.
  • Examples of well-known terminal devices, computing systems, environments and/or configurations suitable for use with electronic devices such as terminal devices, computer systems and servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing technology environments including any of the foregoing, among others.
  • Electronic devices such as terminal devices, computer systems, servers, etc., may be described in the general context of computer system-executable instructions, such as program modules, being executed by the computer system.
  • program modules may include routines, programs, object programs, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • Computer systems/servers may be implemented in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located on local or remote computing system storage media including storage devices.
  • the visual image processing technology can be implemented by an artificial intelligence (AI) visual image processing system, and the operations performed by the system can include: image acquisition, image signal processing (ISP), image pyramid generation, full-image detection, ROI processing, post-processing, result output, etc.
  • ROI images for specific tasks will be generated, such as ROI images for face detection tasks (that is, face images detected from the full image), ROI images for pedestrian detection tasks, etc.
  • the ROI image generated by full-image detection needs to be provided to the image processing model, so that ROI processing is performed by the image processing model; the image processing model may be a convolutional neural network model.
  • since the ROI image generated by full-image detection is produced while the system is running, its position and size cannot be predicted in advance; in addition, as set by the model designer, the image processing model generally imposes requirements on the input image.
  • because the input image has size requirements, the ROI images generated by full-image detection need to be scaled to the same size.
  • the scaling of the ROI image may be implemented by a hardware module, which may be called an image scaling module or a Resizer module.
  • the input size of the image scaling module is the size of the ROI image generated by full-image detection
  • the output size of the image scaling module is consistent with the size of the input image required by the image processing model.
  • the embodiments of the present disclosure may mainly include two stages, namely a compilation stage and a running stage. In the compilation stage, the method for generating instructions for image processing provided by the embodiments of the present disclosure may be executed to generate instructions for executing the image processing method provided by the embodiments of the present disclosure. In the running stage, for any ROI image obtained through full-image detection, the image processing method provided by the embodiments of the present disclosure can be executed based on the instructions generated in the compilation stage, so as to solve the problem that the subsequent processes involved in the visual image processing technology cannot be executed normally due to the limited output of the image scaling module.
  • FIG. 2 is a schematic flowchart of an image processing method provided by an exemplary embodiment of the present disclosure.
  • the method shown in FIG. 2 includes step 201 , step 202 and step 203 , and each step will be described below.
  • Step 201: when an ROI image is obtained, split to obtain a plurality of image blocks based on the first image size supported by the image processing model, the first split data, and the obtained ROI image; each of the multiple image sizes obtained by splitting the first image size based on the first split data matches the hardware output size of the image scaling module.
  • the ROI image may first be obtained through full-image detection. Specifically, the entire image shown in the left part of FIG. 3 can be detected to obtain the ROI image whose height is h in FIG. 3.
  • the first image size supported by the image processing model and the first split data determined in the compilation stage can then be obtained.
  • each of the multiple image sizes obtained by splitting the first image size based on the first split data matches the hardware output size of the image scaling module.
  • the first image size supported by the image processing model can represent the size requirement of the image processing model for the input image; for example, the first image size can be 244×244;
  • the hardware output size of the image scaling module can indicate the maximum image size that the image scaling module can output.
  • for example, the hardware output size of the image scaling module can be 128×72; an image size matching the hardware output size of the image scaling module can mean that the image size is smaller than the hardware output size.
  • for example, the hardware output size of the image scaling module is 128×72, and the image size is 76×76.
  • the first image size is the image size of the image whose height is H, as shown in the right part of FIG. 3
  • the first split data can be used to split the first image size into 6 image sizes, namely the image sizes of ROI 0'', ROI 1'', ROI 2'', ROI 3'', ROI 4'' and ROI 5''.
  • if the hardware output size of the image scaling module is 128×72, the image sizes of ROI 0'' to ROI 5'' all need to be smaller than 128×72.
  • the object to be split and the way to split it can be determined based on the first image size, the first split data, and the obtained ROI image, and the actual splitting is performed accordingly, resulting in multiple image blocks. If the multiple image sizes obtained by splitting the first image size based on the first split data are N image sizes, the multiple image blocks obtained through actual splitting may be N image blocks, and the N image sizes can be in one-to-one correspondence with the N image blocks.
  • Step 202: perform image scaling on each of the plurality of image blocks to obtain a plurality of scaled image blocks; the multiple image sizes of the plurality of scaled image blocks are consistent, in one-to-one correspondence, with the multiple image sizes obtained by splitting the first image size based on the first split data.
  • the image scaling module can be called to perform image scaling on each of the plurality of image blocks, and it needs to be ensured that the image size of the scaled image block obtained by scaling any image block is consistent with the image size corresponding to that image block among the multiple image sizes obtained by splitting the first image size based on the first split data.
  • suppose the first split data is used to split the first image size into 6 image sizes, namely the image sizes of ROI 0'' to ROI 5'', and the multiple image blocks obtained by splitting in step 201 are ROI 00'' to ROI 05''. Then ROI 00'' can be scaled to the image size of ROI 0'', ROI 01'' to the image size of ROI 1'', ROI 02'' to the image size of ROI 2'', ROI 03'' to the image size of ROI 3'', ROI 04'' to the image size of ROI 4'', and ROI 05'' to the image size of ROI 5'', to get the scaled image blocks corresponding to ROI 00'' to ROI 05'', so that a total of 6 scaled image blocks can be obtained.
  • Step 203 sequentially inputting a plurality of scaled image blocks into the image processing model.
  • a plurality of scaled image blocks can be sequentially input into the image processing model, so that the image processing model can process each scaled image block (for example, face detection processing or pedestrian detection processing) to obtain the processing result of each scaled image block, and the processing results of the scaled image blocks can form a final processing result. Since the multiple image sizes of the multiple scaled image blocks are consistent, in one-to-one correspondence, with the multiple image sizes obtained by splitting the first image size based on the first split data, the final processing result is equivalent to the result the image processing model would obtain by processing an image of the first image size; the final processing result can then be used for other processes involved in the visual image processing technology.
  • a plurality of image blocks may be obtained by splitting based on the first image size supported by the image processing model, the first split data, and the obtained ROI image. Afterwards, image scaling may be performed on each of the plurality of image blocks to obtain a plurality of scaled image blocks. Since the multiple image sizes of the multiple scaled image blocks are consistent, in one-to-one correspondence, with the multiple image sizes obtained by splitting the first image size based on the first split data, the image size of each scaled image block matches the hardware output size, each scaled image block can be output normally from the image scaling module, and the multiple scaled image blocks can then be input to the image processing model in sequence.
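  • The flow of steps 202 and 203 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: `run_roi_pipeline`, `resize` and `model` are hypothetical names, `resize` merely stands in for the hardware image scaling module, and the (width, height) ordering of a size is an assumption.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Size = Tuple[int, int]  # (width, height); this ordering is an assumption
Block = TypeVar("Block")
Scaled = TypeVar("Scaled")
Result = TypeVar("Result")


def run_roi_pipeline(
    blocks: Sequence[Block],
    target_sizes: Sequence[Size],
    resize: Callable[[Block, Size], Scaled],
    model: Callable[[Scaled], Result],
) -> List[Result]:
    """Scale each image block to its own target size, then feed the model in order.

    `blocks` are the image blocks produced by step 201, and `target_sizes`
    are the image sizes obtained by splitting the first image size.
    """
    if len(blocks) != len(target_sizes):
        raise ValueError("blocks and target sizes must correspond one-to-one")
    results = []
    for block, size in zip(blocks, target_sizes):
        scaled = resize(block, size)   # step 202: per-block scaling
        results.append(model(scaled))  # step 203: sequential model input
    return results
```

  • The per-block results are returned in input order, so a caller can assemble them into the final processing result described above.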
  • step 201 includes step 2011, step 2012 and step 2013.
  • Step 2011: in the case that the obtained ROI image satisfies the specified image alignment condition, the obtained ROI image is used as the ROI image to be split; otherwise, image adjustment is performed based on the obtained ROI image to obtain an ROI image to be split that satisfies the specified image alignment condition.
  • after the ROI image is obtained, it can be determined whether the obtained ROI image satisfies the specified image alignment condition; specific determination methods are described below with examples.
  • the method further includes:
  • the first target value is determined based on the divisor of the value of the size of the preset direction specified by the image scaling module, and the value of the size of the preset direction in the first image size;
  • the first target value may be a value determined in the compilation stage based on the divisor of the size of the preset direction specified by the image scaling module and the value of the size of the preset direction in the first image size. This value is used to represent a user alignment requirement (for ease of distinction, hereinafter referred to as the first user alignment requirement, which may be considered a size alignment requirement); here, user alignment may also be referred to as user_alignment.
  • for the specific determination process of the first target value, refer to the description of the compilation stage below; it is not expanded here.
  • the preset direction may include at least one of the width direction and the height direction. Since the operations and processing for the width direction and the height direction may be similar, the embodiments of the present disclosure are described only by taking the case where the preset direction is the height direction as an example; in this case, the size of the preset direction is specifically the height size.
  • a runtime application programming interface (Runtime API) can be called to obtain the first target value determined in the compilation stage, and it is judged whether the height dimension of the obtained ROI image is an integer multiple of the first target value, to obtain the first judgment result. If the first judgment result indicates that the height dimension of the obtained ROI image is an integer multiple of the first target value, it can be considered that the obtained ROI image meets the first user alignment requirement and can be split correctly, so it can be determined that the obtained ROI image satisfies the specified image alignment condition; otherwise, it can be determined that the obtained ROI image does not satisfy the specified image alignment condition.
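  • The size alignment check described above reduces to a divisibility test. A minimal Python sketch follows; the function name `meets_size_alignment` is hypothetical, and how the first target value is fetched from the Runtime API is not modelled here.

```python
def meets_size_alignment(roi_height: int, first_target_value: int) -> bool:
    """Return True if the ROI height is an integer multiple of the first target value."""
    if first_target_value <= 0:
        raise ValueError("first_target_value must be positive")
    return roi_height % first_target_value == 0
```

  • For instance, with a first target value of 8, a height of 96 passes the check while a height of 100 does not.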
  • the method further includes:
  • the coordinate attribute specified by the image scaling module may represent another user alignment requirement (for convenience of distinction, it is hereinafter referred to as the second user alignment requirement, and the second user alignment requirement may be regarded as a coordinate alignment requirement).
  • the coordinate attribute specified by the image scaling module may be: the coordinate of the preset position is an even number; wherein, the preset position may be the upper left corner position, the upper right corner position or other positions.
  • if the second judgment result indicates that the coordinates of the preset position have the coordinate attribute specified by the image scaling module, it can be considered that the obtained ROI image satisfies the second user alignment requirement and can be processed normally by the image scaling module, so it can be determined that the obtained ROI image satisfies the specified image alignment condition; otherwise, it can be determined that the obtained ROI image does not satisfy the specified image alignment condition.
  • the above two implementations of determining whether the obtained ROI image meets the specified image alignment conditions can also be combined with each other.
  • the height dimension of the obtained ROI image can be an integer multiple of the first target value
  • if the determination result is yes, the obtained ROI image can be directly used as the ROI image to be split; if the determination result is no, image adjustment can be performed on the basis of the obtained ROI image to obtain an ROI image to be split that satisfies the specified image alignment condition.
  • for the adjustment, according to the position of the obtained ROI image on the whole image, a new ROI image can be taken from the whole image as the ROI image to be split, or the obtained ROI image can be slightly cropped and the cropped ROI image used as the ROI image to be split.
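  • One possible way to combine the two alignment checks with the "slight crop" adjustment is sketched below in Python. The names `is_aligned` and `crop_to_alignment` are hypothetical; the sketch assumes the preset position is the upper-left corner, that the coordinate attribute is "even coordinates", and that cropping inward is acceptable. The disclosure does not prescribe this particular cropping rule.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) of the ROI on the whole image


def is_aligned(roi: Rect, first_target_value: int) -> bool:
    """Check both requirements: height multiple and even upper-left coordinates."""
    x, y, _width, height = roi
    return height % first_target_value == 0 and x % 2 == 0 and y % 2 == 0


def crop_to_alignment(roi: Rect, first_target_value: int) -> Rect:
    """Shrink the ROI slightly so that it satisfies both alignment requirements."""
    x, y, width, height = roi
    # Move the upper-left corner inward to the next even coordinate.
    new_x = x + (x % 2)
    new_y = y + (y % 2)
    new_width = width - (new_x - x)
    new_height = height - (new_y - y)
    # Trim the height down to the nearest multiple of the first target value.
    new_height -= new_height % first_target_value
    if new_width <= 0 or new_height <= 0:
        raise ValueError("ROI is too small to be aligned by cropping")
    return (new_x, new_y, new_width, new_height)
```

  • Under these assumptions, an ROI at (11, 7) of size 50×45 with a first target value of 8 is cropped to an ROI at (12, 8) of size 49×40. Only the height is aligned here, matching the height-direction example used in the text.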
  • Step 2012 Determine second split data based on the first image size, the first split data, and the second image size of the ROI image to be split.
  • step 2012 includes:
  • the second split data is determined based on each split position in the preset direction of the ROI image to be split.
  • the first split data may include coordinate information of each split position in the height direction; and/or, the first split data may include the proportional relationship of each size segment split in the height direction (for example, 3:3:2, 3:4:3, etc.). In this way, based on the first split data, each split position in the height direction of the template image having the first image size can be located conveniently and reliably.
  • the proportional relationship between the height size of the ROI image to be split and the height size in the first image size can also be determined. Assuming that the ROI image to be split is the image shown in the left part of FIG. 3, with height size h, and the template image with the first image size is the image shown in the right part of FIG. 3, with height size H, the determined proportional relationship is h/H.
  • after each split position in the height direction of the template image and the proportional relationship between the height size of the ROI image to be split and the height size in the first image size are determined, each split position in the height direction of the template image can be mapped to the ROI image to be split, so as to obtain each mapping position in the height direction of the ROI image to be split, and each obtained mapping position can be used as a split position in the height direction of the ROI image to be split.
  • position P11 can be mapped to the ROI image to be split, and the corresponding mapping position can be position P21; position P12 can also be mapped to the ROI image to be split, and the corresponding mapping position can be position P22.
  • the second split data can be determined accordingly.
  • the second split data may include the coordinate information of each split position in the height direction of the ROI image to be split; and/or, the second split data may include the proportional relationship of each size segment split in the height direction of the ROI image to be split, such as h1 : h2 : (h - h1 - h2).
  • the split position on the template image with the first image size can be accurately and reliably located, and then, combined with the proportional relationship between the ROI image to be split and the first image size, the split position on the ROI image to be split can be accurately and reliably located, so as to obtain the second split data accordingly.
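  • The mapping from template split positions to ROI split positions can be sketched in Python as follows. The function names are hypothetical, and rounding to the nearest integer pixel is an assumption; the disclosure does not state how fractional positions are handled.

```python
from typing import List, Sequence


def positions_from_ratio(ratio: Sequence[int], total: int) -> List[int]:
    """Convert a segment ratio such as 3:3:2 into interior split positions."""
    ratio_sum = sum(ratio)
    positions = []
    accumulated = 0
    for part in ratio[:-1]:  # the last segment ends at `total`, not at a split
        accumulated += part
        positions.append(round(total * accumulated / ratio_sum))
    return positions


def map_positions(
    template_positions: Sequence[int], template_height: int, roi_height: int
) -> List[int]:
    """Map split positions on the template (height H) onto the ROI (height h)."""
    scale = roi_height / template_height  # the proportional relationship h/H
    return [round(p * scale) for p in template_positions]
```

  • For example, a 3:3:2 ratio on a template of height 240 gives split positions 90 and 180, which map to 45 and 90 on an ROI of height 120.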
  • Step 2013, according to the second split data, split the ROI image to be split to obtain a plurality of image blocks.
  • since each split position in the height direction of the ROI image to be split is closely related to the second split data, the ROI image to be split can be split at each split position in the height direction according to the second split data. In a similar manner, splitting can also be performed in the width direction, so as to obtain multiple image blocks based on the splitting in the height direction and the width direction.
  • in this way, an ROI image to be split that satisfies the specified image alignment condition can be obtained first, the second split data indicating how to split can then be determined, and the ROI image to be split can then be split according to the second split data to obtain the plurality of image blocks.
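  • Splitting in both the height and width directions yields a grid of image blocks. The Python sketch below represents each block as a rectangle rather than pixel data; `split_roi` is a hypothetical name, and split positions are assumed to be offsets relative to the upper-left corner of the ROI image to be split.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def split_roi(
    roi: Rect, y_positions: Sequence[int], x_positions: Sequence[int]
) -> List[Rect]:
    """Split an ROI into blocks at the given height and width split positions."""
    x0, y0, width, height = roi
    ys = [0, *y_positions, height]
    xs = [0, *x_positions, width]
    for edges in (ys, xs):
        if any(b <= a for a, b in zip(edges, edges[1:])):
            raise ValueError("split positions must be strictly increasing and inside the ROI")
    blocks = []
    for top, bottom in zip(ys, ys[1:]):        # rows, from top to bottom
        for left, right in zip(xs, xs[1:]):    # columns, from left to right
            blocks.append((x0 + left, y0 + top, right - left, bottom - top))
    return blocks
```

  • Two split positions in the height direction and one in the width direction produce 6 blocks, which matches the 6-block example used earlier in the text.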
  • the workflow of the AI visual image processing system can be as follows: image acquisition, ISP, image pyramid generation and full-image detection are performed in sequence to generate ROI images for specific tasks. Next, before the ROI image generated by the full-image detection is provided to the image scaling module, it is first predicted whether the scaled image that the image scaling module would produce from this ROI image (denoted ROI') exceeds the limit. If the limit is not exceeded, the ROI image generated by the full-image detection is sent to the image scaling module, which performs the scaling operation to obtain ROI'; the image processing model then processes ROI', after which post-processing and result output are performed.
  • otherwise, the workflow of the AI visual image processing system is somewhat different. Specifically, if the predicted ROI' exceeds the limit, alignment processing (to meet the above-mentioned user alignment requirements) and split processing (for example, splitting into several parts whose sizes the image scaling module can output) are performed. In this way, the embodiments of the present disclosure can ensure the normal execution of the subsequent processes involved in the visual image processing technology when the output of the image scaling module is limited.
  • any of the image processing methods provided by the embodiments of the present disclosure may be executed by any appropriate device with data processing capabilities, including but not limited to: terminal devices and servers.
  • any of the image processing methods provided by the embodiments of the present disclosure may be executed by a processor, for example, the processor executes any of the image processing methods mentioned in the embodiments of the present disclosure by invoking corresponding instructions stored in the memory. No further description will be given below.
  • FIG. 6 is a schematic flowchart of a method for generating an instruction for image processing provided by an exemplary embodiment of the present disclosure.
  • the method shown in FIG. 6 includes step 601 , step 602 and step 603 , and each step will be described separately below.
  • Step 601, in the case where the hardware output size of the image scaling module does not match the first image size supported by the image processing model, determine, based on the hardware output size and the first image size, the splitting method information of the template image with the first image size; the splitting method information is used to split the template image into multiple image blocks, and the image size of each of the multiple image blocks matches the hardware output size.
  • the hardware output size of the image scaling module can represent the maximum image size that the image scaling module can output;
  • the first image size supported by the image processing model can represent the size requirement of the image processing model for the input image;
  • the mismatch between the hardware output size and the first image size supported by the image processing model may mean that the hardware output size of the image scaling module is smaller than the first image size supported by the image processing model; for example, the hardware output size of the image scaling module is 128 × 72, and the first image size supported by the image processing model is 244 × 244. An image size matching the hardware output size of the image scaling module may mean that the image size is smaller than the hardware output size of the image scaling module.
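  • The two notions above (mismatch and match) can be expressed as simple predicates. This is only a sketch: the names are hypothetical, sizes are taken as (width, height) tuples, a mismatch is assumed whenever the hardware output is smaller in either dimension, and a block whose size equals the hardware output size is treated as matching, which is an assumption beyond the wording above.

```python
def sizes_mismatch(hw_out, model_in):
    """True if the hardware output size is smaller than the model input size.

    Assumption: being smaller in either dimension counts as a mismatch.
    """
    return hw_out[0] < model_in[0] or hw_out[1] < model_in[1]


def block_matches(block_size, hw_out):
    """True if a block fits within the hardware output size.

    Assumption: equality is treated as a match.
    """
    return block_size[0] <= hw_out[0] and block_size[1] <= hw_out[1]
```

  • With the sizes from the example, `sizes_mismatch((128, 72), (244, 244))` holds, so the template image would have to be split.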
  • when the hardware output size of the image scaling module does not match the first image size supported by the image processing model, it may be determined, based on the hardware output size and the first image size, how to split the template image with the first image size (for example, the image shown in the right part of FIG. 3, whose height is H) so that the image size of each of the multiple image blocks obtained by splitting matches the hardware output size of the image scaling module, thereby obtaining the corresponding splitting method information.
  • the splitting method information may include the coordinate information of each split position in the width direction and of each split position in the height direction; and/or, the splitting method information may include the proportional relationship of the size segments split in the width direction and in the height direction.
  • when determining the splitting method information, in addition to the hardware output size of the image scaling module and the first image size supported by the image processing model, other factors can also be taken into account, such as the specific instruction type of the template image, the storage location of the template image in memory, and the allocation and management rules of the compiler at runtime.
  • Step 602 Obtain first split data based on the split method information.
  • data including all the information in the splitting method information can be used directly as the first split data; alternatively, part of the information can be extracted from the splitting method information, and data including the extracted information can be used as the first split data.
  • for example, when the splitting method information includes both the split position coordinate information and the proportional relationship of the size segments, the split position coordinate information can be extracted from the splitting method information, and data including the extracted split position coordinate information can be used as the first split data.
  • Step 603 based on the first split data, generate an instruction for image processing, and the instruction for image processing is used to execute an image processing method (specifically, the image processing method disclosed in the above embodiment).
  • steps 601 to 603 may all be executed by a compiler.
  • in the case where the hardware output size of the image scaling module does not match the first image size supported by the image processing model, the splitting method information of the template image with the first image size may be determined based on the hardware output size and the first image size.
  • the first split data is then obtained based on the splitting method information, and an instruction for image processing can be generated based on the first split data; the instruction can be used to execute the above image processing method.
  • the first split data can be utilized, and the image splitting operation and the image scaling operation can be combined to ensure the normal execution of the subsequent processes involved in the visual image processing technology.
  • step 601 includes step 6011 , step 6012 and step 6013 .
  • Step 6011 when the image scaling module specifies the divisor of the size of the preset direction, calculate a first product; the first product is the product of the divisor and the numerical value of the size in the preset direction in the first image size.
  • the preset direction may include at least one of the width direction and the height direction. Since the operations for the width direction and the height direction may be similar, the embodiments of the present disclosure are illustrated only with the height direction as the preset direction; in this case, the size in the preset direction is specifically the height.
  • the divisor of the height specified by the image scaling module can be denoted c, the height in the first image size can be denoted H as in FIG. 3, and the first product can then be expressed as cH, where c can be regarded as the hardware alignment requirement of the image scaling module.
  • Step 6012 based on the first product, determine each split position on the template image with the first image size.
  • Step 6012 includes:
  • determining a first target value and a second target value based on the first product, where the product of the first target value and the second target value satisfies a preset relationship with the first product;
  • determining each split position on the template image with the first image size based on the second target value.
  • The method also includes:
  • recording the first target value.
  • that a product and the first product satisfy the preset relationship may mean that the product is equal to the first product.
  • the first target value can be recorded so that it can be obtained in the running phase and used to determine the specified image alignment condition.
  • each split position on the template image with the first image size can be determined based on the second target value, and it is necessary to ensure that the size of every size segment delimited by the determined split positions in the height direction of the template image is an integer multiple of the second target value. For example, for FIG. 3, it is necessary to ensure that H1, H2, and H - H1 - H2 are all integer multiples of the second target value.
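  • One way to pick split positions whose segments are all integer multiples of the second target value is sketched below. The greedy strategy (using the largest multiple of the alignment value that still fits the hardware limit) and the name `template_split_positions` are assumptions for illustration; the text above only requires that every segment be a multiple of the second target value.

```python
def template_split_positions(total, align, max_segment):
    """Return split positions so that every segment is a multiple of `align`
    and no segment exceeds `max_segment`.

    Assumption: a greedy choice of the largest admissible segment size.
    """
    if total % align != 0:
        raise ValueError("total size must be a multiple of the alignment value")
    step = (max_segment // align) * align  # largest multiple of align that fits
    if step == 0:
        raise ValueError("alignment value exceeds the maximum segment size")
    return list(range(step, total, step))
```

  • For a height of 68, an alignment value of 4 and a maximum segment height of 24, the sketch returns positions 24 and 48, giving segments of 24, 24 and 20, each a multiple of 4.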
  • the image scaling module proportionally scales the image with the height size h as shown in the left part.
  • A can be regarded as the size alignment value of the compiler split point (which corresponds to the split position), and a can be regarded as the size alignment value of the ROI image provided to the image scaling module (for example, the ROI image to be split above).
  • a can be used as the first target value and A as the second target value, to ensure that, in the running phase, the ROI image to be split can be split correctly and each image block obtained by splitting meets the size alignment requirement, that is, the height of the ROI image to be split is an integer multiple of a, the first target value.
  • Step 6013 Determine, based on each of the determined splitting positions, information on the splitting manner of the template image having the first image size.
  • once each split position is determined, the corresponding split position coordinate information can be obtained, so as to obtain splitting method information that includes the split position coordinate information.
  • the proportional relationship of the size segments can also be obtained, so as to obtain splitting method information that includes the proportional relationship.
  • when the image scaling module specifies a divisor of the size in the preset direction, the first product (that is, cH above) can be calculated and used as guiding information to determine an appropriate first target value and second target value, ensuring that image splitting can be performed correctly in the running phase and that the splitting result meets the size alignment requirements.
  • determining the first target value and the second target value based on the first product includes:
  • determining at least one reference parameter group based on the first product; each reference parameter group in the at least one reference parameter group includes two values, and the product of the two values in any reference parameter group satisfies the preset relationship with the first product;
  • selecting a first reference parameter group from the at least one reference parameter group;
  • taking one value in the first reference parameter group as the first target value, and taking the other value in the first reference parameter group as the second target value.
  • At least one reference parameter group may be determined first, each reference parameter group includes two values, and the product of the two values in any reference parameter group is equal to the first product.
  • the at least one reference parameter group may be obtained by factoring the first product.
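  • Factoring the first product into candidate pairs can be sketched as follows. The function name `reference_parameter_groups` is hypothetical, and enumerating every divisor pair is an assumption; the text above only states that the groups may be obtained by factoring.

```python
def reference_parameter_groups(first_product):
    """Enumerate all pairs (x, y) with x <= y and x * y == first_product."""
    groups = []
    d = 1
    while d * d <= first_product:
        if first_product % d == 0:
            groups.append((d, first_product // d))
        d += 1
    return groups
```

  • For a first product of 12, the sketch yields (1, 12), (2, 6) and (3, 4); any one of these pairs could then serve as a reference parameter group.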
  • a first reference parameter group may be selected from at least one reference parameter group.
  • one reference parameter group may be selected from the at least one reference parameter group as the first reference parameter group according to a set rule; in this case, selecting the first reference parameter group from the at least one reference parameter group may include:
  • calculating the total alignment cost value of each reference parameter group in the at least one reference parameter group separately;
  • selecting the reference parameter group with the smallest total alignment cost value as the first reference parameter group.
  • the total alignment cost value of each reference parameter group may be calculated separately, and the total alignment cost value may include the alignment cost value of the compiler and the alignment cost value of the user.
  • the reference parameter group with the smallest corresponding total alignment cost value can be filtered out, and then the filtered reference parameter group can be used as the first reference parameter group.
  • the cost can be minimized under the condition that both the compiler and the user meet the corresponding alignment requirements.
  • one reference parameter group may be randomly selected from at least one reference parameter group as the first reference parameter group.
  • the first target value and the second target value can be conveniently determined. Since the first reference parameter group is selected from the at least one reference parameter group, and the at least one reference parameter group is determined based on the first product, it can be ensured that both the compiler and the user meet the corresponding alignment requirements.
  • calculating the total alignment cost value of each reference parameter group in the at least one reference parameter group separately includes:
  • for any reference parameter group, the larger of the two values it includes may be determined first (denoted A'); then A' can be used to determine each split position in the height direction of the template image.
  • the size of each size segment delimited by the determined split positions is an integer multiple of A'; after that, the alignment cost value of each split position in the height direction of the template image can be calculated.
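  • calculating the alignment cost value of each split position includes:
  • obtaining the coordinate in the preset direction among the coordinates of any split position;
  • calculating the remainder of the obtained coordinate and the larger of the two values included in the reference parameter group;
  • calculating the ratio of the first product to the larger of the two values included in the reference parameter group;
  • calculating the difference between the ratio and a preset value;
  • based on a preset adjustment factor, performing a weighted summation of the remainder result and the difference to obtain the alignment cost value of the split position.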
  • the preset value may be 1, 2, 3 or other values, and the preset adjustment factor may be expressed as p.
  • take any reference parameter group, denoted S, whose larger value is A', as an example. Assuming that, among the coordinates of a split position in the height direction of the template image determined using A', the coordinate in the height direction is denoted y, the remainder of y and A' can be calculated and expressed as roi.y % A'.
  • the ratio of the first product cH to A' can also be calculated, together with the difference between this ratio and the preset value (assumed to be 1), which can be expressed as cH/A' - 1. Then, based on the preset adjustment factor p, the remainder result roi.y % A' and the difference cH/A' - 1 are weighted and summed, with p as the weight of the remainder result and p-1 as the weight of the difference, to obtain the alignment cost value of the split position, denoted f(A'), which can be calculated using the following cost function:
  • f(A') = p × (roi.y % A') + (p - 1) × (cH/A' - 1)
  • the operand before "+” is the alignment cost value of the compiler
  • the operand after "+” is the user's alignment cost value in the theoretical worst case.
  • assuming that G split positions are determined, the alignment cost value of each of the G split positions can be calculated according to the above method, so as to obtain G alignment cost values.
  • the sum of the G alignment cost values can be calculated and used as the total alignment cost value of the reference parameter group S. The total alignment cost values of the reference parameter groups can then be compared; if the total alignment cost value of the reference parameter group S is the smallest, the reference parameter group S can be used as the first reference parameter group.
  • in that case, A' can be used as the second target value (A above), and cH/A' can be used as the first target value (a above).
  • the total alignment cost value of each reference parameter group can be obtained conveniently and reliably, thereby facilitating the determination of a reasonable first target value and a second target value.
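  • The cost-based selection can be sketched as follows. All names are hypothetical, and the two weights are passed in explicitly rather than derived from the adjustment factor p, so the sketch does not commit to a particular weighting. The callback `split_ys_for`, which returns the split coordinates for a given larger value, is likewise an assumption of this illustration.

```python
def split_cost(y, larger, first_product, w_remainder, w_difference, preset=1):
    """Alignment cost of one split position: weighted sum of the remainder
    (y % larger) and the difference (first_product / larger - preset)."""
    remainder = y % larger
    difference = first_product / larger - preset
    return w_remainder * remainder + w_difference * difference


def total_cost(split_ys, larger, first_product, w_remainder, w_difference, preset=1):
    """Sum of the alignment costs over all split positions of one group."""
    return sum(
        split_cost(y, larger, first_product, w_remainder, w_difference, preset)
        for y in split_ys
    )


def select_first_group(groups, split_ys_for, first_product, w_remainder, w_difference):
    """Pick the reference parameter group with the smallest total cost."""
    def cost(group):
        larger = max(group)  # the larger value of the pair, A' in the text
        return total_cost(
            split_ys_for(larger), larger, first_product, w_remainder, w_difference
        )
    return min(groups, key=cost)
```

  • Once a group is selected, its larger value plays the role of A' and the other value the role of cH/A'.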
  • take as an example the case where the first image size supported by the image processing model is 66 × 68, that is, the width is 66 and the height is 68.
  • if the image scaling module requires that the upper-left corner coordinates, the height and the width of the input image all be even, then, when splitting, the size alignment value A of the compiler split point and the size alignment value a of the ROI image provided to the image scaling module should satisfy the preset relationship with the first product described above.
  • for example, A may be 4 and a may be 33; that is, the compiler satisfies 4-alignment and the user satisfies 33-alignment.
  • the alignment requirements that the image scaling module needs to meet (for example, that the upper-left corner coordinates, the height and the width be even), the hardware output size, and the first image size supported by the image processing model are provided to the compiler.
  • the compiler calculates the alignment A that the compiler needs to meet and the alignment a that the user needs to meet, records a, and splits the template image with the first image size.
  • in the running phase, the user can obtain a through the Runtime API and then perform the corresponding alignment processing to obtain the ROI image to be split and perform subsequent processing, so as to ensure correct and effective splitting of the ROI image to be split.
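  • The user-side alignment step can be sketched as below. The check follows the integer-multiple condition described in this disclosure; rounding the size up to the next multiple of a is only one hypothetical way of adjusting the ROI image, since the text does not specify how the adjustment is performed. Both function names are assumptions.

```python
def satisfies_alignment(size, a):
    """True if the ROI size in the preset direction is an integer multiple of a."""
    return size % a == 0


def round_up_to_multiple(size, a):
    """Smallest multiple of a that is not less than size.

    Assumption: enlarging the ROI is an acceptable adjustment.
    """
    return -(-size // a) * a
```

  • With a = 33, an ROI height of 66 already satisfies the condition, whereas a height of 70 would be adjusted to 99 under this assumption.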
  • by analyzing the hardware constraints of the image scaling module and the input requirements of the image processing model, the compiler can calculate how to meet the hardware constraints at a small cost; the compiler can also provide a splitting and alignment strategy, according to which reasonable splitting and alignment can be carried out. This ensures that the entire AI visual image processing system can operate correctly and efficiently when the hardware resources of the image scaling module are limited, and thus better safeguards the user experience.
  • Any method for generating an instruction for image processing provided by the embodiments of the present disclosure may be executed by any appropriate device with data processing capabilities, including but not limited to: terminal devices and servers.
  • any method for generating an instruction for image processing provided by the embodiments of the present disclosure may be executed by a processor; for example, the processor executes any of the methods for generating an instruction for image processing mentioned in the embodiments of the present disclosure by invoking corresponding instructions stored in the memory. No further description will be given below.
  • FIG. 9 is a schematic structural diagram of an image processing apparatus provided by an exemplary embodiment of the present disclosure.
  • the apparatus shown in FIG. 9 includes a splitting module 901 , an image scaling module 902 , and an input module 903 .
  • the splitting module 901 is used to, when an ROI image is obtained, split the obtained ROI image into a plurality of image blocks based on the first image size supported by the image processing model, the first split data, and the obtained ROI image; each of the multiple image sizes obtained by splitting the first image size based on the first split data matches the hardware output size of the image scaling module 902;
  • the image scaling module 902 is used to perform image scaling on each of the multiple image blocks obtained by the splitting module 901, to obtain multiple scaled image blocks; the image sizes of the multiple scaled image blocks are consistent, in one-to-one correspondence, with the multiple image sizes obtained by splitting the first image size based on the first split data;
  • the input module 903 is used for sequentially inputting the multiple scaled image blocks obtained by the image scaling module 902 into the image processing model.
  • the splitting module 901 includes:
  • the first acquisition sub-module 9011 is used to take the obtained ROI image as the ROI image to be split if the obtained ROI image satisfies the specified image alignment condition; otherwise, to perform image adjustment based on the obtained ROI image to obtain an ROI image to be split that satisfies the specified image alignment condition;
  • the determination submodule 9012 is used to determine the second split data based on the first image size, the first split data, and the second image size of the ROI image to be split obtained by the first acquisition submodule 9011;
  • the second acquisition sub-module 9013 is configured to split the ROI image to be split obtained by the first acquisition sub-module 9011 according to the second split data determined by the determination sub-module 9012 to obtain a plurality of image blocks.
  • the determination sub-module 9012 includes:
  • a first determining unit for determining, based on the first split data, each split position in a preset direction of the template image having the first image size;
  • a second determining unit configured to determine the proportional relationship between the size in the preset direction in the second image size of the ROI image to be split and the size in the preset direction in the first image size;
  • the third determination unit is configured to determine, based on the proportional relationship determined by the second determination unit and each split position in the preset direction of the template image determined by the first determination unit, the preset direction of the ROI image to be split each split position;
  • the fourth determination unit is configured to determine the second split data based on each split position in the preset direction of the ROI image to be split determined by the third determination unit.
  • the device also includes:
  • the first acquisition module 904 is used to acquire the first target value; the first target value is determined based on the divisor of the size in the preset direction specified by the image scaling module 902 and the value of the size in the preset direction in the first image size;
  • the second obtaining module 905 is used to judge whether the value of the size of the obtained ROI image in the preset direction is an integer multiple of the first target value obtained by the first obtaining module 904, so as to obtain the first judgment result;
  • the first determining module 906 is configured to determine whether the obtained ROI image satisfies the specified image alignment condition based on the first judgment result obtained by the second obtaining module 905;
  • the device also includes:
  • the third obtaining module 907 is used for obtaining the coordinates of the preset position on the obtained ROI image
  • the fourth obtaining module 908 is used to judge whether the coordinates of the preset position obtained by the third obtaining module 907 have the coordinate attribute specified by the image scaling module 902, so as to obtain the second judgment result;
  • the second determining module 909 is configured to determine whether the obtained ROI image satisfies the specified image alignment condition based on the second judgment result obtained by the fourth obtaining module 908 .
  • FIG. 11 is a schematic structural diagram of an apparatus for generating an instruction for image processing provided by an exemplary embodiment of the present disclosure.
  • the apparatus shown in FIG. 11 includes a determination module 1101 , an acquisition module 1102 , and a generation module 1103 .
  • the determining module 1101 is configured to, in the case where the hardware output size of the image scaling module does not match the first image size supported by the image processing model, determine, based on the hardware output size and the first image size, the splitting method information of the template image with the first image size; the splitting method information is used to split the template image into a plurality of image blocks, and the image size of each of the multiple image blocks matches the hardware output size;
  • an obtaining module 1102 configured to obtain the first splitting data based on the splitting method information determined by the determining module 1101;
  • the generating module 1103 is configured to generate an instruction for image processing based on the first split data obtained by the acquiring module 1102, and the instruction for image processing is used to execute the above-mentioned image processing method.
  • the determining module 1101 includes:
  • the calculation sub-module 11011 is used to calculate the first product when the image scaling module specifies a divisor of the size in the preset direction; the first product is the product of the divisor and the value of the size in the preset direction in the first image size;
  • the first determination sub-module 11012 is configured to determine each split position on the template image with the first image size based on the first product calculated by the calculation sub-module 11011;
  • the second determination sub-module 11013 is configured to determine, based on each of the split positions determined by the first determination sub-module 11012, information on the split mode of the template image having the first image size.
  • the first determination sub-module 11012 includes:
  • the fifth determination unit is used to determine the first target value and the second target value based on the first product; the product of the first target value and the second target value and the first product satisfy a preset relationship;
  • the sixth determination unit is used to determine each split position on the template image with the first image size based on the second target value determined by the fifth determination unit; the size of each size segment delimited by the determined split positions is an integer multiple of the second target value;
  • the device also includes:
  • the recording module is used for recording the first target value determined by the fifth determining unit.
  • the fifth determining unit includes:
  • a first determination subunit configured to determine at least one reference parameter group based on the first product; each reference parameter group in the at least one reference parameter group includes two values, and the product of the two values in any reference parameter group satisfies the preset relationship with the first product;
  • a selection subunit for selecting a first reference parameter group from at least one reference parameter group determined by the first determination subunit
  • the second determination subunit is configured to use one value in the first reference parameter group selected by the selection subunit as the first target value, and to use the other value in the first reference parameter group selected by the selection subunit as the second target value.
  • the selection subunit is specifically configured to calculate the total alignment cost value of each reference parameter group in the at least one reference parameter group separately, and to select the reference parameter group with the smallest total alignment cost value as the first reference parameter group.
  • the selection subunit is specifically configured to determine, based on the larger of the two values included in any reference parameter group in the at least one reference parameter group, each split position in the preset direction of the template image, and to calculate the alignment cost value of each split position; and to calculate the sum of the obtained alignment cost values to obtain the total alignment cost value of that reference parameter group.
  • the selection subunit is specifically configured to obtain the coordinate in the preset direction among the coordinates of any split position; calculate the remainder of the obtained coordinate and the larger of the two values included in the reference parameter group; calculate the ratio of the first product to the larger of the two values included in the reference parameter group; calculate the difference between the ratio and the preset value; and, based on the preset adjustment factor, perform a weighted summation of the remainder result and the difference to obtain the alignment cost value of the split position.
  • the electronic device may be either or both of the first device and the second device, or a stand-alone device independent of them that can communicate with the first device and the second device to receive the collected input signals from them.
  • FIG. 13 illustrates a block diagram of an electronic device according to an embodiment of the present disclosure.
  • the electronic device 1300 includes one or more processors 1301 and a memory 1302 .
  • Processor 1301 may be a central processing unit or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in electronic device 1300 to perform desired functions.
  • Memory 1302 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include, for example, random access memory (RAM) and/or cache memory, or the like.
  • the non-volatile memory may include, for example, read only memory (ROM), hard disk, flash memory, and the like.
  • One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 1301 may execute the program instructions to implement the image processing method or the method of generating an instruction for image processing in the various embodiments of the present disclosure described above.
  • Various contents such as input signals, signal components, noise components, etc. may also be stored in the computer-readable storage medium.
  • the electronic device 1300 may also include an input device 1303 and an output device 1304 interconnected by a bus system and/or other form of connection mechanism (not shown).
  • the input device 1303 may be a microphone or a microphone array.
  • the input device 1303 may be a communication network connector for receiving the collected input signals from the first device and the second device.
  • the input device 1303 may also include, for example, a keyboard, a mouse, and the like.
  • the output device 1304 can output various information to the outside, including the determined distance information, direction information, and the like.
  • Output devices 1304 may include, for example, displays, speakers, printers, and communication networks and their connected remote output devices, among others.
  • the electronic device 1300 may also include any other suitable components according to the specific application.
  • embodiments of the present disclosure may also be computer program products comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the image processing method or the method of generating an instruction for image processing according to various embodiments of the present disclosure described in the "exemplary methods" section above in this specification.
  • the program code of the computer program product for performing operations of embodiments of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
  • embodiments of the present disclosure may also be computer-readable storage media having computer program instructions stored thereon that, when executed by a processor, cause the processor to perform the steps in the image processing method or the method of generating an instruction for image processing according to various embodiments of the present disclosure described in the "exemplary methods" section above in this specification.
  • the computer-readable storage medium may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may include, for example, but not limited to, electrical, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses or devices, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above.
  • the methods and apparatus of the present disclosure may be implemented in many ways.
  • the methods and apparatus of the present disclosure may be implemented in software, hardware, firmware, or any combination of software, hardware, and firmware.
  • the above-described order of steps for the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise.
  • the present disclosure can also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing methods according to the present disclosure.
  • the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
  • each component or each step may be decomposed and/or recombined. These decompositions and/or recombinations should be considered equivalents of the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

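An image processing method, and a method and apparatus for generating instructions for image processing. The method includes: when an ROI image is obtained, splitting to obtain multiple image blocks based on a first image size supported by an image processing model, first split data, and the obtained ROI image, where each of the multiple image sizes obtained by splitting the first image size based on the first split data matches the hardware output size of an image scaling module (201); scaling each image block separately to obtain multiple scaled image blocks, where the image sizes of the multiple scaled image blocks are consistent, in one-to-one correspondence, with those multiple image sizes (202); and inputting the multiple scaled image blocks into the image processing model in sequence (203). With this method, even if the output of the image scaling module is limited, the subsequent flow involved in the visual image processing technology can still execute normally.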

Description

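Image processing method, and method and apparatus for generating instructions for image processing
The present disclosure claims priority to Chinese patent application No. CN 202010765242.5, filed with the China Patent Office on July 31, 2020 and entitled "Image processing method, and method and apparatus for generating instructions for image processing", the entire contents of which are incorporated into the present disclosure by reference.
Technical Field
The present disclosure relates to the technical field of image processing, and in particular to an image processing method, and a method and apparatus for generating instructions for image processing.
Background
When visual image processing technology is used, full-image detection produces a region of interest (ROI) image for a specific task. The ROI image can then be scaled by an image scaling module, and the scaled image is input into an image processing model for processing. In many cases, because the output of the image scaling module is limited, the module does not scale some ROI images at all and only outputs an error prompt, so the subsequent flow involved in the visual image processing technology cannot execute normally.
Summary
The present disclosure is proposed to solve the above technical problem. Embodiments of the present disclosure provide an image processing method, and a method and apparatus for generating instructions for image processing.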
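According to one aspect of the embodiments of the present disclosure, an image processing method is provided, including:
when a region of interest (ROI) image is obtained, splitting to obtain multiple image blocks based on a first image size supported by an image processing model, first split data, and the obtained ROI image, where each of the multiple image sizes obtained by splitting the first image size based on the first split data matches the hardware output size of an image scaling module;
scaling each of the multiple image blocks separately to obtain multiple scaled image blocks, where the image sizes of the multiple scaled image blocks are consistent, in one-to-one correspondence, with the multiple image sizes obtained by splitting the first image size based on the first split data; and
inputting the multiple scaled image blocks into the image processing model in sequence.
According to another aspect of the embodiments of the present disclosure, a method for generating instructions for image processing is provided, including:
when the hardware output size of an image scaling module does not match a first image size supported by an image processing model, determining, based on the hardware output size and the first image size, split mode information for a template image having the first image size, where the split mode information is used to split the template image into multiple image blocks, and the image size of each of the multiple image blocks matches the hardware output size;
obtaining first split data based on the split mode information; and
generating, based on the first split data, instructions for image processing, where the instructions for image processing are used to execute the above image processing method.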
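According to yet another aspect of the embodiments of the present disclosure, an image processing apparatus is provided, including:
a splitting module, configured to, when a region of interest (ROI) image is obtained, split to obtain multiple image blocks based on a first image size supported by an image processing model, first split data, and the obtained ROI image, where each of the multiple image sizes obtained by splitting the first image size based on the first split data matches the hardware output size of an image scaling module;
an image scaling module, configured to scale each of the multiple image blocks obtained by the splitting module separately to obtain multiple scaled image blocks, where the image sizes of the multiple scaled image blocks are consistent, in one-to-one correspondence, with the multiple image sizes obtained by splitting the first image size based on the first split data; and
an input module, configured to input the multiple scaled image blocks obtained by the image scaling module into the image processing model in sequence.
According to still another aspect of the embodiments of the present disclosure, an apparatus for generating instructions for image processing is provided, including:
a determination module, configured to, when the hardware output size of an image scaling module does not match a first image size supported by an image processing model, determine, based on the hardware output size and the first image size, split mode information for a template image having the first image size, where the split mode information is used to split the template image into multiple image blocks, and the image size of each of the multiple image blocks matches the hardware output size;
an acquisition module, configured to obtain first split data based on the split mode information determined by the determination module; and
a generation module, configured to generate, based on the first split data obtained by the acquisition module, instructions for image processing, where the instructions for image processing are used to execute the above image processing method.
According to still another aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided. The storage medium stores a computer program, and the computer program is used to execute the above image processing method or the above method for generating instructions for image processing.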
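According to still another aspect of the embodiments of the present disclosure, an electronic device is provided, including:
a processor; and
a memory for storing instructions executable by the processor,
where the processor is configured to read the executable instructions from the memory and execute them to implement the above image processing method or the above method for generating instructions for image processing.
Based on the image processing method, the method for generating instructions for image processing, the apparatuses, the computer-readable storage medium, and the electronic device provided by the above embodiments of the present disclosure, when an ROI image is obtained, multiple image blocks can be obtained by splitting based on the first image size supported by the image processing model, the first split data, and the obtained ROI image. Each of the multiple image blocks can then be scaled separately to obtain multiple scaled image blocks. Because the image sizes of the scaled image blocks are consistent, in one-to-one correspondence, with the image sizes obtained by splitting the first image size based on the first split data, the image size of every scaled image block matches the hardware output size, and every scaled image block can be output normally from the image scaling module. The scaled image blocks can then be input into the image processing model in sequence. Thus, in the embodiments of the present disclosure, even if the output of the image scaling module is limited, using the first split data together with the image splitting and image scaling operations allows multiple scaled image blocks to be output normally from the image scaling module and provided to the image processing model for normal processing, which ensures normal execution of the subsequent flow involved in the visual image processing technology.
The technical solutions of the present disclosure are described in further detail below with reference to the drawings and embodiments.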
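Brief Description of the Drawings
The above and other objects, features, and advantages of the present disclosure will become more apparent from the more detailed description of its embodiments in conjunction with the drawings. The drawings provide a further understanding of the embodiments of the present disclosure and constitute a part of the specification; together with the embodiments they serve to explain the present disclosure and do not limit it. In the drawings, the same reference numerals generally denote the same components or steps.
FIG. 1 is an overall schematic diagram of the principle of an embodiment of the present disclosure.
FIG. 2 is a schematic flowchart of an image processing method provided by an exemplary embodiment of the present disclosure.
FIG. 3 is a schematic diagram of splitting in the running stage in an embodiment of the present disclosure.
FIG. 4 is a schematic flowchart of an image processing method provided by another exemplary embodiment of the present disclosure.
FIG. 5A is a workflow diagram of a system for implementing visual image processing technology in the related art.
FIG. 5B is a workflow diagram of a system for implementing visual image processing technology in an embodiment of the present disclosure.
FIG. 6 is a schematic flowchart of a method for generating instructions for image processing provided by an exemplary embodiment of the present disclosure.
FIG. 7 is a schematic flowchart of a method for generating instructions for image processing provided by another exemplary embodiment of the present disclosure.
FIG. 8 is another overall schematic diagram of the principle of an embodiment of the present disclosure.
FIG. 9 is a schematic structural diagram of an image processing apparatus provided by an exemplary embodiment of the present disclosure.
FIG. 10 is a schematic structural diagram of an image processing apparatus provided by another exemplary embodiment of the present disclosure.
FIG. 11 is a schematic structural diagram of an apparatus for generating instructions for image processing provided by an exemplary embodiment of the present disclosure.
FIG. 12 is a schematic structural diagram of an apparatus for generating instructions for image processing provided by another exemplary embodiment of the present disclosure.
FIG. 13 is a structural diagram of an electronic device provided by an exemplary embodiment of the present disclosure.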
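Detailed Description
Exemplary embodiments according to the present disclosure are described in detail below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present disclosure, and the present disclosure is not limited by the exemplary embodiments.
It should be noted that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present disclosure.
Those skilled in the art will understand that terms such as "first" and "second" in the embodiments of the present disclosure are only used to distinguish different steps, devices, modules, and the like; they carry no particular technical meaning and imply no necessary logical order. "Multiple" means two or more, and "at least one" means one, two, or more.
It should also be understood that any component, data, or structure mentioned in the embodiments of the present disclosure can generally be understood as one or more, unless it is explicitly limited or the context indicates otherwise.
In addition, the term "and/or" in the present disclosure merely describes an association between associated objects and indicates that three relationships are possible. For example, A and/or B can mean A alone, both A and B, or B alone. The character "/" in the present disclosure generally indicates an "or" relationship between the objects before and after it.
It should also be understood that the description of the embodiments emphasizes the differences between them; for their identical or similar parts, the embodiments can be referred to one another and, for brevity, are not repeated. For ease of description, the parts shown in the drawings are not drawn to actual scale.
The following description of at least one exemplary embodiment is merely illustrative and in no way limits the present disclosure or its application or use. Techniques, methods, and devices known to those of ordinary skill in the relevant art are not discussed in detail, but where appropriate they should be regarded as part of the specification.
It should be noted that similar reference numerals and letters denote similar items in the following drawings, so once an item is defined in one drawing it need not be discussed further in subsequent drawings.
The embodiments of the present disclosure can be applied to electronic devices such as terminal devices, computer systems, and servers, which can operate with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with such electronic devices include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems.
Electronic devices such as terminal devices, computer systems, and servers can be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. Generally, program modules can include routines, programs, object programs, components, logic, data structures, and so on, which perform specific tasks or implement specific abstract data types. The computer system or server can be implemented in a distributed cloud computing environment, in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules can be located on local or remote computing system storage media that include storage devices.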
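Overview of the Application
Visual image processing technology can be implemented by an artificial intelligence (AI) visual image processing system. The operations performed by the system can include image acquisition, image signal processing (ISP), image pyramid generation, full-image detection, ROI processing, post-processing, result output, and so on.
After full-image detection, an ROI image for a specific task is produced, for example an ROI image for a face detection task (a face image detected from the full image) or an ROI image for a pedestrian detection task. The ROI image produced by full-image detection needs to be provided to an image processing model so that the model can perform ROI processing. The image processing model can be a convolutional neural network model.
The ROI image produced by full-image detection is generated while the system is running, so its position and size cannot be predicted in advance. Because of choices made by the model designer, however, the image processing model generally imposes requirements on its input image, for example a size requirement. It is therefore necessary to scale the ROI image produced by full-image detection to that required size. Specifically, the scaling of the ROI image can be implemented by a hardware module, which can be called an image scaling module or a Resizer module. In theory, the input size of the image scaling module is the size of the ROI image produced by full-image detection, and the output size of the image scaling module equals the size that the image processing model requires of its input image.
It should be pointed out that, in practice, whether the AI visual image processing system is built from a central processing unit (CPU), a field-programmable gate array (FPGA), a graphics processing unit (GPU), or an application-specific integrated circuit (ASIC), the hardware resources of the image scaling module are limited and so is its output. In that case the image scaling module does not scale some ROI images and only outputs an error prompt, so the subsequent flow involved in the visual image processing technology cannot execute normally.
Exemplary System
As shown in FIG. 1, the embodiments of the present disclosure mainly include two stages: a compilation stage and a running stage. In the compilation stage, the method for generating instructions for image processing provided by the embodiments of the present disclosure can be executed to generate instructions for executing the image processing method provided by the embodiments. In the running stage, for any ROI image obtained by full-image detection, the image processing method provided by the embodiments can be executed based on the instructions generated in the compilation stage, which solves the problem that the subsequent flow involved in the visual image processing technology cannot execute normally because the output of the image scaling module is limited.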
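Exemplary Method
FIG. 2 is a schematic flowchart of an image processing method provided by an exemplary embodiment of the present disclosure. The method shown in FIG. 2 includes step 201, step 202, and step 203, which are described separately below.
Step 201: when an ROI image is obtained, split to obtain multiple image blocks based on the first image size supported by the image processing model, the first split data, and the obtained ROI image, where each of the multiple image sizes obtained by splitting the first image size based on the first split data matches the hardware output size of the image scaling module.
In step 201, an ROI image can first be obtained through full-image detection. Specifically, the full image shown on the left of FIG. 3 can be detected to obtain the image with height h in FIG. 3.
After the ROI image is obtained, the first image size supported by the image processing model and the first split data determined in the compilation stage can be obtained. Each of the multiple image sizes obtained by splitting the first image size based on the first split data matches the hardware output size of the image scaling module.
It should be noted that the first image size supported by the image processing model can represent the size the model requires of its input image, and can be, for example, 244×244. The hardware output size of the image scaling module can represent the maximum image size the module can output, and can be, for example, 128×72. An image size matching the hardware output size of the image scaling module can mean that the image size is smaller than the hardware output size; for example, the hardware output size is 128×72 and the image size is 76×76.
In a specific example, the first image size is the size of the image with height H shown on the right of FIG. 3, and the first split data can be used to split the first image size into 6 image sizes, namely the image sizes of ROI_0″, ROI_1″, ROI_2″, ROI_3″, ROI_4″, and ROI_5″. When the hardware output size of the image scaling module is 128×72, the image sizes of ROI_0″ to ROI_5″ all need to be smaller than 128×72.
After the first image size and the first split data are obtained, the object to be split and the way to split it can be determined based on the first image size, the first split data, and the obtained ROI image, and the actual splitting is performed accordingly to obtain multiple image blocks. If splitting the first image size based on the first split data yields N image sizes, the actual splitting can yield N image blocks, and the N image sizes can correspond one-to-one to the N image blocks.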
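Step 202: scale each of the multiple image blocks separately to obtain multiple scaled image blocks, where the image sizes of the multiple scaled image blocks are consistent, in one-to-one correspondence, with the multiple image sizes obtained by splitting the first image size based on the first split data.
In step 202, the image scaling module can be called to scale each of the multiple image blocks separately. During scaling, the image size of the scaled image block obtained from any image block must be consistent with the image size that corresponds to that block among the multiple image sizes obtained by splitting the first image size based on the first split data.
Still taking FIG. 3 as an example, the first split data is used to split the first image size into 6 image sizes, namely the image sizes of ROI_0″ to ROI_5″, and the image blocks obtained by splitting in step 201 are ROI_00″ to ROI_05″. By calling the image scaling module, ROI_00″ can be scaled to the image size of ROI_0″, ROI_01″ to the image size of ROI_1″, ROI_02″ to the image size of ROI_2″, ROI_03″ to the image size of ROI_3″, ROI_04″ to the image size of ROI_4″, and ROI_05″ to the image size of ROI_5″, giving the scaled image block corresponding to each of ROI_00″ to ROI_05″, that is, 6 scaled image blocks in total.
Step 203: input the multiple scaled image blocks into the image processing model in sequence.
In step 203, the multiple scaled image blocks can be input into the image processing model in sequence, so that the model processes each scaled image block separately, for example by face detection or pedestrian detection, to obtain a processing result for each scaled image block. The processing results of the individual scaled image blocks can form the final processing result. Because the image sizes of the scaled image blocks are consistent, in one-to-one correspondence, with the image sizes obtained by splitting the first image size based on the first split data, the final processing result is equivalent to the result the image processing model would produce for an image of the first image size, and it can then be used in the other flows involved in the visual image processing technology.
In the embodiments of the present disclosure, when an ROI image is obtained, multiple image blocks can be obtained by splitting based on the first image size supported by the image processing model, the first split data, and the obtained ROI image. Each image block can then be scaled separately to obtain multiple scaled image blocks. Because the image sizes of the scaled image blocks are consistent, in one-to-one correspondence, with the image sizes obtained by splitting the first image size based on the first split data, the image size of every scaled image block matches the hardware output size, every scaled image block can be output normally from the image scaling module, and the scaled image blocks can then be input into the image processing model in sequence. Thus, even if the output of the image scaling module is limited, using the first split data together with the image splitting and image scaling operations allows multiple scaled image blocks to be output normally from the image scaling module and provided to the image processing model for normal processing, which ensures normal execution of the subsequent flow involved in the visual image processing technology.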
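As a rough illustration of steps 201 to 203, the following Python sketch splits an ROI along the height direction in the template's proportions, scales each block to its template size, and passes the blocks to a model one by one. It is only a sketch under stated assumptions, not the disclosed implementation: `resize_nearest` is a software stand-in for the hardware image scaling module, `run_model` is a hypothetical callable standing in for the image processing model, images are plain lists of pixel rows, and every mapped block is assumed to contain at least one row.

```python
from typing import Callable, List

Image = List[List[int]]  # image as rows of pixel values (assumption of this sketch)


def resize_nearest(img: Image, out_w: int, out_h: int) -> Image:
    """Software stand-in for the image scaling module (nearest neighbour)."""
    in_h, in_w = len(img), len(img[0])
    return [[img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def process_roi(roi: Image, template_heights: List[int], model_w: int,
                run_model: Callable[[Image], object]) -> list:
    """Split by height (step 201), scale each block (202), feed the model (203)."""
    total_h = sum(template_heights)   # H: height of the model's input size
    roi_h = len(roi)                  # h: height of the ROI to be split
    results = []
    top = 0
    acc = 0
    for seg_h in template_heights:
        acc += seg_h
        bottom = acc * roi_h // total_h       # template cut mapped onto the ROI
        block = roi[top:bottom]               # one image block of the ROI
        scaled = resize_nearest(block, model_w, seg_h)  # block -> template size
        results.append(run_model(scaled))     # blocks are fed in sequence
        top = bottom
    return results
```

For instance, a 6-row ROI split against template segment heights `[4, 4, 4]` yields three 2-row blocks, each scaled to 4 rows before being handed to `run_model`.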
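As shown in FIG. 4, on the basis of the embodiment shown in FIG. 2, step 201 includes step 2011, step 2012, and step 2013.
Step 2011: if the obtained ROI image satisfies a specified image alignment condition, use the obtained ROI image as the to-be-split ROI image; otherwise, perform image adjustment based on the obtained ROI image to obtain a to-be-split ROI image that satisfies the specified image alignment condition.
Here, after the ROI image is obtained, it can be determined whether the obtained ROI image satisfies the specified image alignment condition. Specific ways of making this determination are described below by example.
In one specific implementation, the method further includes:
acquiring a first target value, where the first target value is determined based on a divisor, specified by the image scaling module, of the value of the size in a preset direction, and on the value of the size in the preset direction in the first image size;
judging whether the value of the size of the obtained ROI image in the preset direction is an integer multiple of the first target value, to obtain a first judgment result; and
determining, based on the first judgment result, whether the obtained ROI image satisfies the specified image alignment condition.
It should be noted that the first target value can be a value determined in the compilation stage based on the divisor, specified by the image scaling module, of the value of the size in the preset direction, and on the value of the size in the preset direction in the first image size. This value represents a user alignment requirement (for ease of distinction, hereinafter called the first user alignment requirement, which can be regarded as a size alignment requirement). Here, user alignment can also be called user_alignment. For how the first target value is determined, refer to the corresponding part of the compilation stage described below; it is not expanded here.
Here, the preset direction can include at least one of the width direction and the height direction. Because the processing for the width direction and the height direction can be similar, the embodiments of the present disclosure only take the case where the preset direction is the height direction as an example, in which case the size in the preset direction is the height.
In this implementation, a runtime application programming interface (Runtime API) can be called to obtain the first target value determined in the compilation stage, and it is judged whether the height of the obtained ROI image is an integer multiple of the first target value, to obtain the first judgment result. If the first judgment result indicates that the height of the obtained ROI image is an integer multiple of the first target value, the obtained ROI image can be considered to satisfy the first user alignment requirement and can be split correctly during actual splitting, so it can be determined that the obtained ROI image satisfies the specified image alignment condition; otherwise, it can be determined that the obtained ROI image does not satisfy the specified image alignment condition.
In this implementation, based on the first target value, whether the obtained ROI image satisfies the first user alignment requirement can be evaluated accurately, so whether it satisfies the specified image alignment condition can be determined reliably.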
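In another specific implementation, the method further includes:
acquiring the coordinates of a preset position on the obtained ROI image;
judging whether the coordinates of the preset position have a coordinate attribute specified by the image scaling module, to obtain a second judgment result; and
determining, based on the second judgment result, whether the obtained ROI image satisfies the specified image alignment condition.
It should be noted that the coordinate attribute specified by the image scaling module can represent another user alignment requirement (for ease of distinction, hereinafter called the second user alignment requirement, which can be regarded as a coordinate alignment requirement). Optionally, the coordinate attribute specified by the image scaling module can be that the coordinates of the preset position are even, where the preset position can be the top-left corner, the top-right corner, or another position.
In this implementation, after the coordinates of the preset position on the obtained ROI image are acquired, it can be judged whether they have the coordinate attribute specified by the image scaling module, to obtain the second judgment result. If the second judgment result indicates that the coordinates of the preset position have the specified coordinate attribute, the obtained ROI image can be considered to satisfy the second user alignment requirement and can be processed normally by the image scaling module, so it can be determined that the obtained ROI image satisfies the specified image alignment condition; otherwise, it can be determined that it does not.
In this implementation, based on the coordinates of the preset position on the obtained ROI image, whether the image satisfies the second user alignment requirement can be evaluated accurately, so whether it satisfies the specified image alignment condition can be determined reliably.
It should be pointed out that the above two implementations can also be combined. For example, the obtained ROI image can be determined to satisfy the specified image alignment condition when its height is an integer multiple of the first target value and it has the coordinate attribute specified by the image scaling module; otherwise, it is determined not to satisfy the specified image alignment condition.
After it is determined whether the obtained ROI image satisfies the specified image alignment condition, if the result is yes, the obtained ROI image can be used directly as the to-be-split ROI image; if the result is no, image adjustment can be performed on the basis of the obtained ROI image to obtain a to-be-split ROI image that satisfies the specified image alignment condition. In the latter case, a new ROI image can be taken from the full image, based on the position of the obtained ROI image on the full image, and used as the to-be-split ROI image; alternatively, the obtained ROI image can be cropped slightly and the cropped ROI image used as the to-be-split ROI image.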
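The combined check described above can be sketched in Python as follows. The `Roi` record and the function names are hypothetical, introduced only for illustration; the coordinate attribute is assumed to be "top-left coordinates are even", which the text gives as one optional example, and `first_target_value` plays the role of the value recorded at compile time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Roi:
    """Hypothetical ROI record: top-left corner plus size, in pixels."""
    x: int
    y: int
    width: int
    height: int


def size_aligned(roi: Roi, first_target_value: int) -> bool:
    """First check: the height is an integer multiple of the first target value."""
    return roi.height % first_target_value == 0


def coords_aligned(roi: Roi) -> bool:
    """Second check (assumed attribute): the top-left coordinates are even."""
    return roi.x % 2 == 0 and roi.y % 2 == 0


def meets_alignment_condition(roi: Roi, first_target_value: int) -> bool:
    """Combined form: both user alignment requirements must hold."""
    return size_aligned(roi, first_target_value) and coords_aligned(roi)
```

Either check can also be used on its own, matching the two separate implementations in the text.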
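One way to realize the "slight cropping" adjustment is sketched below in Python. This is an illustrative assumption rather than the disclosed procedure: the hypothetical `align_roi` moves the top-left corner to the next even coordinate and trims the height down to a multiple of the first target value `a`; the text equally allows taking a new ROI from the full image instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Roi:
    """Hypothetical ROI record: top-left corner plus size, in pixels."""
    x: int
    y: int
    width: int
    height: int


def align_roi(roi: Roi, a: int) -> Roi:
    """Crop the ROI slightly so that it meets both alignment requirements."""
    # Move the top-left corner right/down to the next even coordinate.
    x = roi.x + (roi.x % 2)
    y = roi.y + (roi.y % 2)
    width = roi.width - (x - roi.x)
    height = roi.height - (y - roi.y)
    # Trim the bottom so the height becomes an integer multiple of a.
    height -= height % a
    if width <= 0 or height <= 0:
        raise ValueError("ROI too small to be aligned by cropping")
    return Roi(x, y, width, height)
```

An ROI that already satisfies both requirements is returned unchanged, which matches the "use it directly" branch of step 2011.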
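Step 2012: determine second split data based on the first image size, the first split data, and the second image size of the to-be-split ROI image.
In one specific implementation, step 2012 includes:
determining, based on the first split data, the split positions in the preset direction of a template image having the first image size;
determining the ratio between the size in the preset direction in the second image size of the to-be-split ROI image and the size in the preset direction in the first image size;
determining, based on the ratio and the split positions in the preset direction of the template image, the split positions in the preset direction of the to-be-split ROI image; and
determining the second split data based on the split positions in the preset direction of the to-be-split ROI image.
Here, the first split data can include coordinate information of the split positions in the height direction; and/or the first split data can include the ratio between the size segments obtained by splitting in the height direction (for example 3:3:2 or 3:4:3). In this way, based on the first split data, the split positions in the height direction of the template image having the first image size can be located conveniently and reliably.
Here, the ratio between the height of the to-be-split ROI image and the height in the first image size can also be determined. Assuming that the to-be-split ROI image is the image with height h shown on the left of FIG. 3 and the template image having the first image size is the image with height H shown on the right of FIG. 3, the ratio is h/H.
After the split positions in the height direction of the template image and the ratio between the two heights are determined, the split positions in the height direction of the template image can be mapped onto the to-be-split ROI image according to that ratio, giving mapped positions in the height direction of the to-be-split ROI image. Each mapped position can serve as one split position in the height direction of the to-be-split ROI image.
Still taking FIG. 3 as an example, there can be two split positions in the height direction of the template image, position P11 and position P12. Position P11 can be mapped onto the to-be-split ROI image at position P21, and position P12 at position P22. It should be pointed out that positions P11 and P12 divide H into three segments, H_1, H_2, and H − H_1 − H_2, and positions P21 and P22 divide h into three segments, h_1, h_2, and h − h_1 − h_2. Because the mapping follows the ratio, h_1 and h_2 satisfy h_1 = H_1·h/H and h_2 = H_2·h/H.
After the split positions in the height direction of the to-be-split ROI image are determined, the second split data can be determined accordingly. Optionally, the second split data can include coordinate information of the split positions in the height direction of the to-be-split ROI image; and/or the second split data can include the ratio between the size segments obtained by splitting in the height direction of the to-be-split ROI image, for example h_1 : h_2 : (h − h_1 − h_2).
In this implementation, based on the first split data, the split positions on the template image having the first image size can be located accurately and reliably. Combined with the ratio between the to-be-split ROI image and the first image size, the split positions on the to-be-split ROI image can then be located accurately and reliably, so that the second split data can be obtained from them.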
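When the first split data carries only a segment ratio such as 3:3:2, the split positions on the template image follow directly from the template height. The Python sketch below shows this conversion; the function name `positions_from_ratio` is hypothetical, and the sketch assumes the template height divides evenly by the sum of the ratio terms.

```python
from typing import List, Sequence


def positions_from_ratio(total: int, ratio: Sequence[int]) -> List[int]:
    """Turn a segment ratio (e.g. 3:3:2) into cut coordinates along one axis."""
    unit, rest = divmod(total, sum(ratio))
    if rest:
        raise ValueError("total is not divisible by the sum of the ratio terms")
    positions = []
    acc = 0
    for part in ratio[:-1]:        # the last segment ends at `total`, not at a cut
        acc += part * unit
        positions.append(acc)
    return positions
```

A template height of 64 with ratio 3:3:2 gives segments of 24, 24, and 16, hence cuts at 24 and 48.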
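The proportional mapping h_1 = H_1·h/H can be sketched in Python as below. The function name is hypothetical, and the sketch makes one assumption explicit: it refuses to round, raising an error when a mapped position is not an integer, since the alignment values chosen at compile time are meant to make every mapped position exact.

```python
from typing import List, Sequence


def map_split_positions(template_positions: Sequence[int],
                        template_height: int, roi_height: int) -> List[int]:
    """Map cut positions on the template (height H) onto the ROI (height h)."""
    mapped = []
    for pos in template_positions:
        quotient, rest = divmod(pos * roi_height, template_height)
        if rest:
            raise ValueError(f"position {pos} does not map to an integer row")
        mapped.append(quotient)    # pos * h / H
    return mapped
```

With H = 66, template cuts at 24 and 48, and an ROI of height h = 22, the mapped cuts are 8 and 16.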
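Step 2013: split the to-be-split ROI image according to the second split data to obtain multiple image blocks.
Because the split positions in the height direction of the to-be-split ROI image are closely tied to the second split data, the image can be split at those positions in the height direction according to the second split data. The width direction can be split in a similar way, so that multiple image blocks are obtained from the splits in the height direction and the width direction.
In the embodiments of the present disclosure, a to-be-split ROI image that satisfies the specified image alignment condition is first obtained from the obtained ROI image; the second split data, which indicates how to split, is then determined; and the to-be-split ROI image is finally split according to the second split data. In this way the object being split satisfies the user alignment requirements, and the split data used in the actual splitting fits that object, which ensures that the splitting operation is carried out correctly and effectively.
It should be pointed out that, in the related art, as shown in FIG. 5A, the workflow of an AI visual image processing system can be as follows. Image acquisition, ISP, image pyramid generation, and full-image detection are performed in sequence to produce an ROI image for a specific task. Before that ROI image is provided to the image scaling module, the system predicts whether the scaled image obtained when the image scaling module scales it (denoted ROI′) satisfies the limit. If it does, the ROI image is sent to the image scaling module, which performs the scaling to obtain ROI′; the image processing model processes ROI′, and operations such as post-processing and result output are then performed. If it does not, an error prompt such as an error message, an alarm signal, or a warning signal is output. It is easy to see that, when the output of the image scaling module is limited, the related art only outputs a prompt signal and takes no effective measure to solve the problem.
Compared with the related art, in the embodiments of the present disclosure, as shown in FIG. 5B, the workflow of the AI visual image processing system can differ as follows: if ROI′ is predicted not to satisfy the limit, alignment processing (to satisfy the user alignment requirements described above) and splitting processing (for example splitting into several parts of a size the image scaling module can output) can be performed. In this way, the embodiments of the present disclosure ensure normal execution of the subsequent flow involved in the visual image processing technology even when the output of the image scaling module is limited.
Any image processing method provided by the embodiments of the present disclosure can be executed by any suitable device with data processing capability, including but not limited to terminal devices and servers. Alternatively, any image processing method provided by the embodiments of the present disclosure can be executed by a processor, for example by the processor calling corresponding instructions stored in a memory. This is not repeated below.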
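Splitting along both directions can be sketched in Python as a grid cut. The helper below is illustrative only: images are assumed to be lists of pixel rows, `row_cuts` and `col_cuts` are hypothetical names for the split positions in the height and width directions, and blocks are returned in row-major order.

```python
from typing import List, Sequence

Image = List[List[int]]  # image as rows of pixel values (assumption of this sketch)


def split_blocks(img: Image, row_cuts: Sequence[int],
                 col_cuts: Sequence[int]) -> List[Image]:
    """Cut an image at the given row and column positions into a grid of blocks."""
    height, width = len(img), len(img[0])
    ys = [0, *row_cuts, height]    # block boundaries in the height direction
    xs = [0, *col_cuts, width]     # block boundaries in the width direction
    return [
        [row[x0:x1] for row in img[y0:y1]]
        for y0, y1 in zip(ys, ys[1:])
        for x0, x1 in zip(xs, xs[1:])
    ]
```

Two cuts in the height direction and one in the width direction give 3 × 2 = 6 blocks, the same count as in the FIG. 3 example.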
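FIG. 6 is a schematic flowchart of a method for generating instructions for image processing provided by an exemplary embodiment of the present disclosure. The method shown in FIG. 6 includes step 601, step 602, and step 603, which are described separately below.
Step 601: when the hardware output size of the image scaling module does not match the first image size supported by the image processing model, determine, based on the hardware output size and the first image size, split mode information for a template image having the first image size, where the split mode information is used to split the template image into multiple image blocks, and the image size of each of the multiple image blocks matches the hardware output size.
It should be noted that the hardware output size of the image scaling module can represent the maximum image size the module can output, and the first image size supported by the image processing model can represent the size the model requires of its input image. The hardware output size not matching the first image size can mean that the hardware output size is smaller than the first image size; for example, the hardware output size is 128×72 and the first image size is 244×244. An image size matching the hardware output size can mean that the image size is smaller than the hardware output size.
When the hardware output size of the image scaling module does not match the first image size supported by the image processing model, it can be determined, based on the hardware output size and the first image size, how to split the template image of the first image size (for example the image with height H shown on the right of FIG. 3) so that the image size of each resulting image block matches the hardware output size of the image scaling module. The corresponding split mode information is thereby obtained.
Optionally, the split mode information can include coordinate information of the split positions in the width direction and in the height direction; and/or it can include the ratio between the size segments obtained by splitting in the width direction and the ratio between the size segments obtained by splitting in the height direction.
Optionally, when the split mode information is determined, factors other than the hardware output size and the first image size can also be taken into account, for example the specific instruction types that use the template image, the storage location of the template image in memory, and the compiler's rules for allocating and managing memory at run time.
Step 602: obtain the first split data based on the split mode information.
Here, data including all the information in the split mode information can be used directly as the first split data. Alternatively, part of the information can be extracted from the split mode information, and data including the extracted part can be used as the first split data. For example, when the split mode information includes both the split position coordinate information and the ratio between size segments, the split position coordinate information can be extracted and data including it used as the first split data.
Step 603: generate, based on the first split data, instructions for image processing, where the instructions are used to execute the image processing method (specifically the image processing method disclosed in the above embodiments).
It should be noted that steps 601 to 603 can all be executed by a compiler.
In the embodiments of the present disclosure, when the hardware output size of the image scaling module does not match the first image size supported by the image processing model, the split mode information for the template image having the first image size can be determined based on the hardware output size and the first image size, the first split data can be obtained based on the split mode information, and instructions for image processing can then be generated based on the first split data. These instructions can be used to execute the above image processing method. In this way, even if the output of the image scaling module is limited, executing the above image processing method makes use of the first split data, together with the image splitting and image scaling operations, to ensure normal execution of the subsequent flow involved in the visual image processing technology.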
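As shown in FIG. 7, on the basis of the embodiment shown in FIG. 6, step 601 includes step 6011, step 6012, and step 6013.
Step 6011: when the image scaling module specifies a divisor of the value of the size in a preset direction, calculate a first product, where the first product is the product of the divisor and the value of the size in the preset direction in the first image size.
Here, the preset direction can include at least one of the width direction and the height direction. Because the processing for the width direction and the height direction can be similar, the embodiments of the present disclosure only take the case where the preset direction is the height direction as an example, in which case the size in the preset direction is the height.
Here, the divisor of the height specified by the image scaling module can be denoted c, and the value of the height in the first image size can be denoted H as in FIG. 3, so the first product can be expressed as cH, where c can be regarded as the hardware alignment requirement of the image scaling module.
Step 6012: determine, based on the first product, the split positions on the template image having the first image size.
In one specific implementation,
step 6012 includes:
determining a first target value and a second target value based on the first product, where the product of the first target value and the second target value satisfies a preset relationship with the first product; and
determining, based on the second target value, the split positions on the template image having the first image size, where, in the preset direction of the template image, the value of the size of each size segment delimited by the determined split positions is an integer multiple of the second target value;
and the method further includes:
recording the first target value.
It should be noted that a product satisfying the preset relationship with the first product can mean that the product equals the first product.
In this implementation, after the first target value and the second target value are determined based on the first product, the first target value can be recorded, so that it can be obtained in the running stage and used for the determination related to the specified image alignment condition. In addition, the split positions on the template image having the first image size can be determined based on the second target value. When doing so, it must be ensured that, in the height direction of the template image, the value of the size of each size segment delimited by the determined split positions is an integer multiple of the second target value; for FIG. 3, for example, H_1, H_2, and H − H_1 − H_2 must all be integer multiples of the second target value.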
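Assume that FIG. 3 satisfies the following conditions:
(1) H_1 and H_2 satisfy A alignment, so H_1 = n_1·A and H_2 = n_2·A, that is, H_1 and H_2 are both integer multiples of A;
(2) h satisfies a alignment, so h = n_3·a, that is, h is an integer multiple of a;
(3) h_1 at least satisfies the hardware alignment requirement of the image scaling module, so h_1 = n_4·c, that is, h_1 is an integer multiple of c;
(4) the image scaling module scales the image with height h, shown on the left, proportionally.
Substituting these into h_1 = H_1·h/H gives:
$$n_4 c = \frac{n_1 A \cdot n_3 a}{H}$$
Simplifying further gives:
$$n_4 = \frac{n_1 n_3 \, aA}{cH}$$
Because n_1 and n_3 are positive integers that cannot be controlled, ensuring that h_1 satisfies c alignment requires n_4 to be a positive integer, so it must be ensured that
$$\frac{aA}{cH}$$
is a positive integer. In the case of the strictest alignment requirement, n_1 = n_3 = n_4 = 1, and the above expression simplifies further to:
aA=cH
It should be noted that A can be regarded as the size alignment value of the compiler's split points (which correspond to the split positions), and a as the size alignment value of the ROI image provided to the image scaling module (for example the to-be-split ROI image above). Working backwards from the above formula: as long as the product of the compiler split-point alignment value A and the alignment value a of the ROI image that the user provides to the image scaling module equals the product of the hardware alignment requirement c of the image scaling module and the size in the corresponding direction in the first image size supported by the image processing model (here the height H; this product can be written cH and is the first product above), then when the split positions on the template image are mapped back onto the to-be-split ROI image, the split positions on the to-be-split ROI image satisfy the size alignment requirement (specifically, a alignment). Therefore, as long as the to-be-split ROI image satisfies a alignment, it can be computed correctly and efficiently.
In view of this, a can be used as the first target value and A as the second target value, which ensures that in the running stage the to-be-split ROI image can be split correctly and that the resulting image blocks satisfy the size alignment requirement, that is, the height of each resulting image block is an integer multiple of a, the first target value.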
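The role of the relation aA = cH can be checked numerically. The Python sketch below uses the symbols of the derivation (A, a, c, H, n_1, n_3) and hypothetical function names; it maps every A-aligned template cut onto a-aligned ROI heights and confirms that each mapped cut is an exact integer and a multiple of the hardware divisor c, which is condition (3) of the derivation. The bound `max_n3` on the tested ROI heights is an arbitrary choice for the sketch.

```python
def mapped_cut(n1: int, n3: int, A: int, a: int, H: int) -> int:
    """Map the template cut H_1 = n1*A onto an ROI of height h = n3*a."""
    numerator = (n1 * A) * (n3 * a)      # H_1 * h
    if numerator % H != 0:
        raise ValueError("mapped cut is not an integer row")
    return numerator // H                # h_1 = H_1 * h / H


def all_cuts_aligned(A: int, a: int, c: int, H: int, max_n3: int = 8) -> bool:
    """Check that aA = cH holds and that every mapped cut is a multiple of c."""
    if a * A != c * H:
        return False
    for n1 in range(1, H // A + 1):      # every A-aligned cut inside the template
        for n3 in range(1, max_n3 + 1):  # a range of a-aligned ROI heights
            if mapped_cut(n1, n3, A, a, H) % c != 0:
                return False
    return True
```

With c = 2 and H = 66, the pairs (A, a) = (12, 11) and (4, 33) both pass, because each mapped cut equals n_1·n_3·c.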
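Step 6013: determine, based on the determined split positions, the split mode information for the template image having the first image size.
After the split positions are determined, the corresponding split position coordinate information can be acquired to obtain split mode information that includes it. Of course, the ratio between the size segments can also be acquired to obtain split mode information that includes that ratio.
In the embodiments of the present disclosure, when the image scaling module specifies a divisor of the value of the size in the preset direction, the first product (cH above) can be calculated and used as guidance for determining a suitable first target value and second target value, which ensures that image splitting can be performed correctly in the running stage and that the split result satisfies the size alignment requirement.
In an optional example, determining the first target value and the second target value based on the first product includes:
determining at least one reference parameter group based on the first product, where each reference parameter group includes two values, and the product of the two values in any reference parameter group satisfies the preset relationship with the first product;
selecting a first reference parameter group from the at least one reference parameter group; and
using one value in the first reference parameter group as the first target value and the other value as the second target value.
Here, at least one reference parameter group can first be determined. Each reference parameter group includes two values, and the product of the two values in any reference parameter group equals the first product. Optionally, the at least one reference parameter group can be obtained by factorizing the first product.
Next, the first reference parameter group can be selected from the at least one reference parameter group. Optionally, one reference parameter group can be selected as the first reference parameter group according to a set rule, in which case selecting the first reference parameter group from the at least one reference parameter group can include:
calculating the total alignment cost of each of the at least one reference parameter group; and
selecting the reference parameter group with the smallest total alignment cost as the first reference parameter group.
Here, the total alignment cost of each reference parameter group can be calculated separately; the total alignment cost can include the compiler's alignment cost and the user's alignment cost. Next, by comparing the total alignment costs of the reference parameter groups, the group with the smallest total alignment cost can be identified and used as the first reference parameter group.
In this implementation, using the reference parameter group with the smallest total alignment cost as the first reference parameter group minimizes the cost paid while both the compiler and the user satisfy their respective alignment requirements.
Of course, the way of selecting the first reference parameter group is not limited to this. For example, one reference parameter group can be selected at random from the at least one reference parameter group as the first reference parameter group.
After the first reference parameter group is selected, the first target value and the second target value can be determined conveniently. Because the first reference parameter group is selected from the at least one reference parameter group, and the at least one reference parameter group is determined based on the first product, both the compiler and the user can satisfy their respective alignment requirements.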
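Obtaining the reference parameter groups by factorization can be sketched in Python as an enumeration of factor pairs. The function name is hypothetical; each returned pair lists the smaller factor first, and which value is used as A and which as a is left to the later selection step.

```python
from typing import List, Tuple


def reference_parameter_groups(first_product: int) -> List[Tuple[int, int]]:
    """Enumerate all pairs of positive integers whose product is first_product."""
    groups = []
    d = 1
    while d * d <= first_product:
        if first_product % d == 0:
            groups.append((d, first_product // d))  # smaller factor first
        d += 1
    return groups
```

For a first product of 132 this yields (1, 132), (2, 66), (3, 44), (4, 33), (6, 22), and (11, 12).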
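In an optional example, calculating the total alignment cost of each of the at least one reference parameter group includes:
determining, based on the larger of the two values included in any one of the at least one reference parameter group, the split positions in the preset direction of the template image, and calculating the alignment cost of each split position; and
summing the obtained alignment costs to obtain the total alignment cost of that reference parameter group.
Here, for any one of the at least one reference parameter group (say reference parameter group S), the larger of its two values (say A′) can first be determined. Next, A′ can be used to determine the split positions in the height direction of the template image; when doing so, it must be ensured that, in the height direction of the template image, the value of the size of each size segment delimited by the determined split positions is an integer multiple of A′. The alignment cost of each split position in the height direction of the template image can then be calculated.
In one specific implementation, calculating the alignment cost of each split position includes:
acquiring the coordinate in the preset direction among the coordinates of any split position;
calculating the remainder of the acquired coordinate divided by the larger of the two values included in the reference parameter group;
calculating the ratio of the first product to the larger of the two values included in the reference parameter group;
calculating the difference between the ratio and a preset value; and
performing, based on a preset adjustment factor, a weighted sum of the remainder and the difference to obtain the alignment cost of the split position.
Here, the preset value can be 1, 2, 3, or another value, and the preset adjustment factor can be denoted p.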
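Continuing with reference parameter group S, whose larger value is A′: assuming that, for a split position in the height direction of the template image determined using A′, the coordinate in the height direction is denoted y, the remainder of y divided by A′ can be calculated and expressed as roi.y % A′. In addition, the ratio of the first product cH to A′ can be calculated, as can the difference between that ratio and the preset value (assumed to be 1); the difference can be expressed as
$$\frac{cH}{A'} - 1$$
Then, based on the preset adjustment factor p, a weighted sum of the remainder roi.y % A′ and the difference
$$\frac{cH}{A'} - 1$$
can be computed. Specifically, in the weighted sum, p can be used as the weight of the remainder roi.y % A′, and p − 1 as the weight of the difference
$$\frac{cH}{A'} - 1$$
to obtain the alignment cost of the split position. This alignment cost can be denoted f(A′) and can be calculated with the following cost function:
$$f(A') = p \cdot (\mathrm{roi.y} \bmod A') + (p - 1)\left(\frac{cH}{A'} - 1\right)$$
It should be noted that, in the above cost function, the term before "+" is the compiler's alignment cost, and the term after "+" is the user's alignment cost in the theoretical worst case. With this cost function, the alignment cost of a single split position can be determined accurately and reliably.
Assuming that G split positions in the height direction of the template image are determined using A′, the alignment cost of each of these G split positions can be calculated in the above way to obtain G alignment costs, whose sum is then used as the total alignment cost of reference parameter group S. It should be pointed out that the total alignment costs of the reference parameter groups can subsequently be compared; if the total alignment cost of reference parameter group S is the smallest, S can be used as the first reference parameter group, in which case A′ can serve as the second target value, A, above, and
$$\frac{cH}{A'}$$
can serve as the first target value, a, above.
In the embodiments of the present disclosure, summing the alignment costs gives the total alignment cost of each reference parameter group conveniently and reliably, which helps determine a reasonable first target value and second target value.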
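The per-position cost and its sum over G split positions can be sketched in Python as follows. This is a hedged sketch rather than the disclosed cost function: the two weights are passed in explicitly as `w_remainder` and `w_difference` (hypothetical parameter names) instead of being derived from the adjustment factor p inside the function, and the preset value defaults to 1 as assumed in the text.

```python
from typing import Sequence


def alignment_cost(y: int, larger: int, first_product: int,
                   w_remainder: float, w_difference: float,
                   preset: int = 1) -> float:
    """Weighted sum of (y % A') and (cH / A' - preset) for one split position."""
    remainder = y % larger                      # compiler-side term: roi.y % A'
    difference = first_product / larger - preset  # user-side term: cH / A' - preset
    return w_remainder * remainder + w_difference * difference


def total_alignment_cost(ys: Sequence[int], larger: int, first_product: int,
                         w_remainder: float, w_difference: float,
                         preset: int = 1) -> float:
    """Sum the alignment costs of all split positions for one parameter group."""
    return sum(alignment_cost(y, larger, first_product,
                              w_remainder, w_difference, preset)
               for y in ys)
```

Evaluating `total_alignment_cost` for each reference parameter group and keeping the group with the smallest result corresponds to the selection rule described above.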
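It should be pointed out that, to keep the compiler from paying too high a cost for alignment, a heuristic can also be used to find a suitable A. For example, starting from the value 4, candidates are tried in turn: for each, it is determined whether both the compiler and the user can satisfy their respective alignment requirements when the template image is split based on it. If they can, the value is doubled, for example to 8, 16, and 32, to test whether the larger value also satisfies the alignment requirements, until the largest value not greater than 32 that satisfies the alignment requirements is found.
In a specific example, the first image size supported by the image processing model is 66×68, that is, the width is 66 and the height is 68, and the image scaling module requires the top-left coordinates, the height, and the width of the input image all to be even. Then, when splitting along the dimension of size 66, the compiler split-point alignment value A and the alignment value a of the ROI image provided to the image scaling module should satisfy:
aA=2*66=132
It is easy to see that if the compiler satisfies 2 alignment (A is 2), the user needs to satisfy 66 alignment (a is 66); if the compiler satisfies 4 alignment, the user needs to satisfy 33 alignment; and if the compiler satisfies 12 alignment, the user needs to satisfy 11 alignment. Optionally, when a suitable A is found heuristically, A can be 4, that is, the compiler satisfies 4 alignment and the user satisfies 33 alignment.
In an optional example, as shown in FIG. 8, in the compilation stage, the alignment requirements that the image scaling module must satisfy (for example that the top-left coordinates, the height, and the width are all even), the hardware output size of the image scaling module, and the first image size supported by the image processing model need to be provided to the compiler. From these, the compiler calculates the alignment A that the compiler must satisfy and the alignment a that the user must satisfy, records a, and splits the template image of the first image size to obtain the first split data. In the running stage, the user can obtain a through the Runtime API and then perform the corresponding alignment processing to obtain the to-be-split ROI image and carry out subsequent processing, which ensures that the to-be-split ROI image is split correctly and effectively.
In summary, in the embodiments of the present disclosure, the compiler can analyze the hardware limits of the image scaling module and the input requirements of the image processing model to work out how to satisfy the hardware limits at a small cost. The compiler can also provide splitting and alignment strategies, so that splitting and alignment are performed reasonably and the whole AI visual image processing system runs correctly and efficiently even when the hardware resources of the image scaling module are limited, which helps ensure a good user experience.
Any method for generating instructions for image processing provided by the embodiments of the present disclosure can be executed by any suitable device with data processing capability, including but not limited to terminal devices and servers. Alternatively, any such method can be executed by a processor, for example by the processor calling corresponding instructions stored in a memory. This is not repeated below.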
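The doubling heuristic can be sketched in Python as below. The test for "both sides can satisfy their alignment requirements" is not spelled out in the text, so this sketch substitutes a simple assumption: a candidate A is acceptable when it divides the first product cH, so that a = cH/A is an integer. The function name and defaults are hypothetical.

```python
from typing import Optional


def heuristic_second_target(first_product: int, start: int = 4,
                            limit: int = 32) -> Optional[int]:
    """Double the candidate from `start` while it stays acceptable and <= limit."""
    best = None
    candidate = start
    # Assumed acceptance test: the candidate divides cH exactly.
    while candidate <= limit and first_product % candidate == 0:
        best = candidate
        candidate *= 2
    return best
```

For a first product of 132 the search accepts 4 and stops at 8, returning 4; the matching user-side alignment is then 132 / 4 = 33.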
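Exemplary Apparatus
FIG. 9 is a schematic structural diagram of an image processing apparatus provided by an exemplary embodiment of the present disclosure. The apparatus shown in FIG. 9 includes a splitting module 901, an image scaling module 902, and an input module 903.
The splitting module 901 is configured to, when an ROI image is obtained, split to obtain multiple image blocks based on the first image size supported by the image processing model, the first split data, and the obtained ROI image, where each of the multiple image sizes obtained by splitting the first image size based on the first split data matches the hardware output size of the image scaling module 902.
The image scaling module 902 is configured to scale each of the multiple image blocks obtained by the splitting module 901 separately to obtain multiple scaled image blocks, where the image sizes of the multiple scaled image blocks are consistent, in one-to-one correspondence, with the multiple image sizes obtained by splitting the first image size based on the first split data.
The input module 903 is configured to input the multiple scaled image blocks obtained by the image scaling module 902 into the image processing model in sequence.
In an optional example, as shown in FIG. 10, the splitting module 901 includes:
a first acquisition submodule 9011, configured to use the obtained ROI image as the to-be-split ROI image if the obtained ROI image satisfies the specified image alignment condition, and otherwise to perform image adjustment based on the obtained ROI image to obtain a to-be-split ROI image that satisfies the specified image alignment condition;
a determination submodule 9012, configured to determine the second split data based on the first image size, the first split data, and the second image size of the to-be-split ROI image obtained by the first acquisition submodule 9011; and
a second acquisition submodule 9013, configured to split the to-be-split ROI image obtained by the first acquisition submodule 9011 according to the second split data determined by the determination submodule 9012, to obtain multiple image blocks.
In an optional example, the determination submodule 9012 includes:
a first determination unit, configured to determine, based on the first split data, the split positions in the preset direction of the template image having the first image size;
a second determination unit, configured to determine the ratio between the size in the preset direction in the second image size of the to-be-split ROI image and the size in the preset direction in the first image size;
a third determination unit, configured to determine the split positions in the preset direction of the to-be-split ROI image based on the ratio determined by the second determination unit and the split positions in the preset direction of the template image determined by the first determination unit; and
a fourth determination unit, configured to determine the second split data based on the split positions in the preset direction of the to-be-split ROI image determined by the third determination unit.
In an optional example,
as shown in FIG. 10, the apparatus further includes:
a first acquisition module 904, configured to acquire the first target value, where the first target value is determined based on the divisor, specified by the image scaling module 902, of the value of the size in the preset direction, and on the value of the size in the preset direction in the first image size;
a second acquisition module 905, configured to judge whether the value of the size of the obtained ROI image in the preset direction is an integer multiple of the first target value acquired by the first acquisition module 904, to obtain the first judgment result; and
a first determination module 906, configured to determine, based on the first judgment result obtained by the second acquisition module 905, whether the obtained ROI image satisfies the specified image alignment condition.
As shown in FIG. 10, the apparatus further includes:
a third acquisition module 907, configured to acquire the coordinates of the preset position on the obtained ROI image;
a fourth acquisition module 908, configured to judge whether the coordinates of the preset position acquired by the third acquisition module 907 have the coordinate attribute specified by the image scaling module 902, to obtain the second judgment result; and
a second determination module 909, configured to determine, based on the second judgment result obtained by the fourth acquisition module 908, whether the obtained ROI image satisfies the specified image alignment condition.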
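FIG. 11 is a schematic structural diagram of an apparatus for generating instructions for image processing provided by an exemplary embodiment of the present disclosure. The apparatus shown in FIG. 11 includes a determination module 1101, an acquisition module 1102, and a generation module 1103.
The determination module 1101 is configured to, when the hardware output size of the image scaling module does not match the first image size supported by the image processing model, determine, based on the hardware output size and the first image size, the split mode information for the template image having the first image size, where the split mode information is used to split the template image into multiple image blocks, and the image size of each of the multiple image blocks matches the hardware output size.
The acquisition module 1102 is configured to obtain the first split data based on the split mode information determined by the determination module 1101.
The generation module 1103 is configured to generate, based on the first split data obtained by the acquisition module 1102, instructions for image processing, where the instructions are used to execute the above image processing method.
In an optional example, as shown in FIG. 12, the determination module 1101 includes:
a calculation submodule 11011, configured to calculate the first product when the image scaling module specifies a divisor of the value of the size in the preset direction, where the first product is the product of the divisor and the value of the size in the preset direction in the first image size;
a first determination submodule 11012, configured to determine, based on the first product calculated by the calculation submodule 11011, the split positions on the template image having the first image size; and
a second determination submodule 11013, configured to determine, based on the split positions determined by the first determination submodule 11012, the split mode information for the template image having the first image size.
In an optional example,
the first determination submodule 11012 includes:
a fifth determination unit, configured to determine the first target value and the second target value based on the first product, where the product of the first target value and the second target value satisfies the preset relationship with the first product; and
a sixth determination unit, configured to determine, based on the second target value determined by the fifth determination unit, the split positions on the template image having the first image size, where, in the preset direction of the template image, the value of the size of each size segment delimited by the determined split positions is an integer multiple of the second target value;
and the apparatus further includes:
a recording module, configured to record the first target value determined by the fifth determination unit.
In an optional example, the fifth determination unit includes:
a first determination subunit, configured to determine at least one reference parameter group based on the first product, where each reference parameter group includes two values, and the product of the two values in any reference parameter group satisfies the preset relationship with the first product;
a selection subunit, configured to select the first reference parameter group from the at least one reference parameter group determined by the first determination subunit; and
a second determination subunit, configured to use one value in the first reference parameter group selected by the selection subunit as the first target value and the other value in that group as the second target value.
In an optional example, the selection subunit is specifically configured to calculate the total alignment cost of each of the at least one reference parameter group, and to select the reference parameter group with the smallest total alignment cost as the first reference parameter group.
In an optional example, the selection subunit is specifically configured to determine, based on the larger of the two values included in any one of the at least one reference parameter group, the split positions in the preset direction of the template image and calculate the alignment cost of each split position, and to sum the obtained alignment costs to obtain the total alignment cost of that reference parameter group.
In an optional example, the selection subunit is specifically configured to: acquire the coordinate in the preset direction among the coordinates of any split position; calculate the remainder of the acquired coordinate divided by the larger of the two values included in the reference parameter group; calculate the ratio of the first product to the larger of the two values included in the reference parameter group; calculate the difference between the ratio and the preset value; and perform, based on the preset adjustment factor, a weighted sum of the remainder and the difference to obtain the alignment cost of the split position.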
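Exemplary Electronic Device
An electronic device according to an embodiment of the present disclosure is described below with reference to FIG. 13. The electronic device can be either or both of a first device and a second device, or a stand-alone device independent of them; the stand-alone device can communicate with the first device and the second device to receive the collected input signals from them.
FIG. 13 is a block diagram of an electronic device according to an embodiment of the present disclosure.
As shown in FIG. 13, the electronic device 1300 includes one or more processors 1301 and a memory 1302.
The processor 1301 can be a central processing unit or another form of processing unit with data processing capability and/or instruction execution capability, and can control other components in the electronic device 1300 to perform desired functions.
The memory 1302 can include one or more computer program products, which can include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory can include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory can include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions can be stored on the computer-readable storage medium, and the processor 1301 can run the program instructions to implement the image processing method or the method for generating instructions for image processing in the embodiments of the present disclosure described above. Various contents such as input signals, signal components, and noise components can also be stored in the computer-readable storage medium.
In one example, the electronic device 1300 can further include an input device 1303 and an output device 1304, which are interconnected by a bus system and/or another form of connection mechanism (not shown).
For example, when the electronic device 1300 is the first device or the second device, the input device 1303 can be a microphone or a microphone array. When the electronic device 1300 is a stand-alone device, the input device 1303 can be a communication network connector for receiving the collected input signals from the first device and the second device.
In addition, the input device 1303 can also include, for example, a keyboard and a mouse. The output device 1304 can output various information to the outside, including the determined distance information, direction information, and so on. The output device 1304 can include, for example, a display, a speaker, a printer, and a communication network and the remote output devices connected to it.
Of course, for simplicity, FIG. 13 shows only some of the components of the electronic device 1300 that are relevant to the present disclosure and omits components such as buses and input/output interfaces. In addition, the electronic device 1300 can include any other appropriate components depending on the specific application.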
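Exemplary Computer Program Product and Computer-Readable Storage Medium
In addition to the above methods and devices, an embodiment of the present disclosure can also be a computer program product including computer program instructions that, when run by a processor, cause the processor to perform the steps of the image processing method or the method for generating instructions for image processing according to the various embodiments of the present disclosure described in the "Exemplary Method" section of this specification.
The computer program product can write the program code for performing the operations of the embodiments of the present disclosure in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code can execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
In addition, an embodiment of the present disclosure can also be a computer-readable storage medium storing computer program instructions that, when run by a processor, cause the processor to perform the steps of the image processing method or the method for generating instructions for image processing according to the various embodiments of the present disclosure described in the "Exemplary Method" section of this specification.
The computer-readable storage medium can employ any combination of one or more readable media. A readable medium can be a readable signal medium or a readable storage medium. A readable storage medium can include, for example but without limitation, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.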
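The basic principles of the present disclosure have been described above in connection with specific embodiments. However, it should be pointed out that the merits, advantages, and effects mentioned in the present disclosure are only examples and not limitations, and they cannot be regarded as required of every embodiment of the present disclosure. In addition, the specific details disclosed above are only for the purposes of illustration and ease of understanding, not limitation, and they do not restrict the present disclosure to being implemented with those specific details.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments can be referred to one another. Because the system embodiments basically correspond to the method embodiments, their description is relatively brief, and the relevant parts can be found in the description of the method embodiments.
The block diagrams of components, apparatuses, devices, and systems in the present disclosure are only illustrative examples and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown. As those skilled in the art will recognize, these components, apparatuses, devices, and systems can be connected, arranged, and configured in any manner. Words such as "including", "comprising", and "having" are open-ended terms that mean "including but not limited to" and can be used interchangeably with it. The words "or" and "and" as used here mean "and/or" and can be used interchangeably with it, unless the context clearly indicates otherwise. The phrase "such as" as used here means "such as but not limited to" and can be used interchangeably with it.
The methods and apparatuses of the present disclosure can be implemented in many ways, for example by software, hardware, firmware, or any combination of software, hardware, and firmware. The above order of the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless otherwise specifically stated. In addition, in some embodiments, the present disclosure can also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the method according to the present disclosure. The present disclosure thus also covers a recording medium storing a program for executing the method according to the present disclosure.
It should also be pointed out that, in the apparatuses, devices, and methods of the present disclosure, the components or steps can be decomposed and/or recombined. These decompositions and/or recombinations should be regarded as equivalents of the present disclosure.
The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined here can be applied to other aspects without departing from the scope of the present disclosure. Therefore, the present disclosure is not intended to be limited to the aspects shown here but is to be accorded the widest scope consistent with the principles and novel features disclosed here.
The above description has been given for the purposes of illustration and description. It is not intended to limit the embodiments of the present disclosure to the forms disclosed here. Although a number of example aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions, and sub-combinations of them.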

Claims (10)

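  1. An image processing method, comprising:
    when a region of interest (ROI) image is obtained, splitting to obtain multiple image blocks based on a first image size supported by an image processing model, first split data, and the obtained ROI image, wherein each of the multiple image sizes obtained by splitting the first image size based on the first split data matches the hardware output size of an image scaling module;
    scaling each of the multiple image blocks separately to obtain multiple scaled image blocks, wherein the image sizes of the multiple scaled image blocks are consistent, in one-to-one correspondence, with the multiple image sizes obtained by splitting the first image size based on the first split data; and
    inputting the multiple scaled image blocks into the image processing model in sequence.
  2. The method according to claim 1, wherein the splitting to obtain multiple image blocks based on the first image size supported by the image processing model, the first split data, and the obtained ROI image comprises:
    if the obtained ROI image satisfies a specified image alignment condition, using the obtained ROI image as a to-be-split ROI image; otherwise, performing image adjustment based on the obtained ROI image to obtain a to-be-split ROI image that satisfies the specified image alignment condition;
    determining second split data based on the first image size, the first split data, and a second image size of the to-be-split ROI image; and
    splitting the to-be-split ROI image according to the second split data to obtain multiple image blocks.
  3. The method according to claim 2, wherein the determining second split data based on the first image size, the first split data, and the second image size of the to-be-split ROI image comprises:
    determining, based on the first split data, the split positions in a preset direction of a template image having the first image size;
    determining the ratio between the size in the preset direction in the second image size of the to-be-split ROI image and the size in the preset direction in the first image size;
    determining, based on the ratio and the split positions in the preset direction of the template image, the split positions in the preset direction of the to-be-split ROI image; and
    determining the second split data based on the split positions in the preset direction of the to-be-split ROI image.
  4. A method for generating instructions for image processing, comprising:
    when the hardware output size of an image scaling module does not match a first image size supported by an image processing model, determining, based on the hardware output size and the first image size, split mode information for a template image having the first image size, wherein the split mode information is used to split the template image into multiple image blocks, and the image size of each of the multiple image blocks matches the hardware output size;
    obtaining first split data based on the split mode information; and
    generating, based on the first split data, instructions for image processing, wherein the instructions for image processing are used to execute the image processing method according to any one of claims 1 to 3.
  5. The method according to claim 4, wherein the determining, based on the hardware output size and the first image size, split mode information for the template image having the first image size comprises:
    when the image scaling module specifies a divisor of the value of the size in a preset direction, calculating a first product, wherein the first product is the product of the divisor and the value of the size in the preset direction in the first image size;
    determining, based on the first product, the split positions on the template image having the first image size; and
    determining, based on the determined split positions, the split mode information for the template image having the first image size.
  6. The method according to claim 5, wherein the determining, based on the first product, the split positions on the template image having the first image size comprises:
    determining a first target value and a second target value based on the first product, wherein the product of the first target value and the second target value satisfies a preset relationship with the first product; and
    determining, based on the second target value, the split positions on the template image having the first image size, wherein, in the preset direction of the template image, the value of the size of each size segment delimited by the determined split positions is an integer multiple of the second target value;
    and the method further comprises:
    recording the first target value.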
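  7. An image processing apparatus, comprising:
    a splitting module, configured to, when a region of interest (ROI) image is obtained, split to obtain multiple image blocks based on a first image size supported by an image processing model, first split data, and the obtained ROI image, wherein each of the multiple image sizes obtained by splitting the first image size based on the first split data matches the hardware output size of an image scaling module;
    an image scaling module, configured to scale each of the multiple image blocks obtained by the splitting module separately to obtain multiple scaled image blocks, wherein the image sizes of the multiple scaled image blocks are consistent, in one-to-one correspondence, with the multiple image sizes obtained by splitting the first image size based on the first split data; and
    an input module, configured to input the multiple scaled image blocks obtained by the image scaling module into the image processing model in sequence.
  8. An apparatus for generating instructions for image processing, comprising:
    a determination module, configured to, when the hardware output size of an image scaling module does not match a first image size supported by an image processing model, determine, based on the hardware output size and the first image size, split mode information for a template image having the first image size, wherein the split mode information is used to split the template image into multiple image blocks, and the image size of each of the multiple image blocks matches the hardware output size;
    an acquisition module, configured to obtain first split data based on the split mode information determined by the determination module; and
    a generation module, configured to generate, based on the first split data obtained by the acquisition module, instructions for image processing, wherein the instructions for image processing are used to execute the image processing method according to any one of claims 1 to 3.
  9. A computer-readable storage medium storing a computer program, wherein the computer program is used to execute the image processing method according to any one of claims 1 to 3, or the method for generating instructions for image processing according to any one of claims 4 to 6.
  10. An electronic device, comprising:
    a processor; and
    a memory for storing instructions executable by the processor,
    wherein the processor is configured to read the executable instructions from the memory and execute them to implement the image processing method according to any one of claims 1 to 3, or the method for generating instructions for image processing according to any one of claims 4 to 6.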
PCT/CN2020/131057 2020-07-31 2020-11-24 Image processing method, and method and apparatus for generating instructions for image processing WO2022021695A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP20947620.9A EP4020316A4 (en) 2020-07-31 2020-11-24 IMAGE PROCESSING METHOD, AND METHOD AND APPARATUS FOR GENERATING AN IMAGE PROCESSING INSTRUCTION
JP2022520212A JP7369288B2 (ja) 2020-07-31 2020-11-24 画像処理方法、画像処理用コマンドの生成方法および装置
US17/764,409 US20220351329A1 (en) 2020-07-31 2020-11-24 Image Processing Method, Method for Generating Instructions for Image Processing and Apparatuses Therefor

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010765242.5 2020-07-31
CN202010765242.5A CN111860694A (zh) 2020-07-31 2020-07-31 Image processing method, and method and apparatus for generating instructions for image processing

Publications (1)

Publication Number Publication Date
WO2022021695A1 true WO2022021695A1 (zh) 2022-02-03

Family

ID=72954307

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/131057 WO2022021695A1 (zh) 2020-07-31 2020-11-24 Image processing method, and method and apparatus for generating instructions for image processing

Country Status (5)

Country Link
US (1) US20220351329A1 (zh)
EP (1) EP4020316A4 (zh)
JP (1) JP7369288B2 (zh)
CN (1) CN111860694A (zh)
WO (1) WO2022021695A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860694A (zh) * 2020-07-31 2020-10-30 地平线(上海)人工智能技术有限公司 图像处理方法、用于图像处理的指令的生成方法及装置
CN113610701B (zh) * 2021-08-04 2023-12-26 同方鼎欣科技股份有限公司 图像分页转换方法、装置、计算机设备及可读存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9060171B2 (en) * 2010-04-28 2015-06-16 Hon Hai Precision Industry Co., Ltd. Image processing system and method
CN107801026A (zh) * 2017-11-09 2018-03-13 京东方科技集团股份有限公司 图像压缩方法及装置、图像压缩及解压缩系统
CN108537729A (zh) * 2018-03-27 2018-09-14 珠海全志科技股份有限公司 图像无级缩放方法、计算机装置及计算机可读存储介质
CN110232657A (zh) * 2019-06-17 2019-09-13 深圳市迅雷网络技术有限公司 一种图像缩放方法、装置、设备及介质
CN111860694A (zh) * 2020-07-31 2020-10-30 地平线(上海)人工智能技术有限公司 图像处理方法、用于图像处理的指令的生成方法及装置

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9928406B2 (en) * 2012-10-01 2018-03-27 The Regents Of The University Of California Unified face representation for individual recognition in surveillance videos and vehicle logo super-resolution system
JP6092082B2 (ja) * 2012-11-30 2017-03-08 株式会社沖データ 画像処理装置及び方法、並びに、画像形成装置

Also Published As

Publication number Publication date
JP7369288B2 (ja) 2023-10-25
EP4020316A4 (en) 2023-09-06
US20220351329A1 (en) 2022-11-03
CN111860694A (zh) 2020-10-30
EP4020316A1 (en) 2022-06-29
JP2022551249A (ja) 2022-12-08

Similar Documents

Publication Publication Date Title
US11842438B2 (en) Method and terminal device for determining occluded area of virtual object
US8749553B1 (en) Systems and methods for accurately plotting mathematical functions
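WO2022021695A1 (zh) Image Processing Method, Method for Generating Instructions for Image Processing and Apparatuses Therefor
CN111967467A (zh) Image object detection method and apparatus, electronic device, and computer-readable medium
CN112765867B (zh) General smooth boundary modeling method based on a particle method
WO2019169699A1 (zh) House model rendering method and apparatus, terminal device, and medium
CN113516246A (zh) Parameter optimization method, and quantum chip control method and apparatus
CN115294328A (zh) Method and apparatus for generating object detection boxes, storage medium, and electronic device
CN113986426B (zh) Image detection method and apparatus, readable medium, and electronic device
CN113205090B (zh) Picture correction method and apparatus, electronic device, and computer-readable storage medium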
US20210166053A1 (en) Merging object detections using graphs
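CN110956131A (zh) Single-target tracking method, apparatus, and system
CN114445825A (zh) Text detection method and apparatus, electronic device, and storage medium
CN112508005B (zh) Method, apparatus, device, and storage medium for processing images
CN111080792B (zh) Model simplification processing method and apparatus, electronic device, and storage medium
CN116204184B (zh) UI editing method, system, and storage medium for improving page style adaptation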
US7457788B2 (en) Reducing number of computations in a neural network modeling several data sets
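CN116432606A (zh) Method, apparatus, device, and storage medium for obtaining the explanation steps of an equation
CN113989376B (zh) Method and apparatus for acquiring indoor depth information, and readable storage medium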
US11989560B2 (en) Method and device for executing instructions to perform artificial intelligence
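CN111382643A (zh) Gesture detection method, apparatus, device, and storage medium
WO2019024723A1 (zh) Method and apparatus for processing feature point matching results
CN112907501A (zh) Object detection method and apparatus, and electronic device
CN113762173A (zh) Method and apparatus for face optical flow estimation and for training an optical flow value prediction model
CN111986300A (zh) Method and apparatus for determining rendering points for house decoration, storage medium, and electronic device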

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20947620

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022520212

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2020947620

Country of ref document: EP

Effective date: 20220321

NENP Non-entry into the national phase

Ref country code: DE