WO2023185234A1 - Image processing method and apparatus, electronic device, and storage medium - Google Patents

Image processing method and apparatus, electronic device, and storage medium

Info

Publication number
WO2023185234A1
Authority
WO
WIPO (PCT)
Prior art keywords: image, probability, information, objects, model
Application number: PCT/CN2023/073710
Other languages: English (en), French (fr)
Inventors: 付奎, 刘洋, 郭明杰
Original Assignee: 北京京东乾石科技有限公司
Application filed by 北京京东乾石科技有限公司
Publication of WO2023185234A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/11 Region-based segmentation
    • G06T 7/13 Edge detection
    • G06T 7/20 Analysis of motion
    • G06T 7/40 Analysis of texture
    • G06T 7/70 Determining position or orientation of objects or cameras

Definitions

  • the present application relates to the field of artificial intelligence technology, and in particular to an image processing method and apparatus, an electronic device, and a storage medium.
  • supply chain finance is an important business innovation direction in today's logistics field.
  • logistics companies and banks cooperate and trust each other to provide services to merchants in the supply chain field.
  • One of the important services is that merchants mortgage their goods to banks to apply for loans.
  • because logistics companies have warehousing advantages, they provide the sites where mortgaged goods are stored.
  • an important task of logistics companies is to ensure the safety of the goods, that is, to ensure that the goods are not altered in any way by anyone without permission, for example moved out of the warehouse. Only by achieving this can credible services be provided to merchants and banks and the smooth operation of supply chain financial services be ensured.
  • embodiments of the present application provide an image processing method, device, electronic device, and storage medium.
  • the embodiment of the present application provides an image processing method, including:
  • First information and second information are determined according to the first image and the second image; the first information represents changes in the outer contours of the multiple objects in the first image and the second image; the second information represents changes in the internal textures of the multiple objects in the first image and the second image;
  • determining the first information based on the first image and the second image includes:
  • using a first model to binarize the first image to obtain a third image, and using the first model to binarize the second image to obtain a fourth image; the first model is trained using a semantic segmentation algorithm;
  • in the third image and the fourth image, the pixel value of the pixels corresponding to the plurality of objects is a first value, and the pixel value of the other pixels (that is, pixels corresponding to non-objects) is a second value;
  • the first information is determined by comparing the third image and the fourth image.
  • determining the first information by comparing the third image and the fourth image includes:
  • a plurality of first coefficients is determined; each first coefficient represents whether the pixel value of a pixel in the third image is the same as the pixel value of the corresponding pixel in the fourth image;
  • using the plurality of first coefficients, the first information is determined.
  • using the plurality of first coefficients to determine the first information includes:
  • a second coefficient is determined using the plurality of first coefficients; the second coefficient represents the degree of matching between the third image and the fourth image;
  • in the case where the second coefficient is greater than a first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have changed; or, in the case where the second coefficient is less than or equal to the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have not changed.
  • the method also includes:
  • a first probability, a second probability, a third probability and a fourth probability are determined; the first probability represents the probability that the first model recognizes an object in the input image as an object;
  • the second probability represents the probability that the first model recognizes an object in the input image as a non-object;
  • the third probability represents the probability that the first model recognizes a non-object in the input image as an object;
  • the fourth probability represents the probability that the first model recognizes a non-object in the input image as a non-object;
  • the first threshold is determined using the first probability, the second probability, the third probability and the fourth probability.
  • determining the second information based on the first image and the second image includes:
  • using a second model to binarize the first image to obtain a fifth image, and using the second model to binarize the second image to obtain a sixth image; the second model is trained using an edge detection algorithm; in the fifth image and the sixth image, the pixel value of the pixels corresponding to edges is the first value, and the pixel value of the pixels corresponding to non-edges is the second value;
  • the second information is determined using at least the fifth image and the sixth image.
  • using at least the fifth image and the sixth image to determine the second information includes:
  • using the first model to binarize the first image to obtain a third image, and using the first model to binarize the second image to obtain a fourth image; the first model is trained using a semantic segmentation algorithm;
  • in the third image and the fourth image, the pixel value of the pixels corresponding to the plurality of objects is the first value, and the pixel value of the other pixels (that is, pixels corresponding to non-objects) is the second value;
  • the second information is determined using the third image, the fourth image, the fifth image and the sixth image.
  • using the third image, the fourth image, the fifth image and the sixth image to determine the second information includes:
  • the third image and the fifth image are multiplied element-wise to obtain a seventh image, and the fourth image and the sixth image are multiplied element-wise to obtain an eighth image; the second information is determined by comparing the seventh image and the eighth image.
  • determining the second information by comparing the seventh image and the eighth image includes:
  • a plurality of third coefficients is determined; each third coefficient represents the degree of matching between a first grid and the corresponding second grid;
  • using the plurality of third coefficients, the second information is determined.
  • using the plurality of third coefficients to determine the second information includes:
  • in the case where any third coefficient is greater than a second threshold, the second information represents that the internal textures of the plurality of objects in the first image and the second image have changed; or, in the case where each third coefficient is less than or equal to the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed.
  • the method also includes:
  • a fifth probability, a sixth probability, a seventh probability and an eighth probability are determined; the fifth probability represents the probability that the second model identifies an edge in the input image as an edge;
  • the sixth probability represents the probability that the second model identifies an edge in the input image as a non-edge;
  • the seventh probability represents the probability that the second model identifies a non-edge in the input image as an edge;
  • the eighth probability represents the probability that the second model identifies a non-edge in the input image as a non-edge;
  • the second threshold is determined using the fifth probability, the sixth probability, the seventh probability and the eighth probability.
  • determining whether the multiple objects are moved based on the first information and the second information includes:
  • in the case where the first information represents that the outer contours of the plurality of objects in the first image and the second image have changed, or the second information represents that the internal textures of the plurality of objects in the first image and the second image have changed, it is determined that at least one of the plurality of objects has been moved;
  • in the case where the first information represents that the outer contours of the plurality of objects in the first image and the second image have not changed, and the second information represents that the internal textures of the plurality of objects in the first image and the second image have not changed, it is determined that the plurality of objects have not been moved.
  • the method also includes:
  • alarm information is sent when it is determined that at least one object among the multiple objects has been moved; wherein, the second information is determined using multiple third coefficients; each third coefficient represents the degree of matching between a first grid and the corresponding second grid; the first image corresponds to multiple first grids; the second image corresponds to multiple second grids; in the case where the second information represents changes in the internal textures of the multiple objects in the first image and the second image, the alarm information includes at least one grid identifier; each grid identifier corresponds to a third coefficient greater than the second threshold; the at least one grid identifier is used to locate the moved object.
  • the acquisition of the first image and the second image of the target area includes:
  • a ninth image and a tenth image of a first area are acquired; the first area at least includes the target area; the image acquisition times corresponding to the ninth image and the tenth image are different;
  • At least one second area is determined in the first area; there are a plurality of objects placed in a stacked form in the second area;
  • a target area is determined from the at least one second area, and the ninth image and the tenth image are cropped based on the target area to obtain the first image and the second image.
  • determining at least one second area in the first area based on the ninth image and the tenth image includes:
  • At least one second area is determined in the first area using the ninth image, the tenth image and a third model; the third model is trained using a target detection algorithm.
  • An embodiment of the present application also provides an image processing device, including:
  • a first processing unit configured to acquire a first image and a second image of a target area; there are multiple objects placed in a stacked form in the target area; the image acquisition times corresponding to the first image and the second image are different;
  • a second processing unit configured to determine first information and second information based on the first image and the second image; the first information represents changes in the outer contours of the multiple objects in the first image and the second image; the second information represents changes in the internal textures of the multiple objects in the first image and the second image;
  • a third processing unit configured to determine whether the plurality of objects are moved based on the first information and the second information.
  • An embodiment of the present application also provides an electronic device, including: a processor and a memory for storing a computer program that can run on the processor,
  • the processor is used to execute the steps of any of the above methods when running the computer program.
  • Embodiments of the present application also provide a storage medium on which a computer program is stored. When the computer program is executed by a processor, the steps of any of the above methods are implemented.
  • the image processing method, apparatus, electronic device and storage medium acquire a first image and a second image of a target area, where multiple objects placed in a stacked form exist in the target area and the image acquisition times corresponding to the first image and the second image are different; determine first information and second information according to the first image and the second image, the first information representing changes in the outer contours of the multiple objects in the two images and the second information representing changes in their internal textures; and determine, according to the first information and the second information, whether the multiple objects have been moved.
  • the solution provided by the embodiments of the present application identifies changes in the positions of the objects from both the overall and the local perspective, based on images of a target area (containing multiple objects placed in a stacked form) collected at different times: whether the objects have been moved is determined from the changes in the outer contours of the multiple objects in the images (that is, the overall view) and the changes in their internal textures (that is, the local view). In this way, computer vision technology (that is, processing images of the target area) is used to monitor the status of the objects.
  • Figure 1 is a schematic flow chart of an image processing method according to an embodiment of the present application.
  • Figure 2 is a schematic diagram of the contour information and texture information of goods (that is, objects) according to an application embodiment of the present application;
  • Figure 3 is a schematic diagram of cargo area detection according to an application embodiment of the present application;
  • Figure 4 is a schematic diagram of cargo area segmentation according to an application embodiment of the present application;
  • Figure 5 is a schematic diagram of cargo texture detection according to an application embodiment of the present application;
  • Figure 6 is a schematic structural diagram of an image processing device according to an embodiment of the present application.
  • Figure 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • manual inspection is usually used to determine whether an object has been moved, for example, manual inspection is used to monitor goods in a warehouse. This method will cause a waste of human resources, and it is difficult for the human eye to detect some subtle object movement events.
  • changes in the position of the objects are identified from both global and local perspectives based on images collected at different times.
  • whether the multiple objects have been moved is determined from the changes in the outer contours of the multiple objects in the images (that is, the overall view) and the changes in their internal textures (that is, the local view); in this way, computer vision technology (that is, processing images of the target area) monitors the status of the objects to intelligently determine whether they have been moved, avoiding the waste of human resources caused by manual inspection. Moreover, compared with manual inspection, identifying positional changes from both the global and the local perspective can reveal subtle movement events that are difficult for the human eye to detect, improving the accuracy of judging whether an object has been moved.
  • An embodiment of the present application provides an image processing method, which is applied to electronic devices (such as servers). As shown in Figure 1, the method includes:
  • Step 101: Obtain a first image and a second image of a target area; there are multiple objects placed in a stacked form in the target area; the image acquisition times corresponding to the first image and the second image are different.
  • Step 102: Determine first information and second information based on the first image and the second image; the first information represents changes in the outer contours of the multiple objects in the first image and the second image; the second information represents changes in the internal textures of the multiple objects in the first image and the second image.
  • Step 103: Determine whether the multiple objects have been moved according to the first information and the second information.
  • the first image corresponds to the first image collection time
  • the second image corresponds to the second image collection time. It can be understood that determining whether the multiple objects are moved refers to determining whether the multiple objects are moved within the time range from the first image collection time to the second image collection time.
  • the multiple objects can be regarded as a whole: when objects at the edge of the pile are moved, the outer contours of the multiple objects in the first image and the second image change; when objects inside the pile are moved, the internal textures of the multiple objects in the first image and the second image change.
  • the internal texture can be understood as information such as the outlines of the objects inside the pile and the patterns on their outer packaging.
  • obtaining the first image and the second image of the target area may include: obtaining, from an image acquisition device, the first image and the second image of the target area collected by that device, where the position and image acquisition angle of the image acquisition device are fixed.
  • the image acquisition device can acquire an image containing multiple piles of objects.
  • the electronic device can acquire an image containing multiple piles of objects from the image acquisition device, detect the area occupied by each pile of objects in the image, and perform step 101 to step 103 for each detected area, thereby determining whether there is a moved object in each pile.
  • acquiring the first image and the second image of the target area may include:
  • a ninth image and a tenth image of a first area are acquired; the first area at least includes the target area; the image acquisition times corresponding to the ninth image and the tenth image are different;
  • At least one second area is determined in the first area; there are a plurality of objects placed in a stacked form in the second area;
  • a target area is determined from the at least one second area, and the ninth image and the tenth image are cropped based on the target area to obtain the first image and the second image.
  • the electronic device may acquire the ninth image and the tenth image collected by the image acquisition device from the image acquisition device.
  • the ninth image corresponds to the first image acquisition time, and the tenth image corresponds to the second image acquisition time.
  • the position of the image acquisition device may be fixed or not, and the image acquisition angle of the image acquisition device may also be fixed or not; these can be set as required, and the embodiments of this application do not limit this.
  • when the position or angle of the image acquisition device is not fixed, the ninth image and the tenth image need to be processed through image registration or other methods so that the objects in the ninth image correspond to the objects in the tenth image.
  • a pre-trained model can be used to determine each pile of objects in the first area, that is, at least one second area is determined in the first area.
  • determining at least one second area in the first area according to the ninth image and the tenth image may include:
  • At least one second area is determined in the first area using the ninth image, the tenth image and a third model; the third model is trained using a target detection algorithm.
  • the target detection algorithm may include yolo_v5, faster-rcnn, centerNet, etc., and may be specifically set according to requirements, which is not limited in the embodiments of this application.
  • the third model needs to be trained in advance.
  • a training data set may be determined; the training data set may include a predetermined number (for example, 2000; this can be set as required) of images, collected by the image acquisition device, of a preset area in which multiple objects are placed in a stacked form. Each pile of objects in each image is framed and the corresponding coordinate information is recorded (that is, labeled); after the labeling is completed, the labeled data and the target detection algorithm are used to train the third model.
  • the second area may be a rectangle
  • the third model is used to detect, from the image acquired through the image acquisition device, the rectangular area occupied by each pile of objects, which ensures that the subsequent image processing is not interfered with by external information unrelated to the objects.
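  • As an illustration of this step, the following is a minimal sketch of cargo-area detection, assuming the third model was trained with the open-source ultralytics yolo_v5 code base; the weights path 'cargo_best.pt', the confidence cut-off and the helper name are hypothetical, not defined by this publication.

```python
import torch

# A sketch under the stated assumptions: load custom yolo_v5 weights
# (the 'cargo_best.pt' path is hypothetical) through the public torch.hub API.
model = torch.hub.load('ultralytics/yolov5', 'custom', path='cargo_best.pt')

def detect_cargo_areas(image, conf_thresh: float = 0.5):
    """Return the rectangular boxes (x1, y1, x2, y2) of each detected pile."""
    results = model(image)  # accepts a file path or a numpy image
    boxes = []
    for *xyxy, conf, cls in results.xyxy[0].tolist():
        if conf >= conf_thresh:
            boxes.append(tuple(int(v) for v in xyxy))
    return boxes
```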
  • a pre-trained model can be used to process the first image and the second image to determine the first information.
  • determining the first information based on the first image and the second image may include:
  • using a first model to binarize the first image to obtain a third image, and using the first model to binarize the second image to obtain a fourth image; the first model is trained using a semantic segmentation algorithm;
  • in the third image and the fourth image, the pixel value of the pixels corresponding to the multiple objects is the first value, and the pixel value of the other pixels (that is, pixels corresponding to non-objects) is the second value;
  • the first information is determined by comparing the third image and the fourth image.
  • the first value may be 1 and the second value may be 0.
  • the semantic segmentation algorithm may include deeplab_v3, U-net, etc., and may be specifically set according to requirements, which is not limited in the embodiments of this application.
  • the first model needs to be trained in advance.
  • a training data set can be determined; the training data set can include a predetermined number (which can be set as required, for example, 2000) of images, collected by the image acquisition device, of a preset area containing multiple objects placed in a stacked form (these can be the same images used to train the third model). The coordinate position of the outer contour of each pile of objects is marked in each image; the marked data and the semantic segmentation algorithm are then used to train the first model.
  • determining the first information by comparing the third image and the fourth image may include:
  • a plurality of first coefficients is determined; each first coefficient represents whether the pixel value of a pixel in the third image is the same as the pixel value of the corresponding pixel in the fourth image;
  • using the plurality of first coefficients, the first information is determined.
  • the specific method of determining the first coefficient can be set as required. For example, the first coefficient may be equal to 0 when the pixel value of a pixel in the third image is the same as the pixel value of the corresponding pixel in the fourth image, and equal to 1 when the two pixel values differ.
  • using the plurality of first coefficients to determine the first information may include:
  • a second coefficient is determined using the plurality of first coefficients; the second coefficient represents the degree of matching between the third image and the fourth image;
  • in the case where the second coefficient is greater than a first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have changed; or, in the case where the second coefficient is less than or equal to the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have not changed.
  • the specific method of calculating the second coefficient can be set as required. It can be understood that the larger the second coefficient, the lower the degree of matching between the third image and the fourth image.
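  • A minimal sketch of this comparison, assuming the third and fourth images are 0/1 numpy masks of equal shape and that the second coefficient is taken as the mean of the first coefficients (one reasonable choice; the publication leaves the exact formula open):

```python
import numpy as np

def contour_changed(mask_a: np.ndarray, mask_b: np.ndarray,
                    first_threshold: float) -> bool:
    # First coefficients: 0 where the two masks agree, 1 where they differ.
    first_coeffs = (mask_a != mask_b).astype(np.uint8)
    # Second coefficient: fraction of differing pixels; larger means a
    # lower degree of matching between the two masks.
    second_coeff = first_coeffs.mean()
    return second_coeff > first_threshold
```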
  • the first threshold can be determined by counting the effect of the first model on a preset verification data set.
  • the method may further include:
  • the first probability represents the probability that the first model recognizes an object in the input image (such as the first image or the second image) as an object, that is, the probability that the pixel value of a pixel corresponding to an object is set to the first value during binarization of the input image;
  • the second probability represents the probability that the first model recognizes an object in the input image as a non-object, that is, the probability that the pixel value of a pixel corresponding to an object is set to the second value during binarization of the input image;
  • the third probability represents the probability that the first model identifies a non-object in the input image as an object, that is, the probability that the pixel value of a pixel corresponding to a non-object is set to the first value during binarization of the input image;
  • the fourth probability represents the probability that the first model recognizes a non-object in the input image as a non-object, that is, the probability that the pixel value of a pixel corresponding to a non-object is set to the second value during binarization of the input image;
  • the first threshold is determined using the first probability, the second probability, the third probability and the fourth probability.
  • the first probability, the second probability, the third probability and the fourth probability can be determined by counting the effect of the first model on a preset verification data set.
  • the specific method of determining the first threshold can be set as required. For example, the first threshold can be derived by combining ideas such as the Bernoulli distribution, the binomial distribution, the central limit theorem, the Gaussian distribution, and the 3σ principle, as in the sketch below.
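  • One plausible way to combine these ideas, sketched under the assumption that, in an unchanged scene, the two binarizations of a pixel are independent; object_ratio (the fraction of object pixels) and n_pixels are illustrative parameters, and the publication's exact formula may differ:

```python
import math

# If nothing moved, two independent binarizations of the same pixel disagree
# with probability 2*p_TT*p_TF (object pixels) or 2*p_FT*p_FF (non-object
# pixels). By the central limit theorem, the mean difference coefficient over
# n pixels is approximately Gaussian; the 3-sigma principle then gives an
# upper bound used as the first threshold.
def first_threshold(p_tt: float, p_tf: float, p_ft: float, p_ff: float,
                    object_ratio: float, n_pixels: int) -> float:
    # Expected per-pixel disagreement rate when the scene is unchanged.
    mu = object_ratio * 2 * p_tt * p_tf + (1 - object_ratio) * 2 * p_ft * p_ff
    sigma = math.sqrt(mu * (1 - mu) / n_pixels)
    return mu + 3 * sigma
```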
  • a pre-trained model can be used to process the first image and the second image to determine the second information.
  • determining the second information based on the first image and the second image may include:
  • using a second model to binarize the first image to obtain a fifth image, and using the second model to binarize the second image to obtain a sixth image; the second model is trained using an edge detection algorithm; in the fifth image and the sixth image, the pixel value of the pixels corresponding to edges is the first value, and the pixel value of the pixels corresponding to non-edges is the second value;
  • the second information is determined using at least the fifth image and the sixth image.
  • the first value may be 1, and the second value may be 0.
  • the edge detection algorithm may include PiDiNet, etc., and may be specifically set according to requirements, which is not limited in the embodiments of this application.
  • the second model may be pre-trained, or the second model may be an open source model.
  • the third image, the fourth image, the fifth image and the sixth image can be used to determine the second information.
  • determining the second information using at least the fifth image and the sixth image may include:
  • using the first model to binarize the first image to obtain a third image, and using the first model to binarize the second image to obtain a fourth image; the first model is trained using a semantic segmentation algorithm;
  • in the third image and the fourth image, the pixel value of the pixels corresponding to the plurality of objects is the first value, and the pixel value of the other pixels (that is, pixels corresponding to non-objects) is the second value;
  • the second information is determined using the third image, the fourth image, the fifth image and the sixth image.
  • determining the second information using the third image, the fourth image, the fifth image and the sixth image may include:
  • the third image and the fifth image are multiplied element-wise to obtain a seventh image, and the fourth image and the sixth image are multiplied element-wise to obtain an eighth image; the second information is determined by comparing the seventh image and the eighth image.
  • multiplying the third image with the fifth image element-wise, and the fourth image with the sixth image element-wise, eliminates the interference of external information irrelevant to the objects, so the seventh image and the eighth image do not contain external information irrelevant to the objects. Determining the second information by comparing the seventh image with the eighth image therefore ensures that the subsequent image processing is not interfered with by external information unrelated to the objects, further improving the accuracy of the judgment results.
  • the seventh image and the eighth image can first be divided into grids, and then the grids of the seventh image are compared, grid by grid, with the corresponding grids of the eighth image to determine local texture changes in the multiple objects placed in a stacked form.
  • determining the second information by comparing the seventh image and the eighth image includes:
  • a plurality of third coefficients is determined; each third coefficient represents the degree of matching between a first grid and the corresponding second grid;
  • using the plurality of third coefficients, the second information is determined.
  • the preset rules can be set according to needs.
  • the preset rules may include: dividing the image into H×W grids, where H and W are both integers greater than 0, H and W may be the same or different, and the specific values of H and W may be set based on experience. A sketch of the grid-wise comparison follows.
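  • A minimal sketch of the texture comparison, assuming seg_* and edge_* are equally sized 0/1 numpy arrays (the third/fourth and fifth/sixth images) and that a third coefficient is the fraction of differing pixels within a grid cell (one reasonable choice, not mandated by the text):

```python
import numpy as np

def texture_changed(seg_a, seg_b, edge_a, edge_b,
                    h_cells: int, w_cells: int, second_threshold: float):
    # Seventh/eighth images: edges restricted to the object regions.
    tex_a = seg_a * edge_a
    tex_b = seg_b * edge_b
    hh, ww = tex_a.shape
    moved_cells = []
    for i in range(h_cells):
        for j in range(w_cells):
            rows = slice(i * hh // h_cells, (i + 1) * hh // h_cells)
            cols = slice(j * ww // w_cells, (j + 1) * ww // w_cells)
            # Third coefficient for this grid pair: per-cell disagreement.
            coeff = (tex_a[rows, cols] != tex_b[rows, cols]).mean()
            if coeff > second_threshold:
                moved_cells.append((i, j))  # grid identifier for the alarm
    return moved_cells
```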
  • using the plurality of third coefficients to determine the second information may include:
  • in the case where any third coefficient is greater than a second threshold, the second information represents that the internal textures of the plurality of objects in the first image and the second image have changed; or, in the case where each third coefficient is less than or equal to the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed.
  • the second threshold can be determined by counting the effect of the second model on a preset verification data set.
  • the method may further include:
  • the fifth probability represents the probability that the second model identifies an edge in the input image (such as the first image or the second image) as an edge, that is, the probability that the pixel value of a pixel corresponding to an edge is set to the first value during binarization of the input image;
  • the sixth probability represents the probability that the second model identifies an edge in the input image as a non-edge, that is, the probability that the pixel value of a pixel corresponding to an edge is set to the second value during binarization of the input image;
  • the seventh probability represents the probability that the second model identifies a non-edge in the input image as an edge, that is, the probability that the pixel value of a pixel corresponding to a non-edge is set to the first value during binarization of the input image;
  • the eighth probability represents the probability that the second model identifies a non-edge in the input image as a non-edge, that is, the probability that the pixel value of a pixel corresponding to a non-edge is set to the second value during binarization of the input image;
  • the second threshold is determined using the fifth probability, the sixth probability, the seventh probability and the eighth probability.
  • the fifth probability, the sixth probability, the seventh probability and the eighth probability can be determined by counting the effect of the second model on a preset verification data set.
  • the specific method of determining the second threshold can be set as required. For example, the second threshold can be derived, analogously to the first threshold, by combining ideas such as the Bernoulli distribution, the binomial distribution, the central limit theorem, the Gaussian distribution, and the 3σ principle.
  • determining whether the multiple objects are moved based on the first information and the second information may include:
  • in the case where the first information represents that the outer contours of the plurality of objects in the first image and the second image have changed, or the second information represents that the internal textures of the plurality of objects in the first image and the second image have changed, it is determined that at least one of the plurality of objects has been moved;
  • in the case where the first information represents that the outer contours of the plurality of objects in the first image and the second image have not changed, and the second information represents that the internal textures of the plurality of objects in the first image and the second image have not changed, it is determined that the plurality of objects have not been moved.
  • when it is determined that at least one object among the multiple objects has been moved, an alarm message may be sent to a target device to remind the user that there is a moved object in the target area.
  • the method may further include:
  • the second information is determined using multiple third coefficients; each third coefficient represents the degree of matching between a first grid and the corresponding second grid; the first image corresponds to multiple first grids; the second image corresponds to multiple second grids; in the case where the second information represents changes in the internal textures of the multiple objects in the first image and the second image, the alarm information includes at least one grid identifier; each grid identifier corresponds to a third coefficient greater than the second threshold; the at least one grid identifier is used to locate the moved object.
  • the specific recipient of the alarm information (that is, the target device) can be set as required, which is not limited in the embodiments of the present application.
  • the ninth image can be used as the original-state image, that is, it reflects the original state of the objects (such as their state when put into storage); the tenth image can be used as the current-state image, that is, it reflects the current state of the objects.
  • the ninth image can be updated according to the business status corresponding to the multiple objects (such as new goods entering or leaving the warehouse), and the tenth image can be updated periodically or in a triggered manner.
  • the periodic update may include the image acquisition device collecting the tenth image of the first area at a preset period (which can be set as required, such as n seconds, where n is an integer greater than 0) and sending it to the electronic device;
  • the triggered update may include the electronic device acquiring a tenth image from the image acquisition device when receiving a detection instruction from another device (such as a terminal).
  • the image processing method provided by the embodiments of the present application acquires a first image and a second image of a target area, where multiple objects placed in a stacked form exist in the target area and the image acquisition times corresponding to the two images differ; determines first information and second information according to the first image and the second image, the first information representing changes in the outer contours of the multiple objects in the two images and the second information representing changes in their internal textures; and determines, based on the first information and the second information, whether the multiple objects have been moved.
  • the solution provided by the embodiments of the present application identifies changes in the positions of the objects from both the overall and the local perspective, based on images of a target area (containing multiple objects placed in a stacked form) collected at different times: whether the objects have been moved is determined from the changes in the outer contours of the multiple objects in the images (that is, the overall view) and the changes in their internal textures (that is, the local view). In this way, computer vision technology (that is, processing images of the target area) is used to monitor the status of the objects.
  • This application embodiment provides a computer-vision-based warehouse cargo (that is, the above-mentioned objects) monitoring solution, which identifies changes in the cargo from both the overall and the local perspective.
  • from the overall perspective, a change in the goods is a change of their outer contour; from the local perspective, a change is a texture change in the corresponding area of the goods on the image.
  • Goods are generally stored in stacks in warehouses. If goods at the edge of the cargo area in the image are moved, the contour information at the corresponding position changes (that is, the above-mentioned first information); the contour information is shown in box 201 in Figure 2. If goods inside the cargo area in the image are moved, the texture information at the corresponding position changes (that is, the above-mentioned second information); the texture information is shown in box 202 in Figure 2.
  • the warehouse cargo care solution includes the following steps:
  • Step 1: Use the area detection model (that is, the above-mentioned third model) to detect the cargo area;
  • Step 2: Use the segmentation model (that is, the above-mentioned first model) to segment the cargo area;
  • Step 3: Use the edge detection model (that is, the above-mentioned second model) to detect the cargo texture;
  • Step 4: Final matching (an end-to-end sketch follows this list).
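  • Tying the four steps together, a high-level sketch that reuses the hypothetical helpers from the earlier sketches; none of these names are defined by the publication, and segment()/edges() are placeholders for inference with the first and second models:

```python
import numpy as np

def segment(img: np.ndarray) -> np.ndarray:
    raise NotImplementedError('run the trained segmentation model here')

def edges(img: np.ndarray) -> np.ndarray:
    raise NotImplementedError('run the edge detection model here')

def inspect(original_img, current_img, first_thr, second_thr):
    alarms = []
    # Step 1: cargo area detection; with a fixed camera the boxes found on
    # the original picture are reused for the current picture.
    for box in detect_cargo_areas(original_img):
        x1, y1, x2, y2 = box
        crop_o = original_img[y1:y2, x1:x2]
        crop_c = current_img[y1:y2, x1:x2]
        # Step 2: cargo area segmentation (first model) -> 0/1 masks.
        seg_o, seg_c = segment(crop_o), segment(crop_c)
        # Step 3: cargo texture detection (second model) -> 0/1 edge maps.
        edge_o, edge_c = edges(crop_o), edges(crop_c)
        # Step 4: final matching of contours and textures.
        if contour_changed(seg_o, seg_c, first_thr):
            alarms.append((box, 'contour changed'))
        cells = texture_changed(seg_o, seg_c, edge_o, edge_c, 8, 8, second_thr)
        if cells:
            alarms.append((box, cells))  # grid identifiers locate the move
    return alarms
```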
  • the function of the area detection model is to detect, from the images acquired by the surveillance camera (that is, the above-mentioned image acquisition device; the images being, for example, the above-mentioned ninth image and tenth image), the rectangular area occupied by each pile of goods (that is, the above-mentioned second area), to ensure that the subsequent process is not interfered with by external information unrelated to the goods.
  • the specific effect is shown in Figure 3.
  • Each pile of goods will be framed to obtain a corresponding rectangular frame (such as rectangular frame 301).
  • the area detection model uses the yolo_v5 algorithm to perform area detection.
  • the trained area detection model can detect each pile of goods in newly collected images and mark out the corresponding rectangular frame.
  • the function of the segmentation model is to obtain the outer contour of each pile of goods to determine whether the contour information of the goods has changed.
  • the segmentation model uses the deeplab_v3 algorithm for image processing.
  • the output of the segmentation model (as shown in Figure 4) is a matrix of the same size as the input image (the output being, for example, the above-mentioned third image or fourth image). In this matrix, the value at the positions corresponding to the goods' pixels is 1 (that is, the above-mentioned first value), and the value of the remaining part (that is, the non-cargo part) is 0 (that is, the above-mentioned second value).
  • the method of obtaining the marked data may include: selecting about 2000 pictures captured by the cameras in the warehouse (these can be the same pictures used to train the area detection model), and marking the coordinate position of the outer contour of each pile of goods in each picture; the segmentation model is then trained based on the marks. After training is completed, the segmentation model will analyze a new input image and generate a 0/1 matrix of the same size as the input image, in which the area corresponding to the goods is 1 and the rest is 0.
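  • A minimal training sketch for this step, assuming the marked pictures have already been wrapped in a PyTorch DataLoader named train_loader (not shown); torchvision's deeplabv3_resnet50 stands in for the deeplab_v3 algorithm named above, with a 2-class head (cargo / non-cargo):

```python
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

# 2-class segmentation head: 0 = non-cargo, 1 = cargo.
model = deeplabv3_resnet50(num_classes=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()

model.train()
for images, masks in train_loader:  # masks: (N, H, W) with values 0/1
    logits = model(images)['out']           # (N, 2, H, W)
    loss = criterion(logits, masks.long())  # per-pixel classification loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```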
  • the function of the edge detection model is to identify the local texture in the input image (such as the above-mentioned first image and the second image) to determine whether the texture information has changed from the original.
  • the edge detection model uses PiDiNet, which extracts the important textures by identifying areas of abrupt change within the image.
  • the output of the edge detection model is shown in Figure 5: a 0/1 matrix of the same size as the original image, in which the positions of pixels corresponding to edges have the value 1 (that is, the above-mentioned first value) and the remaining positions have the value 0 (that is, the above-mentioned second value).
  • edge detection is a general algorithm and is not limited to edge detection of goods
  • the edge detection model can be an open source model and does not need to be retrained.
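  • A tiny sketch of how a detector's output becomes the 0/1 matrix described above; edge_prob stands for the per-pixel edge probability produced by whatever off-the-shelf detector is used (the embodiment names PiDiNet), and the 0.5 cut-off is illustrative:

```python
import numpy as np

def binarize_edges(edge_prob: np.ndarray, thresh: float = 0.5) -> np.ndarray:
    """Turn an edge-probability map into the 0/1 edge matrix."""
    return (edge_prob >= thresh).astype(np.uint8)
```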
  • the three steps of cargo area detection, cargo area segmentation and cargo texture detection all serve the final identification (that is, final matching) process.
  • the camera used to monitor the goods must be used to obtain an original picture of the goods.
  • subsequent pictures of the goods must be collected periodically. Each time a subsequent picture is obtained, it is compared with the original picture to determine whether the goods have moved.
  • the original picture of the goods and the subsequent pictures need to go through the three steps of cargo area detection, cargo area segmentation and cargo texture detection before final comparison.
  • the position and angle of surveillance cameras are usually fixed; therefore, cargo area detection only needs to be performed on the original cargo picture, and the resulting detection frames can be reused on subsequent cargo pictures.
  • both the original picture of the goods and the subsequently collected pictures of the goods need to undergo cargo area segmentation and cargo texture detection.
  • the original picture can be updated.
  • the comparison process of contour information includes the following steps:
  • (x, y) represents the coordinates of a position in the map;
  • Sg(i)(x, y) and Dg(i)(x, y) represent the values at that position in the two segmentation maps Sg(i) and Dg(i), obtained from the original picture and a subsequently collected picture of cargo area i respectively.
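  • a plausible reconstruction of the resulting difference coefficient, assuming it averages the per-position disagreement of the two maps (the publication's exact expression is not reproduced in this text):

```latex
c_i \;=\; \frac{1}{|\Omega_i|} \sum_{(x,y)\in\Omega_i}
          \bigl|\, Sg(i)(x,y) - Dg(i)(x,y) \,\bigr|,
\qquad \Omega_i \text{ the set of pixel positions of cargo area } i .
```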
  • the comparison process of texture information includes the following steps:
  • the texture maps St(i) and Dt(i) are each divided into H×W grids (that is, the above-mentioned grids; H and W are integers greater than 0 and may be the same or different, set based on experience), and the corresponding parts of each grid are denoted St(i)(h,w) and Dt(i)(h,w) respectively.
  • the reason for dividing into grids before calculating the difference coefficient is that movement may occur in only a small part of the area; if the entire picture were compared directly, the resulting difference would be relatively small, making it hard to judge accurately whether the goods have moved. Dividing into grids effectively solves this problem, and also helps locate the specific position where movement occurred.
  • the comparison of contours and textures requires comparing the difference coefficient with the corresponding threshold.
  • the setting of a threshold is usually a difficult problem; therefore, in this application embodiment, statistical means are used to derive the threshold values to help make more accurate judgments.
  • the derivation of the first threshold (for the contour difference coefficient) uses the four probabilities estimated for the segmentation model, and the derivation of the second threshold (for the texture difference coefficient) uses the four probabilities estimated for the edge detection model, as sketched below.
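  • a hedged reconstruction of one such derivation (the publication's exact expressions may differ), assuming pixel-wise independence for an unchanged scene, with r the fraction of object pixels and N the number of pixels compared:

```latex
% Object pixels disagree across the two binarizations with probability
% 2 p_{TT} p_{TF}; non-object pixels with 2 p_{FT} p_{FF}. Per-pixel
% disagreement is Bernoulli, so by the central limit theorem:
\mu = 2\,r\,p_{TT}\,p_{TF} + 2\,(1-r)\,p_{FT}\,p_{FF}, \qquad
\sigma = \sqrt{\tfrac{\mu(1-\mu)}{N}}, \qquad
T_1 = \mu + 3\sigma .
% The second threshold T_2 follows analogously from
% q_{TT}, q_{TF}, q_{FT}, q_{FF} over the pixels of a single grid.
```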
  • Model training: perform data marking according to the preset rules, train the area detection model and the segmentation model, and obtain the edge detection model.
  • the verification data set is used to estimate the corresponding probability values p_TT, p_TF, p_FT, p_FF of the segmentation model and q_TT, q_TF, q_FT, q_FF of the edge detection model, respectively.
  • Cargo area segmentation is performed on the original cargo picture to obtain the corresponding segmentation map; when the cargo status is not updated, this operation only needs to be performed once. Cargo area segmentation is also performed on each periodically collected cargo picture to obtain its segmentation map; this must be done every time a new picture is collected.
  • the solution provided by this application embodiment uses computer vision technology and surveillance cameras to monitor the goods in the warehouse and determine whether they have been moved; image segmentation and edge detection are used to obtain the contour and texture information of the goods, and based on these two kinds of information the status (that is, position) of the cargo is compared from the overall and the local dimension. In addition, statistical methods are used to derive the thresholds on the difference between the original state and the subsequently collected state of the cargo, helping to make a more precise judgment on whether the goods have been moved.
  • the solution provided by this application embodiment monitors the status (that is, position) of goods in the warehouse based on computer vision technology. If the status of the goods is found to change (that is, their position changes), an alarm is issued in time and manual verification is requested. Since warehouses generally have a large number of surveillance cameras, this method makes full use of existing resources and effectively reduces manpower consumption; at the same time, computer vision can identify subtle changes that are difficult for the human eye to detect.
  • the embodiment of the present application also provides an image processing device, which is arranged on an electronic device (for example, on a server). As shown in Figure 6, the device includes:
  • the first processing unit 601 is configured to acquire a first image and a second image of a target area; there are multiple objects placed in a stacked form in the target area; the image acquisition times corresponding to the first image and the second image are different;
  • the second processing unit 602 is configured to determine first information and second information according to the first image and the second image; the first information represents the presence of the plurality of objects in the first image and the second image. The change of the outer contour; the second information represents the change of the internal texture of the multiple objects in the first image and the second image;
  • the third processing unit 603 is configured to determine whether the plurality of objects are moved according to the first information and the second information.
  • the second processing unit 602 is further configured to:
  • using a first model to binarize the first image to obtain a third image, and using the first model to binarize the second image to obtain a fourth image; the first model is trained using a semantic segmentation algorithm;
  • in the third image and the fourth image, the pixel value of the pixels corresponding to the plurality of objects is the first value, and the pixel value of the other pixels (that is, pixels corresponding to non-objects) is the second value;
  • the first information is determined by comparing the third image and the fourth image.
  • the second processing unit 602 is further configured to:
  • a plurality of first coefficients is determined; each first coefficient represents whether the pixel value of a pixel in the third image is the same as the pixel value of the corresponding pixel in the fourth image;
  • using the plurality of first coefficients, the first information is determined.
  • the second processing unit 602 is further configured to:
  • a second coefficient is determined using the plurality of first coefficients; the second coefficient represents the degree of matching between the third image and the fourth image;
  • in the case where the second coefficient is greater than the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have changed; or, in the case where the second coefficient is less than or equal to the first threshold, the first information represents that the outer contours of the plurality of objects in the first image and the second image have not changed.
  • the second processing unit 602 is further configured to:
  • a first probability, a second probability, a third probability and a fourth probability are determined; the first probability represents the probability that the first model recognizes an object in the input image as an object;
  • the second probability represents the probability that the first model recognizes an object in the input image as a non-object;
  • the third probability represents the probability that the first model recognizes a non-object in the input image as an object;
  • the fourth probability represents the probability that the first model recognizes a non-object in the input image as a non-object;
  • the first threshold is determined using the first probability, the second probability, the third probability and the fourth probability.
  • the second processing unit 602 is further configured to:
  • using a second model to binarize the first image to obtain a fifth image, and using the second model to binarize the second image to obtain a sixth image; the second model is trained using an edge detection algorithm; in the fifth image and the sixth image, the pixel value of the pixels corresponding to edges is the first value, and the pixel value of the pixels corresponding to non-edges is the second value;
  • the second information is determined using at least the fifth image and the sixth image.
  • the second processing unit 602 is further configured to:
  • using the first model to binarize the first image to obtain a third image, and using the first model to binarize the second image to obtain a fourth image; the first model is trained using a semantic segmentation algorithm;
  • in the third image and the fourth image, the pixel value of the pixels corresponding to the plurality of objects is the first value, and the pixel value of the other pixels (that is, pixels corresponding to non-objects) is the second value;
  • the second information is determined using the third image, the fourth image, the fifth image and the sixth image.
  • the second processing unit 602 is further configured to:
  • the third image and the fifth image are multiplied element-wise to obtain a seventh image, and the fourth image and the sixth image are multiplied element-wise to obtain an eighth image; the second information is determined by comparing the seventh image and the eighth image.
  • the second processing unit 602 is further configured to:
  • a plurality of third coefficients is determined; each third coefficient represents the degree of matching between a first grid and the corresponding second grid;
  • using the plurality of third coefficients, the second information is determined.
  • the second processing unit 602 is further configured to:
  • in the case where any third coefficient is greater than the second threshold, the second information represents that the internal textures of the plurality of objects in the first image and the second image have changed; or, in the case where each third coefficient is less than or equal to the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed.
  • the second processing unit 602 is further configured to:
  • a fifth probability, a sixth probability, a seventh probability and an eighth probability are determined; the fifth probability represents the probability that the second model identifies an edge in the input image as an edge;
  • the sixth probability represents the probability that the second model identifies an edge in the input image as a non-edge;
  • the seventh probability represents the probability that the second model identifies a non-edge in the input image as an edge;
  • the eighth probability represents the probability that the second model identifies a non-edge in the input image as a non-edge;
  • the second threshold is determined using the fifth probability, the sixth probability, the seventh probability and the eighth probability.
  • the third processing unit 603 is further configured to:
  • in the case where the first information represents that the outer contours of the plurality of objects in the first image and the second image have changed, or the second information represents that the internal textures of the plurality of objects in the first image and the second image have changed, it is determined that at least one of the plurality of objects has been moved;
  • in the case where the first information represents that the outer contours of the plurality of objects in the first image and the second image have not changed, and the second information represents that the internal textures of the plurality of objects in the two images have not changed, it is determined that the multiple objects have not been moved.
  • the device further includes a communication unit; the third processing unit 603 is further configured to send an alarm message through the communication unit when it is determined that at least one object among the plurality of objects is moved; wherein,
  • the second information is determined using multiple third coefficients; each third coefficient represents the degree of matching between a first grid and the corresponding second grid; the first image corresponds to multiple first grids; the second image corresponds to multiple second grids; in the case where the second information represents changes in the internal textures of the multiple objects in the first image and the second image, the alarm information includes at least one grid identifier; each grid identifier corresponds to a third coefficient greater than the second threshold; the at least one grid identifier is used to locate the moved object.
  • the first processing unit 601 is further configured to:
  • a ninth image and a tenth image of a first area are acquired; the first area at least includes the target area; the image acquisition times corresponding to the ninth image and the tenth image are different;
  • At least one second area is determined in the first area; there are a plurality of objects placed in a stacked form in the second area;
  • a target area is determined from the at least one second area, and the ninth image and the tenth image are cropped based on the target area to obtain the first image and the second image.
  • the first processing unit 601 is further configured to determine at least one second area in the first area using the ninth image, the tenth image and the third model;
  • the third model is trained using an object detection algorithm.
  • the communication unit can be implemented by a communication interface in the image processing device; the first processing unit 601, the second processing unit 602 and the third processing unit 603 can be implemented by a processor in the image processing device.
  • when the image processing device provided in the above embodiments processes images, the division into the above program modules is only used as an example; in practical applications, the above processing can be allocated to different program modules as needed, that is, the internal structure of the device can be divided into different program modules to complete all or part of the processing described above. In addition, the image processing device provided by the above embodiments and the image processing method embodiments belong to the same concept; for the specific implementation process, refer to the method embodiments, which will not be described again here.
  • the embodiment of the present application also provides an electronic device.
  • the electronic device 700 includes:
  • a communication interface 701, capable of information interaction with other electronic devices;
  • a processor 702, connected to the communication interface 701 to implement information interaction with other electronic devices, and configured to execute, when running a computer program, the method provided by one or more of the above technical solutions;
  • a memory 703, which stores a computer program that can run on the processor 702.
  • the processor 702 is configured as:
  • First information and second information are determined according to the first image and the second image; the first information represents changes in the outer contours of the multiple objects in the first image and the second image; the second information represents changes in the internal textures of the multiple objects in the first image and the second image;
  • the processor 702 is further configured to:
  • using a first model to binarize the first image to obtain a third image, and using the first model to binarize the second image to obtain a fourth image; the first model is trained using a semantic segmentation algorithm;
  • in the third image and the fourth image, the pixel value of the pixels corresponding to the plurality of objects is the first value, and the pixel value of the other pixels (that is, pixels corresponding to non-objects) is the second value;
  • the first information is determined by comparing the third image and the fourth image.
  • the processor 702 is further configured to:
  • a plurality of first coefficients is determined; each first coefficient represents whether the pixel value of a pixel in the third image is the same as the pixel value of the corresponding pixel in the fourth image;
  • using the plurality of first coefficients, the first information is determined.
  • the processor 702 is further configured to:
  • a second coefficient is determined using the plurality of first coefficients; the second coefficient represents the degree of matching between the third image and the fourth image;
  • in the case where the second coefficient is greater than the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have changed; or, in the case where the second coefficient is less than or equal to the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have not changed.
In an embodiment, the processor 702 is further configured to:
determine a first probability, a second probability, a third probability and a fourth probability of the first model; the first probability represents the probability that the first model recognizes an object in an input image as an object; the second probability represents the probability that the first model recognizes an object in an input image as a non-object; the third probability represents the probability that the first model recognizes a non-object in an input image as an object; the fourth probability represents the probability that the first model recognizes a non-object in an input image as a non-object; and
determine the first threshold using the first probability, the second probability, the third probability and the fourth probability.
In an embodiment, the processor 702 is further configured to:
binarize the first image using a second model to obtain a fifth image, and binarize the second image using the second model to obtain a sixth image; the second model is trained using an edge detection algorithm; the pixel values of the pixels corresponding to edges in the fifth image and the sixth image are the first value, and the pixel values of the pixels corresponding to non-edges are the second value; and
determine the second information using at least the fifth image and the sixth image.
In an embodiment, the processor 702 is further configured to:
binarize the first image using the first model to obtain the third image, and binarize the second image using the first model to obtain the fourth image; the first model is trained using a semantic segmentation algorithm; the pixel values of the pixels corresponding to the multiple objects in the third image and the fourth image are the first value, and the pixel values of the other pixels are the second value; and
determine the second information using the third image, the fourth image, the fifth image and the sixth image.
In an embodiment, the processor 702 is further configured to:
multiply the third image and the fifth image element-wise to obtain a seventh image, and multiply the fourth image and the sixth image element-wise to obtain an eighth image; and
determine the second information by comparing the seventh image and the eighth image.
In an embodiment, the processor 702 is further configured to:
divide the seventh image based on a preset rule to obtain multiple first grids, and divide the eighth image based on the preset rule to obtain multiple second grids;
determine multiple third coefficients according to the multiple first grids and the multiple second grids; each third coefficient represents the degree of matching between a first grid and the corresponding second grid; and
determine the second information using the multiple third coefficients.
In an embodiment, the processor 702 is further configured to:
judge whether each third coefficient is greater than a second threshold; in the case where at least one third coefficient is greater than the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have changed; or, in the case where every third coefficient is less than or equal to the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed.
In an embodiment, the processor 702 is further configured to:
determine a fifth probability, a sixth probability, a seventh probability and an eighth probability of the second model; the fifth probability represents the probability that the second model recognizes an edge in an input image as an edge; the sixth probability represents the probability that the second model recognizes an edge in an input image as a non-edge; the seventh probability represents the probability that the second model recognizes a non-edge in an input image as an edge; the eighth probability represents the probability that the second model recognizes a non-edge in an input image as a non-edge; and
determine the second threshold using the fifth probability, the sixth probability, the seventh probability and the eighth probability.
In an embodiment, the processor 702 is further configured to:
in the case where the first information represents that the outer contours of the multiple objects in the first image and the second image have changed, and/or in the case where the second information represents that the internal textures of the multiple objects in the first image and the second image have changed, determine that at least one object among the multiple objects has been moved; or,
in the case where the first information represents that the outer contours of the multiple objects in the first image and the second image have not changed, and the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed, determine that the multiple objects have not been moved.
In an embodiment, the processor 702 is further configured to send alarm information through the communication interface 701 when it is determined that at least one object among the multiple objects has been moved, where:
the second information is determined using multiple third coefficients; each third coefficient represents the degree of matching between a first grid and the corresponding second grid; the first image corresponds to multiple first grids; the second image corresponds to multiple second grids; in the case where the second information represents that the internal textures of the multiple objects in the first image and the second image have changed, the alarm information includes at least one grid identifier; each grid identifier corresponds to a third coefficient greater than the second threshold; and the at least one grid identifier is used to locate the moved object.
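Since the grid identifiers carried in the alarm information are used to locate the moved goods, a minimal Python sketch of one way to map such identifiers back to pixel regions is given below; the function name and the (h, w) identifier format are illustrative assumptions rather than part of the original disclosure.

```python
def locate_moved_regions(flagged_grids, image_shape, H, W):
    """Map flagged grid identifiers (h, w) to pixel rectangles.

    flagged_grids: (h, w) indices whose third coefficient exceeded the
    second threshold; image_shape: (height, width) of the compared images;
    H, W: number of grid rows and columns used by the preset rule.
    """
    img_h, img_w = image_shape
    cell_h, cell_w = img_h // H, img_w // W
    regions = []
    for h, w in flagged_grids:
        top, left = h * cell_h, w * cell_w
        regions.append((top, left, top + cell_h, left + cell_w))
    return regions

# Example: grids (0, 2) and (3, 1) flagged in a 900x1200 crop split into 6x8
print(locate_moved_regions([(0, 2), (3, 1)], (900, 1200), H=6, W=8))
```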
In an embodiment, the processor 702 is further configured to:
acquire a ninth image and a tenth image of a first area; the first area at least includes the target area; the image acquisition times corresponding to the ninth image and the tenth image are different;
determine at least one second area in the first area according to the ninth image and the tenth image; multiple objects placed in stacked form are present in the second area; and
determine a target area from the at least one second area, and crop the ninth image and the tenth image based on the target area to obtain the first image and the second image.
In an embodiment, the processor 702 is further configured to determine the at least one second area in the first area using the ninth image, the tenth image and a third model; the third model is trained using an object detection algorithm.
Of course, in practical applications, the various components in the electronic device 700 are coupled together through a bus system 704. It can be understood that the bus system 704 is used to implement connection and communication between these components. In addition to a data bus, the bus system 704 also includes a power bus, a control bus and a status signal bus. However, for the sake of clarity, the various buses are all labeled as the bus system 704 in FIG. 7.
The memory 703 in the embodiment of the present application is used to store various types of data to support the operation of the electronic device 700; examples of such data include any computer program for operating on the electronic device 700.
The methods disclosed in the above embodiments of the present application can be applied to the processor 702 or implemented by the processor 702. The processor 702 may be an integrated circuit chip with signal processing capabilities. In an implementation process, each step of the above method can be completed by an integrated logic circuit of hardware in the processor 702 or by instructions in the form of software. The above processor 702 may be a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 702 can implement or execute each method, step and logical block diagram disclosed in the embodiments of this application. A general-purpose processor may be a microprocessor, any conventional processor, or the like. The steps of the methods disclosed in the embodiments of this application can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium, the storage medium being located in the memory 703; the processor 702 reads the information in the memory 703 and completes the steps of the foregoing method in combination with its hardware.
In an exemplary embodiment, the electronic device 700 may be implemented by one or more Application Specific Integrated Circuits (ASIC), DSPs, Programmable Logic Devices (PLD), Complex Programmable Logic Devices (CPLD), Field-Programmable Gate Arrays (FPGA), general-purpose processors, controllers, microcontrollers (MCU, Micro Controller Unit), microprocessors, or other electronic components, and is used to execute the aforementioned methods.
It can be understood that the memory 703 in the embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories. The non-volatile memory can be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a ferromagnetic random access memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory can be a magnetic disk memory or a tape memory. The volatile memory may be a Random Access Memory (RAM), which is used as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDRSDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memories described in the embodiments of this application are intended to include, but are not limited to, these and any other suitable types of memory.
In an exemplary embodiment, an embodiment of the present application also provides a storage medium, that is, a computer storage medium, specifically a computer-readable storage medium, for example a memory 703 storing a computer program; the above computer program can be executed by the processor 702 of the electronic device 700 to complete the steps described in the foregoing method. The computer-readable storage medium can be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface memory, optical disc, or CD-ROM.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present application discloses an image processing method and apparatus, an electronic device and a storage medium, relating to the field of artificial intelligence technology. The method includes: acquiring a first image and a second image of a target area, where multiple objects placed in stacked form are present in the target area and the image acquisition times corresponding to the first image and the second image are different; determining first information and second information according to the first image and the second image, where the first information represents changes in the outer contours of the multiple objects in the first image and the second image, and the second information represents changes in the internal textures of the multiple objects in the first image and the second image; and determining, according to the first information and the second information, whether the multiple objects have been moved.

Description

Image processing method and apparatus, electronic device, and storage medium
Cross-reference to related applications
This application is based on and claims priority to Chinese patent application No. 202210350472.4, filed on April 2, 2022, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of artificial intelligence technology, and in particular to an image processing method and apparatus, an electronic device, and a storage medium.
Background
In some business scenarios, it is necessary to judge whether an object has been moved. For example, supply chain finance is an important direction of business innovation in today's logistics field. In the supply chain finance business, logistics enterprises cooperate with banks and trust each other to provide services to merchants in the supply chain field. One important service is that merchants mortgage their goods to a bank to apply for a loan, and since logistics enterprises have advantages in warehousing, the logistics enterprise provides the site for the goods mortgage. In this process, an important task of the logistics enterprise is to ensure the safety of the goods, that is, to ensure that without permission the state of the goods is not changed by anyone in any way, such as being moved out of the warehouse. Only by achieving this can credible services be provided to merchants and banks and the smooth operation of the supply chain finance business be guaranteed.
However, in the related art, there is as yet no effective solution for intelligently judging whether an object has been moved.
Summary
To solve the related technical problems, embodiments of the present application provide an image processing method and apparatus, an electronic device, and a storage medium.
The technical solutions of the embodiments of the present application are implemented as follows:
An embodiment of the present application provides an image processing method, including:
acquiring a first image and a second image of a target area; multiple objects placed in stacked form are present in the target area; the image acquisition times corresponding to the first image and the second image are different;
determining first information and second information according to the first image and the second image; the first information represents changes in the outer contours of the multiple objects in the first image and the second image; the second information represents changes in the internal textures of the multiple objects in the first image and the second image;
determining, according to the first information and the second information, whether the multiple objects have been moved.
In the above solution, the determining the first information according to the first image and the second image includes:
binarizing the first image using a first model to obtain a third image, and binarizing the second image using the first model to obtain a fourth image; the first model is trained using a semantic segmentation algorithm; the pixel values of the pixels corresponding to the multiple objects in the third image and the fourth image are a first value, and the pixel values of the pixels other than those corresponding to the multiple objects are a second value;
determining the first information by comparing the third image and the fourth image.
In the above solution, the determining the first information by comparing the third image and the fourth image includes:
determining multiple first coefficients according to the third image and the fourth image; each first coefficient represents whether the pixel value of a pixel in the third image is the same as its pixel value in the fourth image;
determining the first information using the multiple first coefficients.
In the above solution, the determining the first information using the multiple first coefficients includes:
determining a second coefficient using the multiple first coefficients; the second coefficient represents the degree of matching between the third image and the fourth image;
judging whether the second coefficient is greater than a first threshold; in the case where the second coefficient is greater than the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have changed; or, in the case where the second coefficient is less than or equal to the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have not changed.
In the above solution, the method further includes:
determining a first probability, a second probability, a third probability and a fourth probability of the first model; the first probability represents the probability that the first model recognizes an object in an input image as an object; the second probability represents the probability that the first model recognizes an object in an input image as a non-object; the third probability represents the probability that the first model recognizes a non-object in an input image as an object; the fourth probability represents the probability that the first model recognizes a non-object in an input image as a non-object;
determining the first threshold using the first probability, the second probability, the third probability and the fourth probability.
In the above solution, the determining the second information according to the first image and the second image includes:
binarizing the first image using a second model to obtain a fifth image, and binarizing the second image using the second model to obtain a sixth image; the second model is trained using an edge detection algorithm; the pixel values of the pixels corresponding to edges in the fifth image and the sixth image are the first value, and the pixel values of the pixels corresponding to non-edges are the second value;
determining the second information using at least the fifth image and the sixth image.
In the above solution, the determining the second information using at least the fifth image and the sixth image includes:
binarizing the first image using a first model to obtain a third image, and binarizing the second image using the first model to obtain a fourth image; the first model is trained using a semantic segmentation algorithm; the pixel values of the pixels corresponding to the multiple objects in the third image and the fourth image are the first value, and the pixel values of the other pixels are the second value;
determining the second information using the third image, the fourth image, the fifth image and the sixth image.
In the above solution, the determining the second information using the third image, the fourth image, the fifth image and the sixth image includes:
multiplying the third image and the fifth image element-wise to obtain a seventh image, and multiplying the fourth image and the sixth image element-wise to obtain an eighth image;
determining the second information by comparing the seventh image and the eighth image.
In the above solution, the determining the second information by comparing the seventh image and the eighth image includes:
dividing the seventh image based on a preset rule to obtain multiple first grids, and dividing the eighth image based on the preset rule to obtain multiple second grids;
determining multiple third coefficients according to the multiple first grids and the multiple second grids; each third coefficient represents the degree of matching between a first grid and the corresponding second grid;
determining the second information using the multiple third coefficients.
In the above solution, the determining the second information using the multiple third coefficients includes:
judging whether each third coefficient is greater than a second threshold;
in the case where at least one third coefficient is greater than the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have changed; or, in the case where every third coefficient is less than or equal to the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed.
In the above solution, the method further includes:
determining a fifth probability, a sixth probability, a seventh probability and an eighth probability of the second model; the fifth probability represents the probability that the second model recognizes an edge in an input image as an edge; the sixth probability represents the probability that the second model recognizes an edge in an input image as a non-edge; the seventh probability represents the probability that the second model recognizes a non-edge in an input image as an edge; the eighth probability represents the probability that the second model recognizes a non-edge in an input image as a non-edge;
determining the second threshold using the fifth probability, the sixth probability, the seventh probability and the eighth probability.
In the above solution, the determining whether the multiple objects have been moved according to the first information and the second information includes:
in the case where the first information represents that the outer contours of the multiple objects in the first image and the second image have changed, and/or in the case where the second information represents that the internal textures of the multiple objects in the first image and the second image have changed, determining that at least one object among the multiple objects has been moved;
or,
in the case where the first information represents that the outer contours of the multiple objects in the first image and the second image have not changed, and the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed, determining that the multiple objects have not been moved.
In the above solution, the method further includes:
sending alarm information when it is determined that at least one object among the multiple objects has been moved; where
the second information is determined using multiple third coefficients; each third coefficient represents the degree of matching between a first grid and the corresponding second grid; the first image corresponds to multiple first grids; the second image corresponds to multiple second grids; in the case where the second information represents that the internal textures of the multiple objects in the first image and the second image have changed, the alarm information includes at least one grid identifier; each grid identifier corresponds to a third coefficient greater than the second threshold; the at least one grid identifier is used to locate the moved object.
In the above solution, the acquiring the first image and the second image of the target area includes:
acquiring a ninth image and a tenth image of a first area; the first area at least includes the target area; the image acquisition times corresponding to the ninth image and the tenth image are different;
determining at least one second area in the first area according to the ninth image and the tenth image; multiple objects placed in stacked form are present in the second area;
determining a target area from the at least one second area, and cropping the ninth image and the tenth image based on the target area to obtain the first image and the second image.
In the above solution, the determining at least one second area in the first area according to the ninth image and the tenth image includes:
determining the at least one second area in the first area using the ninth image, the tenth image and a third model; the third model is trained using an object detection algorithm.
An embodiment of the present application further provides an image processing apparatus, including:
a first processing unit configured to acquire a first image and a second image of a target area; multiple objects placed in stacked form are present in the target area; the image acquisition times corresponding to the first image and the second image are different;
a second processing unit configured to determine first information and second information according to the first image and the second image; the first information represents changes in the outer contours of the multiple objects in the first image and the second image; the second information represents changes in the internal textures of the multiple objects in the first image and the second image;
a third processing unit configured to determine, according to the first information and the second information, whether the multiple objects have been moved.
An embodiment of the present application further provides an electronic device, including a processor and a memory for storing a computer program capable of running on the processor, where the processor is configured to execute the steps of any one of the above methods when running the computer program.
An embodiment of the present application further provides a storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of any one of the above methods are implemented.
According to the image processing method and apparatus, the electronic device and the storage medium provided by the embodiments of the present application, a first image and a second image of a target area are acquired, where multiple objects placed in stacked form are present in the target area and the image acquisition times corresponding to the first image and the second image are different; first information and second information are determined according to the first image and the second image, where the first information represents changes in the outer contours of the multiple objects in the first image and the second image and the second information represents changes in the internal textures of the multiple objects in the first image and the second image; and whether the multiple objects have been moved is determined according to the first information and the second information. In the solution provided by the embodiments of the present application, for a target area in which multiple objects are placed in stacked form, changes in the positions of the objects are recognized from both a global and a local perspective according to images acquired at different times; in other words, whether the multiple objects have been moved is judged according to the changes in the outer contours of the multiple objects in the images (i.e., the global perspective) and the changes in the internal textures of the multiple objects in the images (i.e., the local perspective). In this way, the state of the objects can be monitored through computer vision technology (i.e., by processing images of the target area), so that whether an object has been moved is judged intelligently, avoiding the waste of human resources caused by manual inspection; moreover, compared with manual inspection, by recognizing changes in object positions from both the global and the local perspective, subtle object movement events that are difficult for the human eye to perceive can be recognized, improving the accuracy of the judgment of whether an object has been moved.
Brief description of the drawings
FIG. 1 is a schematic flowchart of an image processing method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of contour information and texture information of goods (i.e., objects) in an application embodiment of the present application;
FIG. 3 is a schematic diagram of goods area detection in an application embodiment of the present application;
FIG. 4 is a schematic diagram of goods area segmentation in an application embodiment of the present application;
FIG. 5 is a schematic diagram of goods texture detection in an application embodiment of the present application;
FIG. 6 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed description
The present application is described in further detail below with reference to the accompanying drawings and embodiments.
In the related art, whether an object has been moved is usually judged by manual inspection; for example, goods in a warehouse are watched over through manual patrols. This approach wastes human resources, and some subtle object movement events are difficult for the human eye to perceive.
Based on this, in various embodiments of the present application, for a target area in which multiple objects are placed in stacked form, changes in the positions of the objects are recognized from both a global and a local perspective according to images acquired at different times; in other words, whether the multiple objects have been moved is judged according to the changes in the outer contours of the multiple objects in the images (i.e., the global perspective) and the changes in the internal textures of the multiple objects in the images (i.e., the local perspective). In this way, the state of the objects can be monitored through computer vision technology (i.e., by processing images of the target area), so that whether an object has been moved is judged intelligently, avoiding the waste of human resources caused by manual inspection; moreover, compared with manual inspection, by recognizing changes in object positions from both the global and the local perspective, subtle object movement events that are difficult for the human eye to perceive can be recognized, improving the accuracy of the judgment of whether an object has been moved.
An embodiment of the present application provides an image processing method applied to an electronic device (for example, a server). As shown in FIG. 1, the method includes:
Step 101: acquiring a first image and a second image of a target area;
here, multiple objects placed in stacked form are present in the target area, and the image acquisition times corresponding to the first image and the second image are different;
Step 102: determining first information and second information according to the first image and the second image;
here, the first information represents changes in the outer contours of the multiple objects in the first image and the second image, and the second information represents changes in the internal textures of the multiple objects in the first image and the second image;
Step 103: determining, according to the first information and the second information, whether the multiple objects have been moved.
The first image corresponds to a first image acquisition time, and the second image corresponds to a second image acquisition time. It can be understood that determining whether the multiple objects have been moved means judging whether the multiple objects were moved within the time range from the first image acquisition time to the second image acquisition time.
In practical applications, since the multiple objects are placed in stacked form, they can be regarded as a whole. When an object located at the edge of the stack is moved, the outer contours of the multiple objects in the first image and the second image change; when an object located in the interior of the stack is moved, the internal textures of the multiple objects in the first image and the second image change. The internal texture can be understood as information such as the contours of the objects located in the interior of the stack and the patterns on their outer packaging.
In step 101, in practical applications, the acquiring the first image and the second image of the target area may include: acquiring, from an image acquisition apparatus, the first image and the second image of the target area collected by the image acquisition apparatus, where the position and the image acquisition angle of the image acquisition apparatus are fixed.
In practical applications, in some business scenarios there may be multiple objects placed in stacked form at multiple locations, for example goods stacked in separate piles in a warehouse. To improve image processing efficiency, the image acquisition apparatus may collect an image containing multiple stacks of objects; the electronic device may acquire, from the image acquisition apparatus, the image containing multiple stacks of objects, detect in that image the area where each stack of objects is located, and execute step 101 to step 103 for each detected area, so as to determine whether there is a moved object in each stack.
Based on this, in an embodiment, the acquiring the first image and the second image of the target area may include:
acquiring a ninth image and a tenth image of a first area; the first area at least includes the target area; the image acquisition times corresponding to the ninth image and the tenth image are different;
determining at least one second area in the first area according to the ninth image and the tenth image; multiple objects placed in stacked form are present in the second area;
determining a target area from the at least one second area, and cropping the ninth image and the tenth image based on the target area to obtain the first image and the second image.
Here, it can be understood that at least one stack of multiple objects placed in stacked form is present in the first area, and one stack of multiple objects placed in stacked form is present in each second area.
In practical applications, the electronic device may acquire, from the image acquisition apparatus, the ninth image and the tenth image collected by the image acquisition apparatus; the ninth image corresponds to the first image acquisition time, and the tenth image corresponds to the second image acquisition time. In addition, when the ninth image and the tenth image are collected, the position of the image acquisition apparatus may be fixed or not fixed, and its image acquisition angle may also be fixed or not fixed, which can be set according to requirements; this is not limited in the embodiments of the present application.
In practical applications, it can be understood that, in the case where the position and/or the image acquisition angle of the image acquisition apparatus is not fixed, the ninth image and the tenth image need to be processed by image comparison or similar means, so that the objects in the ninth image can be made to correspond to the objects in the tenth image.
In practical applications, a pre-trained model can be used to determine each stack of objects in the first area, that is, to determine at least one second area in the first area.
Based on this, in an embodiment, the determining at least one second area in the first area according to the ninth image and the tenth image may include:
determining the at least one second area in the first area using the ninth image, the tenth image and a third model; the third model is trained using an object detection algorithm.
In practical applications, the object detection algorithm may include yolo_v5, faster-rcnn, centerNet and the like, which can be set according to requirements; this is not limited in the embodiments of the present application.
In practical applications, the third model needs to be trained in advance. Specifically, a training data set can be determined; the training data set may contain a predetermined number (for example, 2000) of images, collected by the image acquisition apparatus, of a preset area (which can be set according to requirements and needs to contain multiple objects placed in stacked form). Each stack of objects in each image is framed, and the coordinate information corresponding to each stack of objects is recorded (i.e., annotated); after the annotation is completed, the third model is trained using the annotated data and the object detection algorithm.
In practical applications, it can be understood that, in the case where the position and/or the image acquisition angle of the image acquisition apparatus is not fixed, the ninth image needs to be input into the third model to obtain at least one candidate second area output by the third model, and the tenth image is then input into the third model to obtain at least one candidate second area output by the third model; at least one second area is determined by associating the candidate second areas output by the third model based on the ninth image with the candidate second areas output by the third model based on the tenth image (a candidate second area that is output by the third model in both passes and corresponds to the same stack of objects can be determined as a second area).
In the case where the position and the image acquisition angle of the image acquisition apparatus are fixed, only the ninth image needs to be input into the third model to obtain the at least one second area output by the third model.
In practical applications, the second area may be a rectangle. Using the third model to detect the rectangular area occupied by each stack of objects in the image acquired through the image acquisition apparatus can ensure that the subsequent image processing flow is not disturbed by external information unrelated to the objects.
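As a minimal sketch of the cropping step, assuming the detection model returns axis-aligned boxes as (x1, y1, x2, y2) pixel coordinates (the box format and function name are assumptions for illustration, not part of the original disclosure):

```python
def crop_stacks(image, boxes):
    """Crop one sub-image per detected stack of goods.

    image: an H x W x 3 NumPy array decoded from the camera frame;
    boxes: list of (x1, y1, x2, y2) rectangles from the detector.
    """
    return [image[y1:y2, x1:x2].copy() for (x1, y1, x2, y2) in boxes]
```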
In step 102, in practical applications, a pre-trained model can be used to process the first image and the second image to determine the first information.
Based on this, in an embodiment, the determining the first information according to the first image and the second image may include:
binarizing the first image using a first model to obtain a third image, and binarizing the second image using the first model to obtain a fourth image; the first model is trained using a semantic segmentation algorithm; the pixel values of the pixels corresponding to the multiple objects in the third image and the fourth image are a first value, and the pixel values of the pixels other than those corresponding to the multiple objects (i.e., the pixels corresponding to non-objects) are a second value;
determining the first information by comparing the third image and the fourth image.
In practical applications, the first value may be 1 and the second value may be 0.
In practical applications, the semantic segmentation algorithm may include deeplab_v3, U-net and the like, which can be set according to requirements; this is not limited in the embodiments of the present application.
In practical applications, the first model needs to be trained in advance. Specifically, a training data set can be determined; the training data set may contain a predetermined number (for example, 2000) of images, collected by the image acquisition apparatus, of a preset area (which can be set according to requirements and needs to contain multiple objects placed in stacked form; the images may be the same as those used for training the third model). The coordinate position of the outer contour of each stack of objects in each image is marked, and the first model is trained using the marked data and the semantic segmentation algorithm.
In an embodiment, the determining the first information by comparing the third image and the fourth image may include:
determining multiple first coefficients according to the third image and the fourth image; each first coefficient represents whether the pixel value of a pixel in the third image is the same as its pixel value in the fourth image;
determining the first information using the multiple first coefficients.
In practical applications, the specific way of determining the first coefficients can be set according to requirements. For example, when the pixel value of a pixel in the third image is the same as its pixel value in the fourth image, the first coefficient may be equal to 0; when the pixel value of a pixel in the third image is different from its pixel value in the fourth image, the first coefficient may be equal to 1.
In an embodiment, the determining the first information using the multiple first coefficients may include:
determining a second coefficient using the multiple first coefficients; the second coefficient represents the degree of matching between the third image and the fourth image;
judging whether the second coefficient is greater than a first threshold; in the case where the second coefficient is greater than the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have changed; or, in the case where the second coefficient is less than or equal to the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have not changed.
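A minimal NumPy sketch of this contour comparison, assuming the binarized maps are 0/1 arrays of identical shape (a per-pixel XOR gives the first coefficients and their sum gives the second coefficient):

```python
import numpy as np

def contour_difference(seg_a, seg_b):
    """Sum of per-pixel XORs between two binary segmentation maps."""
    first_coeffs = np.logical_xor(seg_a.astype(bool), seg_b.astype(bool))
    return int(first_coeffs.sum())

def contour_changed(seg_a, seg_b, first_threshold):
    """True when the outer contour is judged to have changed."""
    return contour_difference(seg_a, seg_b) > first_threshold
```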
In practical applications, the specific way of calculating the second coefficient can be set according to requirements. It can be understood that the larger the second coefficient, the lower the degree of matching between the third image and the fourth image.
In practical applications, the first threshold can be determined by evaluating the performance of the first model on a preset validation data set.
Based on this, in an embodiment, the method may further include:
determining a first probability, a second probability, a third probability and a fourth probability of the first model; the first probability represents the probability that the first model recognizes an object in an input image (for example, the first image and the second image) as an object, that is, the probability of setting the pixel value of an object pixel to the first value during binarization; the second probability represents the probability that the first model recognizes an object in an input image as a non-object, that is, the probability of setting the pixel value of an object pixel to the second value during binarization; the third probability represents the probability that the first model recognizes a non-object in an input image as an object, that is, the probability of setting the pixel value of a non-object pixel to the first value during binarization; the fourth probability represents the probability that the first model recognizes a non-object in an input image as a non-object, that is, the probability of setting the pixel value of a non-object pixel to the second value during binarization;
determining the first threshold using the first probability, the second probability, the third probability and the fourth probability.
Here, the first probability, the second probability, the third probability and the fourth probability can be determined by evaluating the performance of the first model on a preset validation data set.
In practical applications, the specific way of determining the first threshold using the first probability, the second probability, the third probability and the fourth probability can be set according to requirements. For example, the first threshold can be determined using these four probabilities in combination with ideas such as the Bernoulli distribution, the binomial distribution, the central limit theorem, the Gaussian distribution and the 3σ rule.
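As one illustrative reading of this idea (a sketch under the stated Bernoulli, binomial, central-limit and 3σ assumptions, not the only possible threshold rule), the threshold can be computed from the probability that two independent predictions disagree on an unchanged pixel:

```python
import math

def three_sigma_threshold(n, p_agree):
    """3-sigma upper bound on the XOR sum over n pixels.

    p_agree: probability that two independent model predictions agree on a
    truly unchanged pixel, e.g. p_tt**2 + p_tf**2 for object pixels.
    """
    p_diff = 1.0 - p_agree
    mean = n * p_diff
    std = math.sqrt(n * p_diff * (1.0 - p_diff))
    return mean + 3.0 * std

# Example: a 300 x 400 crop with p_tt = 0.98 and p_tf = 0.02
print(three_sigma_threshold(300 * 400, 0.98**2 + 0.02**2))
```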
In step 102, in practical applications, a pre-trained model can also be used to process the first image and the second image to determine the second information.
Based on this, in an embodiment, the determining the second information according to the first image and the second image may include:
binarizing the first image using a second model to obtain a fifth image, and binarizing the second image using the second model to obtain a sixth image; the second model is trained using an edge detection algorithm; the pixel values of the pixels corresponding to edges in the fifth image and the sixth image are the first value, and the pixel values of the pixels corresponding to non-edges are the second value;
determining the second information using at least the fifth image and the sixth image.
Here, the first value may be 1 and the second value may be 0.
In practical applications, the edge detection algorithm may include PiDiNet and the like, which can be set according to requirements; this is not limited in the embodiments of the present application.
In practical applications, the second model can be trained in advance, or the second model can be an open-source model.
In practical applications, to further ensure that the subsequent image processing flow is not disturbed by external information unrelated to the objects, the second information can be determined using the third image, the fourth image, the fifth image and the sixth image.
Based on this, in an embodiment, the determining the second information using at least the fifth image and the sixth image may include:
binarizing the first image using a first model to obtain a third image, and binarizing the second image using the first model to obtain a fourth image; the first model is trained using a semantic segmentation algorithm; the pixel values of the pixels corresponding to the multiple objects in the third image and the fourth image are the first value, and the pixel values of the other pixels are the second value;
determining the second information using the third image, the fourth image, the fifth image and the sixth image.
Specifically, in an embodiment, the determining the second information using the third image, the fourth image, the fifth image and the sixth image may include:
multiplying the third image and the fifth image element-wise to obtain a seventh image, and multiplying the fourth image and the sixth image element-wise to obtain an eighth image;
determining the second information by comparing the seventh image and the eighth image.
Here, multiplying the third image and the fifth image element-wise, and multiplying the fourth image and the sixth image element-wise, can exclude the interference of external information unrelated to the objects; in other words, the seventh image and the eighth image do not contain external information unrelated to the objects. Determining the second information by comparing the seventh image and the eighth image can therefore ensure that the subsequent image processing flow is not disturbed by such information, further improving the accuracy of the judgment result.
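A one-line NumPy sketch of the element-wise multiplication, assuming 0/1 arrays for both binarized maps:

```python
import numpy as np

def masked_edges(seg_map, edge_map):
    """Element-wise product keeping only edges inside the goods region
    (producing the seventh or eighth image)."""
    return seg_map.astype(np.uint8) * edge_map.astype(np.uint8)
```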
In practical applications, the number of objects placed in stacked form may be large. To further improve the accuracy of the judgment result, the seventh image and the eighth image can first be divided into grids, and the grids of the seventh image and the grids of the eighth image are then compared grid by grid to determine local texture changes among the multiple objects placed in stacked form.
Based on this, in an embodiment, the determining the second information by comparing the seventh image and the eighth image includes:
dividing the seventh image based on a preset rule to obtain multiple first grids, and dividing the eighth image based on the preset rule to obtain multiple second grids;
determining multiple third coefficients according to the multiple first grids and the multiple second grids; each third coefficient represents the degree of matching between a first grid and the corresponding second grid;
determining the second information using the multiple third coefficients.
Here, the preset rule can be set according to requirements. For example, the preset rule may include: dividing the image into H×W grids, where H and W are both integers greater than 0, H and W may be the same or different, and the specific values of H and W can be set empirically.
Specifically, in an embodiment, the determining the second information using the multiple third coefficients may include:
judging whether each third coefficient is greater than a second threshold;
in the case where at least one third coefficient is greater than the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have changed; or, in the case where every third coefficient is less than or equal to the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed.
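A minimal sketch of the grid-wise comparison, assuming the masked edge maps are 2-D 0/1 arrays and that H and W divide the image size evenly (handling of remainders is omitted):

```python
import numpy as np

def grid_coefficients(tex_a, tex_b, H, W):
    """Per-grid XOR sums (the third coefficients), keyed by grid index."""
    img_h, img_w = tex_a.shape
    cell_h, cell_w = img_h // H, img_w // W
    coeffs = {}
    for h in range(H):
        for w in range(W):
            a = tex_a[h * cell_h:(h + 1) * cell_h, w * cell_w:(w + 1) * cell_w]
            b = tex_b[h * cell_h:(h + 1) * cell_h, w * cell_w:(w + 1) * cell_w]
            coeffs[(h, w)] = int(np.logical_xor(a > 0, b > 0).sum())
    return coeffs

def flagged_grids(coeffs, second_threshold):
    """Grid identifiers whose coefficient exceeds the second threshold."""
    return [idx for idx, c in coeffs.items() if c > second_threshold]
```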
In practical applications, the second threshold can be determined by evaluating the performance of the second model on a preset validation data set.
Based on this, in an embodiment, the method may further include:
determining a fifth probability, a sixth probability, a seventh probability and an eighth probability of the second model; the fifth probability represents the probability that the second model recognizes an edge in an input image (for example, the first image and the second image) as an edge, that is, the probability of setting the pixel value of an edge pixel to the first value during binarization; the sixth probability represents the probability that the second model recognizes an edge in an input image as a non-edge, that is, the probability of setting the pixel value of an edge pixel to the second value during binarization; the seventh probability represents the probability that the second model recognizes a non-edge in an input image as an edge, that is, the probability of setting the pixel value of a non-edge pixel to the first value during binarization; the eighth probability represents the probability that the second model recognizes a non-edge in an input image as a non-edge, that is, the probability of setting the pixel value of a non-edge pixel to the second value during binarization;
determining the second threshold using the fifth probability, the sixth probability, the seventh probability and the eighth probability.
Here, the fifth probability, the sixth probability, the seventh probability and the eighth probability can be determined by evaluating the performance of the second model on a preset validation data set.
In practical applications, the specific way of determining the second threshold using the fifth probability, the sixth probability, the seventh probability and the eighth probability can be set according to requirements. For example, the second threshold can be determined using these four probabilities in combination with ideas such as the Bernoulli distribution, the binomial distribution, the central limit theorem, the Gaussian distribution and the 3σ rule, in the same way as the first threshold.
For step 103, in an embodiment, the determining whether the multiple objects have been moved according to the first information and the second information may include:
in the case where the first information represents that the outer contours of the multiple objects in the first image and the second image have changed, and/or in the case where the second information represents that the internal textures of the multiple objects in the first image and the second image have changed, determining that at least one object among the multiple objects has been moved;
or,
in the case where the first information represents that the outer contours of the multiple objects in the first image and the second image have not changed, and the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed, determining that the multiple objects have not been moved.
In practical applications, when it is determined that at least one object among the multiple objects has been moved, alarm information can be sent to a target device to prompt the user that there is a moved object in the target area.
Based on this, in an embodiment, the method may further include:
sending alarm information when it is determined that at least one object among the multiple objects has been moved; where
the second information is determined using multiple third coefficients; each third coefficient represents the degree of matching between a first grid and the corresponding second grid; the first image corresponds to multiple first grids; the second image corresponds to multiple second grids; in the case where the second information represents that the internal textures of the multiple objects in the first image and the second image have changed, the alarm information includes at least one grid identifier; each grid identifier corresponds to a third coefficient greater than the second threshold; the at least one grid identifier is used to locate the moved object.
In practical applications, the specific recipient of the alarm information (i.e., the target device) can be set according to requirements; this is not limited in the embodiments of the present application.
In practical applications, based on the image processing method provided by the embodiments of the present application, whether there is a moved object in a designated area (for example, the first area) can be monitored. The ninth image can serve as the original-state image, that is, the ninth image can reflect the original state of the objects (for example, their state at the time of warehousing); the tenth image can serve as the current-state image, that is, the tenth image can reflect the current state of the objects. In addition, the ninth image can be updated according to the business status corresponding to the multiple objects (for example, goods newly put into or taken out of the warehouse), and the tenth image can be updated periodically or on trigger. Periodic updating may include the image acquisition apparatus collecting a tenth image of the first area at a preset period (which can be set according to requirements, for example n seconds, where n is an integer greater than 0) and sending it to the electronic device; triggered updating may include the electronic device acquiring a tenth image from the image acquisition apparatus when receiving a detection instruction from another device (for example, a terminal).
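A minimal sketch of such a periodic update loop; the camera.capture() interface, the check_pair callback and the fixed period are assumptions for illustration only:

```python
import time

def monitor(camera, check_pair, period_s=60):
    """Compare periodically captured frames against a stored baseline."""
    baseline = camera.capture()        # ninth image: original state
    while True:
        time.sleep(period_s)
        current = camera.capture()     # tenth image: current state
        if check_pair(baseline, current):
            print("alarm: goods moved, requesting manual verification")
```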
According to the image processing method provided by the embodiments of the present application, a first image and a second image of a target area are acquired, where multiple objects placed in stacked form are present in the target area and the image acquisition times corresponding to the first image and the second image are different; first information and second information are determined according to the first image and the second image, where the first information represents changes in the outer contours of the multiple objects in the first image and the second image and the second information represents changes in the internal textures of the multiple objects in the first image and the second image; and whether the multiple objects have been moved is determined according to the first information and the second information. In the solution provided by the embodiments of the present application, for a target area in which multiple objects are placed in stacked form, changes in the positions of the objects are recognized from both a global and a local perspective according to images acquired at different times; in other words, whether the multiple objects have been moved is judged according to the changes in the outer contours of the multiple objects in the images (i.e., the global perspective) and the changes in the internal textures of the multiple objects in the images (i.e., the local perspective). In this way, the state of the objects can be monitored through computer vision technology (i.e., by processing images of the target area), so that whether an object has been moved is judged intelligently, avoiding the waste of human resources caused by manual inspection; moreover, compared with manual inspection, by recognizing changes in object positions from both the global and the local perspective, subtle object movement events that are difficult for the human eye to perceive can be recognized, improving the accuracy of the judgment of whether an object has been moved.
The present application is described in further detail below with reference to an application embodiment.
This application embodiment provides a computer-vision-based scheme for watching over warehouse goods (i.e., the above objects), which recognizes changes in the goods from both a global and a local perspective. A change of the goods at the global level is a change in the outer contour; a change at the local level is a change in some textures of the area corresponding to the goods in the image. Goods are generally stored in a warehouse in stacked form; if goods at the edge of the goods area in the image are moved, the contour information at the corresponding position changes (i.e., the above first information), the contour information being shown by box 201 in FIG. 2. If goods inside the goods area in the image are moved, the texture information at the corresponding position changes (i.e., the above second information), the texture information being shown by box 202 in FIG. 2.
In this application embodiment, in order to effectively recognize the above phenomena (i.e., changes in contour information and changes in texture information), the warehouse goods watching scheme includes the following steps:
Step 1: goods area detection using an area detection model (i.e., the above third model);
Step 2: goods area segmentation using a segmentation model (i.e., the above first model);
Step 3: goods texture detection using an edge detection model (i.e., the above second model);
Step 4: final matching.
The four steps complement one another; the specific implementation of each step is described below.
First, the specific implementation of goods area detection using the area detection model (i.e., the above third model) is described.
In this application embodiment, the role of the area detection model is to detect, from the image acquired through the surveillance camera (i.e., the above image acquisition apparatus; for example, the above ninth image and tenth image), the rectangular area occupied by the goods (i.e., the above second area), so as to ensure that the subsequent flow is not disturbed by external information unrelated to the goods. The specific effect is shown in FIG. 3: each stack of goods is framed, yielding a corresponding rectangular box (for example, rectangular box 301).
In this application embodiment, the area detection model performs area detection using the yolo_v5 algorithm. When training the area detection model, about 2000 pictures captured by cameras in the warehouse are selected, each stack of goods in each picture is framed, and the coordinate information corresponding to each rectangular box is recorded. After the annotation is completed, the area detection model is trained using the annotated data. The resulting area detection model can detect each stack of goods in newly collected pictures and mark the corresponding rectangular box.
Second, the specific implementation of goods area segmentation using the segmentation model (i.e., the above first model) is described.
In this application embodiment, the role of the segmentation model is to obtain the outer contour of each stack of goods, so as to judge whether the contour information of the goods has changed. The segmentation model performs image processing using the deeplab_v3 algorithm; the output of the segmentation model (as shown in FIG. 4) is a matrix of the same size as the input picture (for example, the above third image and fourth image), in which the values at the pixel positions corresponding to the goods are 1 (i.e., the above first value) and the values of the remaining part (i.e., the non-goods part) are 0 (i.e., the above second value).
In this application embodiment, marked data needs to be obtained in advance and used to train the segmentation model. The way of obtaining the marked data may include: selecting about 2000 pictures captured by cameras in the warehouse (the pictures used in training the area detection model can be reused), marking the coordinate positions of the outer contour of each stack of goods in the pictures, and training the segmentation model according to the marks. After training, the segmentation model analyzes a new input picture and generates a 0/1 matrix of the same size as the input picture, in which the area corresponding to the goods is 1 and the remaining part is 0.
Third, the specific implementation of goods texture detection using the edge detection model (i.e., the above second model) is described.
In this application embodiment, the role of the edge detection model is to recognize local textures in the input pictures (for example, the above first image and second image), so as to judge whether the texture information has changed from the original. The edge detection model uses PiDiNet and extracts important textures by recognizing areas of abrupt change inside the picture; the output image of the edge detection model, as shown in FIG. 5, is a 0/1 matrix of the same size as the original picture, in which the values at the pixel positions corresponding to edges are 1 (i.e., the above first value) and the remaining positions are 0 (i.e., the above second value).
In practical applications, since edge detection is a general-purpose algorithm and is not limited to edge detection of goods, the edge detection model can be an open-source model and does not need to be retrained.
Fourth, the final matching process is described in detail.
In this application embodiment, the three steps of goods area detection, goods area segmentation and goods texture detection all serve the final recognition (i.e., final matching) process. After the goods enter the warehouse, an original picture of the goods is captured by the camera used to watch over them, and subsequent pictures are then collected periodically; each time a subsequent picture is obtained, it needs to be compared with the original picture to judge whether the goods have moved. Both the original picture and the subsequently obtained pictures go through goods area detection, goods area segmentation and goods texture detection before the final comparison.
In practical applications, in a warehouse scenario, the position and angle of the surveillance camera are usually fixed. Therefore, goods area detection can be performed only on the original goods picture, and the resulting detection boxes are also used on the subsequently obtained goods pictures. However, both the original goods picture and the subsequently collected goods pictures need to go through goods area segmentation and goods texture detection.
In practical applications, when the status of the goods is updated, for example goods are newly put into or taken out of the warehouse, the original picture can be updated.
In this application embodiment, after the above steps (i.e., goods area detection, goods area segmentation and goods texture detection) are completed, the original goods picture (denoted S, i.e., the above ninth image) and the subsequently obtained goods picture (denoted D, i.e., the above tenth image) need to be compared: first the contour information, then the local information. A picture may contain multiple stacks of goods, and each stack needs to be compared separately. Since the detection box corresponding to each stack of goods (i.e., the second area) has already been obtained in the goods area detection process, the pictures S and D are cropped according to the detection boxes to obtain the sub-pictures of each stack of goods, denoted S(1), S(2), ... S(n) and D(1), D(2), ... D(n), respectively. Suppose the sub-pictures S(i) (i.e., the above first image) and D(i) (i.e., the above second image) are currently to be compared; the comparison of contour information then includes the following steps:
1) Obtain the segmentation maps corresponding to the sub-pictures S(i) and D(i), denoted Sg(i) (i.e., the above third image) and Dg(i) (i.e., the above fourth image).
2) Compute the difference coefficient (i.e., the above second coefficient) by the following formula:
$\mathrm{diff}(S_g(i), D_g(i)) = \sum_{(x,y)} S_g(i)(x,y) \oplus D_g(i)(x,y)$
where (x,y) denotes the coordinates of a position in the map; $S_g(i)(x,y)$ and $D_g(i)(x,y)$ denote the values at that position in the two segmentation maps (i.e., Sg(i) and Dg(i)); and $\oplus$ denotes the XOR operation: if the segmentation model makes the same judgment on whether a certain pixel belongs to the goods part in the original goods picture and in the later collected picture, the output of $S_g(i)(x,y) \oplus D_g(i)(x,y)$ (i.e., the above first coefficient) is 0, otherwise it is 1. The larger the value of the difference coefficient, the worse Sg(i) and Dg(i) match.
3) If the difference coefficient is greater than a certain threshold (i.e., the above first threshold), it is judged that the stack of goods corresponding to the sub-picture has moved.
The comparison of texture information includes the following steps:
1) Obtain the segmentation maps corresponding to the sub-pictures S(i) and D(i), denoted Sg(i) and Dg(i).
2) Obtain the edge maps corresponding to the sub-pictures S(i) and D(i) (the outputs of the edge detection model), denoted Se(i) (i.e., the above fifth image) and De(i) (i.e., the above sixth image).
3) Multiply Sg(i) and Se(i) element-wise to obtain St(i) (i.e., the above seventh image), and multiply Dg(i) and De(i) element-wise to obtain Dt(i) (i.e., the above eighth image).
4) Divide St(i) and Dt(i) each into H×W grids (i.e., the above preset rule; H and W are both integers greater than 0 and may be the same or different); the part corresponding to each grid is denoted St(i)(h,w) and Dt(i)(h,w) respectively; H and W can be set empirically.
5) Compute the difference coefficient (i.e., the above third coefficient) for each grid by the following formula:
$\mathrm{diff}(S_t(i)(h,w), D_t(i)(h,w)) = \sum_{(x,y) \in \mathrm{grid}(h,w)} S_t(i)(h,w)(x,y) \oplus D_t(i)(h,w)(x,y)$
where the symbols have meanings analogous to those in the contour comparison and are not repeated here.
In addition, the reason for dividing into grids first and then computing the difference coefficients is that a movement may occur only in some local part; if the whole picture were compared directly, the resulting difference would be relatively small, making it difficult to accurately judge whether the goods have moved, whereas dividing into grids effectively solves this problem. Dividing into grids also helps locate the specific position where the movement occurred.
6) If a difference coefficient is greater than a certain threshold (i.e., the above second threshold), it is judged that the part (i.e., area) corresponding to the respective grid has moved.
As can be seen from the above steps, both the contour comparison and the texture comparison require comparing a difference coefficient with a corresponding threshold, and setting the threshold is usually a difficult problem. Therefore, in this application embodiment, the thresholds are derived by statistical means to help make more accurate judgments.
In this application embodiment, when comparing contour information, the threshold (i.e., the first threshold) is derived as follows:
Obtain the probabilities that the segmentation model recognizes a goods part as a goods part, recognizes a goods part as another part, recognizes another part as a goods part, and recognizes another part as another part, denoted pTT (i.e., the first probability), pTF (i.e., the second probability), pFT (i.e., the third probability) and pFF (i.e., the fourth probability) respectively; these probability values can be obtained by evaluating the performance of the segmentation model on a validation data set. For a position (x,y), suppose its true value is denoted e(x,y) and the value computed through the model is $\hat{e}(x,y) = S_g(i)(x,y) \oplus D_g(i)(x,y)$. If the two sub-pictures match completely, then e(x,y) = 0 for every (x,y). The probability that the computed $\hat{e}(x,y) = 0$ can be expressed as:
$P(\hat{e}(x,y)=0) = p_{TT}^2 + p_{TF}^2$ for a pixel that truly belongs to the goods part (and $p_{FT}^2 + p_{FF}^2$ for other pixels),
and the probability that the computed $\hat{e}(x,y) = 1$ can be expressed as:
$p_1 = P(\hat{e}(x,y)=1) = 1 - P(\hat{e}(x,y)=0)$, i.e., $2\,p_{TT}\,p_{TF}$ for goods pixels (and $2\,p_{FT}\,p_{FF}$ for other pixels).
Therefore, in the case of e(x,y) = 0, $\hat{e}(x,y)$ follows a Bernoulli distribution, and $\sum_{(x,y)} \hat{e}(x,y)$ follows a binomial distribution $B(n, p_1)$, where n is the number of pixels in the sub-picture. According to the central limit theorem, this can be approximated by a Gaussian distribution $N(n p_1,\; n p_1 (1-p_1))$; according to the 3σ rule, if the two pictures match completely, the maximum value of $\sum_{(x,y)} \hat{e}(x,y)$ should not exceed $n p_1 + 3\sqrt{n p_1 (1-p_1)}$. Therefore, the threshold of the difference coefficient (the first threshold) is set as:
$T_1 = n p_1 + 3\sqrt{n p_1 (1-p_1)}$.
In this application embodiment, when comparing texture information, the threshold (i.e., the second threshold) is derived as follows:
Obtain the probabilities that the edge detection model recognizes an edge as an edge, an edge as a non-edge, a non-edge as an edge, and a non-edge as a non-edge, denoted qTT (i.e., the fifth probability), qTF (i.e., the sixth probability), qFT (i.e., the seventh probability) and qFF (i.e., the eighth probability) respectively; these probability values can be obtained by evaluating the performance of the edge detection model on a validation data set. For a position (x,y), suppose its true value is denoted g(x,y) and the value computed through the model is $\hat{g}(x,y) = S_t(i)(h,w)(x,y) \oplus D_t(i)(h,w)(x,y)$. If two grids match completely, then g(x,y) = 0 for every (x,y). The probability that the computed $\hat{g}(x,y) = 0$ can be expressed as:
$P(\hat{g}(x,y)=0) = q_{TT}^2 + q_{TF}^2$ for a pixel that truly belongs to an edge (and $q_{FT}^2 + q_{FF}^2$ for other pixels),
and the probability that the computed $\hat{g}(x,y) = 1$ can be expressed as:
$q_1 = P(\hat{g}(x,y)=1) = 1 - P(\hat{g}(x,y)=0)$.
Repeating the same derivation as for the contour-comparison threshold (i.e., the first threshold), the second threshold should be set as:
$T_2 = m q_1 + 3\sqrt{m q_1 (1-q_1)}$,
where m is the number of pixels in one grid.
In this application embodiment, the specific flow of the image processing process is as follows:
1) Model training. Label the data according to the preset rules, train the area detection model and the segmentation model, and obtain the edge detection model. At the same time, use the validation data set to estimate the probability values pTT, pTF, pFT, pFF of the segmentation model and qTT, qTF, qFT, qFF of the edge detection model respectively.
2) Goods area detection. After the goods are warehoused or their status is updated, use the corresponding camera to collect the original picture of the goods, and use the detection model to obtain the detection box corresponding to each stack of goods.
3) Goods area segmentation. Perform goods area segmentation on the original goods picture to obtain the corresponding segmentation map; as long as the goods status is not updated, this operation only needs to be performed once. Perform goods area segmentation on the periodically collected goods pictures to obtain the corresponding segmentation maps; this is done every time a new picture is collected.
4) Goods texture detection. Perform texture detection on the original goods picture using the edge detection algorithm to obtain the corresponding edge picture; as long as the goods status is not updated, this operation only needs to be performed once. Perform edge detection on the periodically collected goods pictures to obtain the corresponding edge pictures; this is done every time a new picture is collected.
5) Contour comparison. Obtain the segmentation sub-maps Sg(i) and Dg(i) corresponding to each pair of sub-pictures S(i) and D(i), compute their difference coefficient (i.e., the second coefficient), and derive the corresponding threshold (i.e., the first threshold). If the difference coefficient is higher than the threshold, it is judged that the corresponding goods have been moved; an alarm is raised and manual verification is requested.
6) Texture comparison. Obtain the grids corresponding to each pair of sub-pictures S(i) and D(i), and the edge sub-maps St(i)(h,w) and Dt(i)(h,w) corresponding to each grid; compute their difference coefficients (i.e., the third coefficients) and derive the corresponding threshold (i.e., the second threshold). If a difference coefficient is higher than the threshold, it is judged that the corresponding area of the goods has been moved; an alarm is raised and manual verification is requested.
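Reusing the helper sketches above (contour_difference, masked_edges, grid_coefficients and flagged_grids), steps 5) and 6) can be combined per pair of sub-pictures roughly as follows; the function signature and default grid size are illustrative assumptions:

```python
def stack_moved(seg_s, seg_d, edge_s, edge_d, t1, t2, H=6, W=8):
    """Contour check first (step 5), then grid-wise texture check (step 6).

    Returns (moved, flagged), where flagged lists grid identifiers whose
    third coefficient exceeded the second threshold t2.
    """
    if contour_difference(seg_s, seg_d) > t1:
        return True, []
    tex_s = masked_edges(seg_s, edge_s)
    tex_d = masked_edges(seg_d, edge_d)
    flagged = flagged_grids(grid_coefficients(tex_s, tex_d, H, W), t2)
    return len(flagged) > 0, flagged
```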
The scheme provided by this application embodiment uses computer vision technology to watch over the goods in a warehouse through the warehouse's surveillance cameras and judge whether they have been moved. Image segmentation and edge detection techniques are used to obtain the contour and texture information of the goods, the state (i.e., position) of the goods is compared according to these two kinds of information, and whether the goods have been moved is judged from both the global and the local dimension. In addition, statistical methods are used to derive the thresholds for the difference between the original state and the later collected state of the goods, helping to judge more precisely whether the goods have been moved.
The scheme provided by this application embodiment implements, based on computer vision technology, monitoring of the state (i.e., position) of warehouse goods; if a change in the state of the goods (i.e., a position change) is found, an alarm is raised in time and manual verification is requested. Since a large number of surveillance cameras generally exist in warehouses, this approach can make full use of existing resources and effectively reduce the consumption of manpower. At the same time, computer vision can also recognize subtle changes that are not easily perceived by the human eye.
In order to implement the method of the embodiments of the present application, an embodiment of the present application further provides an image processing apparatus arranged on an electronic device (for example, installed on a server). As shown in FIG. 6, the apparatus includes:
a first processing unit 601 configured to acquire a first image and a second image of a target area; multiple objects placed in stacked form are present in the target area; the image acquisition times corresponding to the first image and the second image are different;
a second processing unit 602 configured to determine first information and second information according to the first image and the second image; the first information represents changes in the outer contours of the multiple objects in the first image and the second image; the second information represents changes in the internal textures of the multiple objects in the first image and the second image;
a third processing unit 603 configured to determine, according to the first information and the second information, whether the multiple objects have been moved.
In an embodiment, the second processing unit 602 is further configured to:
binarize the first image using a first model to obtain a third image, and binarize the second image using the first model to obtain a fourth image; the first model is trained using a semantic segmentation algorithm; the pixel values of the pixels corresponding to the multiple objects in the third image and the fourth image are a first value, and the pixel values of the other pixels are a second value;
determine the first information by comparing the third image and the fourth image.
In an embodiment, the second processing unit 602 is further configured to:
determine multiple first coefficients according to the third image and the fourth image; each first coefficient represents whether the pixel value of a pixel in the third image is the same as its pixel value in the fourth image;
determine the first information using the multiple first coefficients.
In an embodiment, the second processing unit 602 is further configured to:
determine a second coefficient using the multiple first coefficients; the second coefficient represents the degree of matching between the third image and the fourth image;
judge whether the second coefficient is greater than a first threshold; in the case where the second coefficient is greater than the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have changed; or, in the case where the second coefficient is less than or equal to the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have not changed.
In an embodiment, the second processing unit 602 is further configured to:
determine a first probability, a second probability, a third probability and a fourth probability of the first model; the first probability represents the probability that the first model recognizes an object in an input image as an object; the second probability represents the probability that the first model recognizes an object in an input image as a non-object; the third probability represents the probability that the first model recognizes a non-object in an input image as an object; the fourth probability represents the probability that the first model recognizes a non-object in an input image as a non-object;
determine the first threshold using the first probability, the second probability, the third probability and the fourth probability.
In an embodiment, the second processing unit 602 is further configured to:
binarize the first image using a second model to obtain a fifth image, and binarize the second image using the second model to obtain a sixth image; the second model is trained using an edge detection algorithm; the pixel values of the pixels corresponding to edges in the fifth image and the sixth image are the first value, and the pixel values of the pixels corresponding to non-edges are the second value;
determine the second information using at least the fifth image and the sixth image.
In an embodiment, the second processing unit 602 is further configured to:
binarize the first image using a first model to obtain a third image, and binarize the second image using the first model to obtain a fourth image; the first model is trained using a semantic segmentation algorithm; the pixel values of the pixels corresponding to the multiple objects in the third image and the fourth image are the first value, and the pixel values of the other pixels are the second value;
determine the second information using the third image, the fourth image, the fifth image and the sixth image.
In an embodiment, the second processing unit 602 is further configured to:
multiply the third image and the fifth image element-wise to obtain a seventh image, and multiply the fourth image and the sixth image element-wise to obtain an eighth image;
determine the second information by comparing the seventh image and the eighth image.
In an embodiment, the second processing unit 602 is further configured to:
divide the seventh image based on a preset rule to obtain multiple first grids, and divide the eighth image based on the preset rule to obtain multiple second grids;
determine multiple third coefficients according to the multiple first grids and the multiple second grids; each third coefficient represents the degree of matching between a first grid and the corresponding second grid;
determine the second information using the multiple third coefficients.
In an embodiment, the second processing unit 602 is further configured to:
judge whether each third coefficient is greater than a second threshold;
in the case where at least one third coefficient is greater than the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have changed; or, in the case where every third coefficient is less than or equal to the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed.
In an embodiment, the second processing unit 602 is further configured to:
determine a fifth probability, a sixth probability, a seventh probability and an eighth probability of the second model; the fifth probability represents the probability that the second model recognizes an edge in an input image as an edge; the sixth probability represents the probability that the second model recognizes an edge in an input image as a non-edge; the seventh probability represents the probability that the second model recognizes a non-edge in an input image as an edge; the eighth probability represents the probability that the second model recognizes a non-edge in an input image as a non-edge;
determine the second threshold using the fifth probability, the sixth probability, the seventh probability and the eighth probability.
In an embodiment, the third processing unit 603 is further configured to:
in the case where the first information represents that the outer contours of the multiple objects in the first image and the second image have changed, and/or in the case where the second information represents that the internal textures of the multiple objects in the first image and the second image have changed, determine that at least one object among the multiple objects has been moved;
or,
in the case where the first information represents that the outer contours of the multiple objects in the first image and the second image have not changed, and the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed, determine that the multiple objects have not been moved.
In an embodiment, the apparatus further includes a communication unit; the third processing unit 603 is further configured to send alarm information through the communication unit when it is determined that at least one object among the multiple objects has been moved; where
the second information is determined using multiple third coefficients; each third coefficient represents the degree of matching between a first grid and the corresponding second grid; the first image corresponds to multiple first grids; the second image corresponds to multiple second grids; in the case where the second information represents that the internal textures of the multiple objects in the first image and the second image have changed, the alarm information includes at least one grid identifier; each grid identifier corresponds to a third coefficient greater than the second threshold; the at least one grid identifier is used to locate the moved object.
In an embodiment, the first processing unit 601 is further configured to:
acquire a ninth image and a tenth image of a first area; the first area at least includes the target area; the image acquisition times corresponding to the ninth image and the tenth image are different;
determine at least one second area in the first area according to the ninth image and the tenth image; multiple objects placed in stacked form are present in the second area;
determine a target area from the at least one second area, and crop the ninth image and the tenth image based on the target area to obtain the first image and the second image.
In an embodiment, the first processing unit 601 is further configured to determine the at least one second area in the first area using the ninth image, the tenth image and a third model; the third model is trained using an object detection algorithm.
In practical applications, the communication unit may be implemented by a communication interface in the image processing apparatus; the first processing unit 601, the second processing unit 602 and the third processing unit 603 may be implemented by a processor in the image processing apparatus.
It should be noted that, when the image processing apparatus provided in the above embodiments processes images, the division into the above program modules is only used as an example; in practical applications, the above processing can be allocated to different program modules as needed, that is, the internal structure of the apparatus can be divided into different program modules to complete all or part of the processing described above. In addition, the image processing apparatus provided in the above embodiments and the image processing method embodiments belong to the same concept; for the specific implementation process, refer to the method embodiments, which will not be repeated here.
Based on the hardware implementation of the above program modules, and in order to implement the method of the embodiments of the present application, an embodiment of the present application further provides an electronic device. As shown in FIG. 7, the electronic device 700 includes:
a communication interface 701 capable of information interaction with other electronic devices;
a processor 702 connected to the communication interface 701 to implement information interaction with other electronic devices, and configured to execute, when running a computer program, the method provided by one or more of the above technical solutions;
a memory 703 storing a computer program capable of running on the processor 702.
Specifically, the processor 702 is configured to:
acquire a first image and a second image of a target area; multiple objects placed in stacked form are present in the target area; the image acquisition times corresponding to the first image and the second image are different;
determine first information and second information according to the first image and the second image; the first information represents changes in the outer contours of the multiple objects in the first image and the second image; the second information represents changes in the internal textures of the multiple objects in the first image and the second image;
determine, according to the first information and the second information, whether the multiple objects have been moved.
In an embodiment, the processor 702 is further configured to:
binarize the first image using a first model to obtain a third image, and binarize the second image using the first model to obtain a fourth image; the first model is trained using a semantic segmentation algorithm; the pixel values of the pixels corresponding to the multiple objects in the third image and the fourth image are a first value, and the pixel values of the other pixels are a second value;
determine the first information by comparing the third image and the fourth image.
In an embodiment, the processor 702 is further configured to:
determine multiple first coefficients according to the third image and the fourth image; each first coefficient represents whether the pixel value of a pixel in the third image is the same as its pixel value in the fourth image;
determine the first information using the multiple first coefficients.
In an embodiment, the processor 702 is further configured to:
determine a second coefficient using the multiple first coefficients; the second coefficient represents the degree of matching between the third image and the fourth image;
judge whether the second coefficient is greater than a first threshold; in the case where the second coefficient is greater than the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have changed; or, in the case where the second coefficient is less than or equal to the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have not changed.
In an embodiment, the processor 702 is further configured to:
determine a first probability, a second probability, a third probability and a fourth probability of the first model; the first probability represents the probability that the first model recognizes an object in an input image as an object; the second probability represents the probability that the first model recognizes an object in an input image as a non-object; the third probability represents the probability that the first model recognizes a non-object in an input image as an object; the fourth probability represents the probability that the first model recognizes a non-object in an input image as a non-object;
determine the first threshold using the first probability, the second probability, the third probability and the fourth probability.
In an embodiment, the processor 702 is further configured to:
binarize the first image using a second model to obtain a fifth image, and binarize the second image using the second model to obtain a sixth image; the second model is trained using an edge detection algorithm; the pixel values of the pixels corresponding to edges in the fifth image and the sixth image are the first value, and the pixel values of the pixels corresponding to non-edges are the second value;
determine the second information using at least the fifth image and the sixth image.
In an embodiment, the processor 702 is further configured to:
binarize the first image using a first model to obtain a third image, and binarize the second image using the first model to obtain a fourth image; the first model is trained using a semantic segmentation algorithm; the pixel values of the pixels corresponding to the multiple objects in the third image and the fourth image are the first value, and the pixel values of the other pixels are the second value;
determine the second information using the third image, the fourth image, the fifth image and the sixth image.
In an embodiment, the processor 702 is further configured to:
multiply the third image and the fifth image element-wise to obtain a seventh image, and multiply the fourth image and the sixth image element-wise to obtain an eighth image;
determine the second information by comparing the seventh image and the eighth image.
In an embodiment, the processor 702 is further configured to:
divide the seventh image based on a preset rule to obtain multiple first grids, and divide the eighth image based on the preset rule to obtain multiple second grids;
determine multiple third coefficients according to the multiple first grids and the multiple second grids; each third coefficient represents the degree of matching between a first grid and the corresponding second grid;
determine the second information using the multiple third coefficients.
In an embodiment, the processor 702 is further configured to:
judge whether each third coefficient is greater than a second threshold;
in the case where at least one third coefficient is greater than the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have changed; or, in the case where every third coefficient is less than or equal to the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed.
In an embodiment, the processor 702 is further configured to:
determine a fifth probability, a sixth probability, a seventh probability and an eighth probability of the second model; the fifth probability represents the probability that the second model recognizes an edge in an input image as an edge; the sixth probability represents the probability that the second model recognizes an edge in an input image as a non-edge; the seventh probability represents the probability that the second model recognizes a non-edge in an input image as an edge; the eighth probability represents the probability that the second model recognizes a non-edge in an input image as a non-edge;
determine the second threshold using the fifth probability, the sixth probability, the seventh probability and the eighth probability.
In an embodiment, the processor 702 is further configured to:
in the case where the first information represents that the outer contours of the multiple objects in the first image and the second image have changed, and/or in the case where the second information represents that the internal textures of the multiple objects in the first image and the second image have changed, determine that at least one object among the multiple objects has been moved;
or,
in the case where the first information represents that the outer contours of the multiple objects in the first image and the second image have not changed, and the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed, determine that the multiple objects have not been moved.
In an embodiment, the processor 702 is further configured to send alarm information through the communication interface 701 when it is determined that at least one object among the multiple objects has been moved; where
the second information is determined using multiple third coefficients; each third coefficient represents the degree of matching between a first grid and the corresponding second grid; the first image corresponds to multiple first grids; the second image corresponds to multiple second grids; in the case where the second information represents that the internal textures of the multiple objects in the first image and the second image have changed, the alarm information includes at least one grid identifier; each grid identifier corresponds to a third coefficient greater than the second threshold; the at least one grid identifier is used to locate the moved object.
In an embodiment, the processor 702 is further configured to:
acquire a ninth image and a tenth image of a first area; the first area at least includes the target area; the image acquisition times corresponding to the ninth image and the tenth image are different;
determine at least one second area in the first area according to the ninth image and the tenth image; multiple objects placed in stacked form are present in the second area;
determine a target area from the at least one second area, and crop the ninth image and the tenth image based on the target area to obtain the first image and the second image.
In an embodiment, the processor 702 is further configured to determine the at least one second area in the first area using the ninth image, the tenth image and a third model; the third model is trained using an object detection algorithm.
It should be noted that, for the specific process of the processor 702 performing the above operations, refer to the method embodiments, which will not be repeated here.
Of course, in practical applications, the various components in the electronic device 700 are coupled together through a bus system 704. It can be understood that the bus system 704 is used to implement connection and communication between these components. In addition to a data bus, the bus system 704 also includes a power bus, a control bus and a status signal bus. However, for the sake of clarity, the various buses are all labeled as the bus system 704 in FIG. 7.
The memory 703 in the embodiment of the present application is used to store various types of data to support the operation of the electronic device 700; examples of such data include any computer program for operating on the electronic device 700.
The methods disclosed in the above embodiments of the present application can be applied to the processor 702 or implemented by the processor 702. The processor 702 may be an integrated circuit chip with signal processing capabilities. In an implementation process, each step of the above method can be completed by an integrated logic circuit of hardware in the processor 702 or by instructions in the form of software. The above processor 702 may be a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 702 can implement or execute each method, step and logical block diagram disclosed in the embodiments of this application. A general-purpose processor may be a microprocessor, any conventional processor, or the like. The steps of the methods disclosed in the embodiments of this application can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium, the storage medium being located in the memory 703; the processor 702 reads the information in the memory 703 and completes the steps of the foregoing method in combination with its hardware.
In an exemplary embodiment, the electronic device 700 may be implemented by one or more Application Specific Integrated Circuits (ASIC), DSPs, Programmable Logic Devices (PLD), Complex Programmable Logic Devices (CPLD), Field-Programmable Gate Arrays (FPGA), general-purpose processors, controllers, microcontrollers (MCU, Micro Controller Unit), microprocessors, or other electronic components, and is used to execute the aforementioned methods.
It can be understood that the memory 703 in the embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories. The non-volatile memory can be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a ferromagnetic random access memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory can be a magnetic disk memory or a tape memory. The volatile memory may be a Random Access Memory (RAM), which is used as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDRSDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memories described in the embodiments of this application are intended to include, but are not limited to, these and any other suitable types of memory.
In an exemplary embodiment, an embodiment of the present application also provides a storage medium, that is, a computer storage medium, specifically a computer-readable storage medium, for example a memory 703 storing a computer program; the above computer program can be executed by the processor 702 of the electronic device 700 to complete the steps described in the foregoing method. The computer-readable storage medium can be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface memory, optical disc, or CD-ROM.
It should be noted that the terms "first", "second" and the like are used to distinguish between similar objects, and are not necessarily used to describe a specific order or sequence.
In addition, the technical solutions described in the embodiments of the present application can be combined arbitrarily as long as there is no conflict.
The above descriptions are only preferred embodiments of the present application and are not intended to limit the protection scope of the present application.

Claims (18)

  1. An image processing method, comprising:
    acquiring a first image and a second image of a target area; multiple objects placed in stacked form are present in the target area; the image acquisition times corresponding to the first image and the second image are different;
    determining first information and second information according to the first image and the second image; the first information represents changes in the outer contours of the multiple objects in the first image and the second image; the second information represents changes in the internal textures of the multiple objects in the first image and the second image;
    determining, according to the first information and the second information, whether the multiple objects have been moved.
  2. The method according to claim 1, wherein the determining the first information according to the first image and the second image comprises:
    binarizing the first image using a first model to obtain a third image, and binarizing the second image using the first model to obtain a fourth image; the first model is trained using a semantic segmentation algorithm; the pixel values of the pixels corresponding to the multiple objects in the third image and the fourth image are a first value, and the pixel values of the pixels other than those corresponding to the multiple objects are a second value;
    determining the first information by comparing the third image and the fourth image.
  3. The method according to claim 2, wherein the determining the first information by comparing the third image and the fourth image comprises:
    determining multiple first coefficients according to the third image and the fourth image; each first coefficient represents whether the pixel value of a pixel in the third image is the same as its pixel value in the fourth image;
    determining the first information using the multiple first coefficients.
  4. The method according to claim 3, wherein the determining the first information using the multiple first coefficients comprises:
    determining a second coefficient using the multiple first coefficients; the second coefficient represents the degree of matching between the third image and the fourth image;
    judging whether the second coefficient is greater than a first threshold; in the case where the second coefficient is greater than the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have changed; or, in the case where the second coefficient is less than or equal to the first threshold, the first information represents that the outer contours of the multiple objects in the first image and the second image have not changed.
  5. The method according to claim 4, further comprising:
    determining a first probability, a second probability, a third probability and a fourth probability of the first model; the first probability represents the probability that the first model recognizes an object in an input image as an object; the second probability represents the probability that the first model recognizes an object in an input image as a non-object; the third probability represents the probability that the first model recognizes a non-object in an input image as an object; the fourth probability represents the probability that the first model recognizes a non-object in an input image as a non-object;
    determining the first threshold using the first probability, the second probability, the third probability and the fourth probability.
  6. The method according to claim 1, wherein the determining the second information according to the first image and the second image comprises:
    binarizing the first image using a second model to obtain a fifth image, and binarizing the second image using the second model to obtain a sixth image; the second model is trained using an edge detection algorithm; the pixel values of the pixels corresponding to edges in the fifth image and the sixth image are the first value, and the pixel values of the pixels corresponding to non-edges are the second value;
    determining the second information using at least the fifth image and the sixth image.
  7. The method according to claim 6, wherein the determining the second information using at least the fifth image and the sixth image comprises:
    binarizing the first image using a first model to obtain a third image, and binarizing the second image using the first model to obtain a fourth image; the first model is trained using a semantic segmentation algorithm; the pixel values of the pixels corresponding to the multiple objects in the third image and the fourth image are the first value, and the pixel values of the other pixels are the second value;
    determining the second information using the third image, the fourth image, the fifth image and the sixth image.
  8. The method according to claim 7, wherein the determining the second information using the third image, the fourth image, the fifth image and the sixth image comprises:
    multiplying the third image and the fifth image element-wise to obtain a seventh image, and multiplying the fourth image and the sixth image element-wise to obtain an eighth image;
    determining the second information by comparing the seventh image and the eighth image.
  9. The method according to claim 8, wherein the determining the second information by comparing the seventh image and the eighth image comprises:
    dividing the seventh image based on a preset rule to obtain multiple first grids, and dividing the eighth image based on the preset rule to obtain multiple second grids;
    determining multiple third coefficients according to the multiple first grids and the multiple second grids; each third coefficient represents the degree of matching between a first grid and the corresponding second grid;
    determining the second information using the multiple third coefficients.
  10. The method according to claim 9, wherein the determining the second information using the multiple third coefficients comprises:
    judging whether each third coefficient is greater than a second threshold;
    in the case where at least one third coefficient is greater than the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have changed; or, in the case where every third coefficient is less than or equal to the second threshold, the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed.
  11. The method according to claim 10, further comprising:
    determining a fifth probability, a sixth probability, a seventh probability and an eighth probability of the second model; the fifth probability represents the probability that the second model recognizes an edge in an input image as an edge; the sixth probability represents the probability that the second model recognizes an edge in an input image as a non-edge; the seventh probability represents the probability that the second model recognizes a non-edge in an input image as an edge; the eighth probability represents the probability that the second model recognizes a non-edge in an input image as a non-edge;
    determining the second threshold using the fifth probability, the sixth probability, the seventh probability and the eighth probability.
  12. The method according to any one of claims 1 to 11, wherein the determining whether the multiple objects have been moved according to the first information and the second information comprises:
    in the case where the first information represents that the outer contours of the multiple objects in the first image and the second image have changed, and/or in the case where the second information represents that the internal textures of the multiple objects in the first image and the second image have changed, determining that at least one object among the multiple objects has been moved;
    or,
    in the case where the first information represents that the outer contours of the multiple objects in the first image and the second image have not changed, and the second information represents that the internal textures of the multiple objects in the first image and the second image have not changed, determining that the multiple objects have not been moved.
  13. The method according to claim 12, further comprising:
    sending alarm information when it is determined that at least one object among the multiple objects has been moved; where
    the second information is determined using multiple third coefficients; each third coefficient represents the degree of matching between a first grid and the corresponding second grid; the first image corresponds to multiple first grids; the second image corresponds to multiple second grids; in the case where the second information represents that the internal textures of the multiple objects in the first image and the second image have changed, the alarm information includes at least one grid identifier; each grid identifier corresponds to a third coefficient greater than the second threshold; the at least one grid identifier is used to locate the moved object.
  14. The method according to any one of claims 1 to 11, wherein the acquiring the first image and the second image of the target area comprises:
    acquiring a ninth image and a tenth image of a first area; the first area at least includes the target area; the image acquisition times corresponding to the ninth image and the tenth image are different;
    determining at least one second area in the first area according to the ninth image and the tenth image; multiple objects placed in stacked form are present in the second area;
    determining a target area from the at least one second area, and cropping the ninth image and the tenth image based on the target area to obtain the first image and the second image.
  15. The method according to claim 14, wherein the determining at least one second area in the first area according to the ninth image and the tenth image comprises:
    determining the at least one second area in the first area using the ninth image, the tenth image and a third model; the third model is trained using an object detection algorithm.
  16. An image processing apparatus, comprising:
    a first processing unit configured to acquire a first image and a second image of a target area; multiple objects placed in stacked form are present in the target area; the image acquisition times corresponding to the first image and the second image are different;
    a second processing unit configured to determine first information and second information according to the first image and the second image; the first information represents changes in the outer contours of the multiple objects in the first image and the second image; the second information represents changes in the internal textures of the multiple objects in the first image and the second image;
    a third processing unit configured to determine, according to the first information and the second information, whether the multiple objects have been moved.
  17. An electronic device, comprising a processor and a memory for storing a computer program capable of running on the processor,
    wherein the processor is configured to execute the steps of the method according to any one of claims 1 to 15 when running the computer program.
  18. A storage medium on which a computer program is stored, wherein when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 15 are implemented.
PCT/CN2023/073710 2022-04-02 2023-01-29 Image processing method and apparatus, electronic device, and storage medium WO2023185234A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210350472.4 2022-04-02
CN202210350472.4A CN114708291A (zh) 2022-04-02 2022-04-02 Image processing method and apparatus, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023185234A1 true WO2023185234A1 (zh) 2023-10-05

Family

ID=82173617

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/073710 WO2023185234A1 (zh) 2022-04-02 2023-01-29 图像处理方法、装置、电子设备及存储介质

Country Status (2)

Country Link
CN (1) CN114708291A (zh)
WO (1) WO2023185234A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114708291A (zh) 2022-04-02 2022-07-05 北京京东乾石科技有限公司 Image processing method and apparatus, electronic device, and storage medium
CN115239692B (zh) * 2022-08-12 2023-06-27 广东科学技术职业学院 Electronic component detection method and system based on image recognition technology

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9465994B1 (en) * 2015-02-23 2016-10-11 Amazon Technologies, Inc. Predicting performance and success of large-scale vision algorithms
CN109544631A (zh) * 2019-01-03 2019-03-29 银河航天(北京)科技有限公司 Detection system and method for the operating state of goods conveying equipment
CN109961101A (zh) * 2019-03-29 2019-07-02 京东方科技集团股份有限公司 Shelf state determination method and apparatus, electronic device, and storage medium
CN111369529A (zh) * 2020-03-04 2020-07-03 厦门脉视数字技术有限公司 Article loss and leftover detection method and system
CN113052838A (zh) * 2021-04-26 2021-06-29 拉扎斯网络科技(上海)有限公司 Object placement detection method and apparatus, and intelligent cabinet
CN114708291A (zh) * 2022-04-02 2022-07-05 北京京东乾石科技有限公司 Image processing method and apparatus, electronic device, and storage medium


Also Published As

Publication number Publication date
CN114708291A (zh) 2022-07-05

Similar Documents

Publication Publication Date Title
WO2023185234A1 (zh) Image processing method and apparatus, electronic device, and storage medium
CN110414507B (zh) 车牌识别方法、装置、计算机设备和存储介质
Papazov et al. An efficient ransac for 3d object recognition in noisy and occluded scenes
US20200019760A1 (en) Three-dimensional living-body face detection method, face authentication recognition method, and apparatuses
CN105809651B (zh) 基于边缘非相似性对比的图像显著性检测方法
CN108986152B (zh) 一种基于差分图像的异物检测方法及装置
CN112102340B (zh) 图像处理方法、装置、电子设备和计算机可读存储介质
CN111626295B (zh) 车牌检测模型的训练方法和装置
CN110415208A (zh) 一种自适应目标检测方法及其装置、设备、存储介质
CN112336342B (zh) 手部关键点检测方法、装置及终端设备
CN112651953A (zh) 图片相似度计算方法、装置、计算机设备及存储介质
CN115631112B (zh) 一种基于深度学习的建筑轮廓矫正方法及装置
CN112991349B (zh) 图像处理方法、装置、设备和存储介质
CN110288040B (zh) 一种基于拓扑验证的图像相似评判方法及设备
CN114155285B (zh) 基于灰度直方图的图像配准方法
CN113228105A (zh) 一种图像处理方法、装置和电子设备
CN110929738A (zh) 证卡边缘检测方法、装置、设备及可读存储介质
CN111354038A (zh) 锚定物检测方法及装置、电子设备及存储介质
CN117372487A (zh) 图像配准方法、装置、计算机设备和存储介质
US7440636B2 (en) Method and apparatus for image processing
US7231086B2 (en) Knowledge-based hierarchical method for detecting regions of interest
CN112287905A (zh) 车辆损伤识别方法、装置、设备及存储介质
CN111680680A (zh) 一种目标码定位方法、装置、电子设备及存储介质
CN112101139B (zh) 人形检测方法、装置、设备及存储介质
CN114004839A (zh) 全景图像的图像分割方法、装置、计算机设备和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23777613

Country of ref document: EP

Kind code of ref document: A1