CN114913233A - Image processing method, apparatus, device, medium, and product - Google Patents

Image processing method, apparatus, device, medium, and product

Info

Publication number
CN114913233A
CN114913233A (application CN202210655684.3A)
Authority
CN
China
Prior art keywords
image
processed
image processing
position information
category
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210655684.3A
Other languages
Chinese (zh)
Inventor
吴新涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Petromentor International Education Beijing Co ltd
Original Assignee
Petromentor International Education Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Petromentor International Education Beijing Co ltd filed Critical Petromentor International Education Beijing Co ltd
Priority to CN202210655684.3A
Publication of CN114913233A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/70: Determining position or orientation of objects or cameras
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; blind source separation
    • G06V 10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the present application provides an image processing method, apparatus, device, medium, and product. The image processing method includes: acquiring an image to be processed in real time; extracting image features of the image to be processed based on a first network of an image processing model, and determining first position information of a target object bounding box and second position information of a ship row bounding box based on the image features; performing classification calculation on the first position information and the second position information respectively through a second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box; and, when the category of the target object bounding box and the category of the ship row bounding box meet preset conditions and the distance between the first position information and the second position information is less than a preset threshold, sending the image to be processed to an alarm platform for the alarm platform to generate alarm information. According to the embodiment of the present application, the situation in which a person on board is not wearing a life jacket while approaching the ship row can be monitored in time.

Description

Image processing method, apparatus, device, medium, and product
Technical Field
The present application relates to the field of object detection technologies, and in particular, to an image processing method, apparatus, device, medium, and product.
Background
In real life, in certain geographic environments, people must travel by ship or rely on ships for certain operations. While traveling or working on a ship, a person who is not wearing a life jacket may approach the ship row and fall into the water. In the prior art, this is addressed by reminding people on board, during navigation or operation, to wear life jackets and to stay away from the ship row. This approach, however, cannot monitor in time whether someone on board is actually near the ship row without a life jacket, and therefore cannot guarantee the safety of the people on board.
Disclosure of Invention
The embodiment of the present application provides an image processing method, apparatus, device, medium, and product that can monitor in time the situation in which a person on a ship is not wearing a life jacket and is approaching the ship row.
In a first aspect, an embodiment of the present application provides an image processing method, where the method includes:
acquiring an image to be processed in real time, wherein the image to be processed includes a ship row image;
extracting image features of the image to be processed based on a first network of an image processing model, and determining first position information of a target object bounding box and second position information of a ship row bounding box based on the image features, wherein the image features include features of the target object and features of the ship row in the image to be processed;
performing classification calculation on the first position information and the second position information respectively through a second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box;
and, when the category of the target object bounding box and the category of the ship row bounding box meet preset conditions and the distance between the first position information and the second position information is less than a preset threshold, sending the image to be processed to an alarm platform for the alarm platform to generate alarm information.
In an optional implementation manner of the first aspect, performing classification calculation on the first position information and the second position information through the second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box includes:
cropping the image to be processed based on the first position information to obtain a target object image, and cropping the image to be processed based on the second position information to obtain a ship row image;
and performing classification calculation on the target object image and the ship row image respectively through the second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box.
In an optional implementation manner of the first aspect, before extracting the image features of the image to be processed based on the first network of the image processing model and determining the first position information of the target object bounding box and the second position information of the ship row bounding box based on the image features, the method further includes:
acquiring a training sample set, wherein the training sample set includes a plurality of image samples to be processed and a first label category and a second label category corresponding to each image sample;
and training a preset image processing model by using the image samples to be processed in the training sample set and the first label category and second label category corresponding to each image sample, to obtain the trained image processing model.
In an optional implementation manner of the first aspect, training the preset image processing model by using the image samples to be processed in the training sample set and the first label category and second label category corresponding to each image sample to obtain the trained image processing model includes:
extracting reference image features from an image sample to be processed based on a first network of the preset image processing model, and determining first reference position information of an object reference bounding box and second reference position information of a ship row reference bounding box based on the reference image features, wherein the reference image features include features of the object and features of the ship row in the image sample to be processed;
performing classification calculation based on the first reference position information and the second reference position information respectively through a second network of the preset image processing model, and determining a reference category of the object reference bounding box and a reference category of the ship row reference bounding box;
determining a loss function value of the preset image processing model according to the first label category of a target image sample to be processed and the reference category of the object reference bounding box, and the second label category of the target image sample to be processed and the reference category of the ship row reference bounding box, wherein the target image sample to be processed is any one of the image samples to be processed;
and training the preset image processing model with the image samples to be processed based on the loss function value of the preset image processing model, to obtain the trained image processing model.
In an optional implementation of the first aspect, before obtaining the training sample set, the method further comprises:
acquiring a plurality of original images, wherein the original images comprise ship row images;
and according to a preset data enhancement strategy, performing data enhancement processing on the plurality of original images to obtain a plurality of to-be-processed image samples corresponding to each original image.
In a second aspect, an embodiment of the present application provides an image processing apparatus, including:
the acquisition module is used for acquiring images to be processed in real time, wherein the images to be processed comprise ship row images;
the determining module is used for extracting image features of the image to be processed based on a first network of the image processing model, and determining first position information of a target object bounding box and second position information of a ship row bounding box based on the image features, wherein the image features include features of the target object and features of the ship row in the image to be processed;
the classification calculation module is used for performing classification calculation on the first position information and the second position information through a second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box;
and the sending module is used for sending the image to be processed to the alarm platform, for the alarm platform to generate alarm information, when the category of the target object bounding box and the category of the ship row bounding box meet preset conditions and the distance between the first position information and the second position information is smaller than a preset threshold.
In an optional implementation manner of the second aspect, the obtaining module is configured to crop the image to be processed based on the first position information to obtain a target object image, and to crop the image to be processed based on the second position information to obtain a ship row image;
and the classification calculation module is configured to perform classification calculation on the target object image and the ship row image through the second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box.
In a third aspect, an electronic device is provided, including: a memory for storing computer program instructions; a processor, configured to read and execute the computer program instructions stored in the memory, so as to execute the image processing method provided in any optional implementation manner of the first aspect.
In a fourth aspect, a computer storage medium is provided, on which computer program instructions are stored, and the computer program instructions, when executed by a processor, implement the image processing method provided in any optional implementation manner of the first aspect.
In a fifth aspect, a computer program product is provided, and instructions in the computer program product, when executed by a processor of an electronic device, cause the electronic device to execute an image processing method provided in any optional implementation manner of the first aspect.
In the embodiment of the present application, after the image to be processed is acquired in real time, the features of the target object and the features of the ship row in the image to be processed are extracted based on the first network of the image processing model, the first position information of the target object bounding box and the second position information of the ship row bounding box are determined from those features, and the first position information and the second position information are then classified through the second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box. The image to be processed can thus be sent to the alarm platform, for the alarm platform to generate alarm information, when the category of the target object bounding box and the category of the ship row bounding box meet the preset conditions and the distance between the first position information and the second position information is smaller than the preset threshold. In this way, the situation in which a person on board is not wearing a life jacket and is approaching the ship row can be monitored in time, thereby ensuring the safety of the people on board.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly described below; those skilled in the art can derive other drawings from these drawings without creative effort.
Fig. 1 is a schematic diagram of a training flow of an image processing model in an image processing method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a training flow of an image processing model in another image processing method provided in an embodiment of the present application;
fig. 3 is a schematic structural diagram of a feature pyramid provided in an embodiment of the present application;
fig. 4 is a schematic structural diagram of a path aggregation feature pyramid according to an embodiment of the present application;
fig. 5 is a schematic flowchart of an image processing method according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Features and exemplary embodiments of various aspects of the present application will be described in detail below, and in order to make objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are intended to be illustrative only and are not intended to be limiting. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present application by illustrating examples thereof.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between those entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The term "and/or" herein is merely an association describing an associated object, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone.
In order to solve the prior-art problem of people falling into the water because they approach the ship row without wearing a life jacket, embodiments of the present application provide an image processing method, apparatus, device, medium, and product. After an image to be processed is acquired in real time, the method extracts the features of the target object and the features of the ship row in the image based on a first network of an image processing model, determines first position information of a target object bounding box and second position information of a ship row bounding box from those features, and then performs classification calculation on the first position information and the second position information through a second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box. The image to be processed can thus be sent to an alarm platform, for the alarm platform to generate alarm information, when the category of the target object bounding box and the category of the ship row bounding box meet preset conditions and the distance between the first position information and the second position information is smaller than a preset threshold. In this way, the situation in which a person on board is not wearing a life jacket and is approaching the ship row can be monitored in time, thereby ensuring the safety of the people on board.
In the image processing method provided by the embodiment of the application, the execution subject can be an image processing device or a control module used for executing the image processing method in the image processing device. In the embodiment of the present application, an image processing method executed by an image processing apparatus is taken as an example to describe an image processing scheme provided in the embodiment of the present application.
In addition, the image processing apparatus in the embodiment of the present application processes the image to be processed with a pre-trained image processing model. Therefore, before the image processing apparatus performs image processing, the image processing model needs to be trained.
Based on this, the image processing method provided by the embodiment of the present application is described in detail below with reference to the accompanying drawings by specific embodiments.
The embodiment of the application provides a training method of an image processing model, an execution subject of the method can be an image processing device, and the method can be specifically realized through the following steps:
firstly, acquiring a training sample set
The training sample set may include a plurality of image samples to be processed and a first label class and a second label class corresponding to each image sample to be processed. Wherein the image sample to be processed may include a ship row image sample, the first label category may be "the object wears the life jacket" or "the object does not wear the life jacket", and the second label category may be "the ship row" or "the non-ship row".
In order to obtain a more accurate training sample and further better train the image processing model, in an embodiment, as shown in fig. 1, the obtaining of the training sample set specifically includes the following steps:
s110, a plurality of to-be-processed image samples are obtained.
Specifically, the image processing apparatus may read, from the database, a plurality of to-be-processed image samples captured by the monitoring device within a preset time period. The preset time period may be a preset time period based on actual needs or experience, for example, the preset time period may be a day or a week, and is not limited specifically herein.
Specifically, using the video around the ship row captured by the monitoring device during the preset time period, the image processing apparatus may obtain one image sample containing the ship row from the video every N frames, in a grouped frame-extraction manner, where N is a positive integer. The monitoring device may be a camera installed on a utility pole or a street lamp on site and may be used to acquire ship row image samples. In addition, the horizontal distance between the monitoring device and the ship row may be kept within a preset distance, for example within 100 meters, which is not limited herein.
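The grouped frame-extraction described above can be sketched with OpenCV as follows. A minimal sketch: the function name and the default value of N are illustrative, since the patent requires only that one frame be sampled every N frames.

```python
import cv2

def extract_samples(video_path: str, every_n: int = 45):
    """Yield one frame every `every_n` frames of a surveillance video.

    `video_path` and `every_n` are illustrative; the patent only requires
    sampling one ship-row image every N frames (N a positive integer).
    """
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            yield frame  # candidate image sample containing the ship row
        idx += 1
    cap.release()
```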
S120, labeling the first label category and the second label category corresponding one-to-one to the plurality of image samples to be processed.
Specifically, the first label category and the second label category corresponding to each to-be-processed image sample may be labeled in a manual labeling manner, or the first label category and the second label category corresponding to each to-be-processed image sample may be directly labeled by the image processing apparatus, and the specific labeling manner is not specifically limited herein.
It should be noted that, during labeling, 75% of the labeled sample data may be used as training samples and 25% as test samples; the specific split between training and test samples is not limited herein.
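A minimal sketch of the 75/25 split described above; the ratio follows the example in the text, while the helper name and fixed seed are illustrative assumptions.

```python
import random

def split_samples(samples: list, train_ratio: float = 0.75, seed: int = 0):
    """Shuffle labeled samples and split them into train/test sets.

    `train_ratio` defaults to the 75/25 split mentioned above; the patent
    leaves the exact ratio open.
    """
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```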
It should be noted that the image processing model needs multiple training iterations to adjust its loss function value until the loss function value satisfies the training-stop condition, at which point the trained image processing model is obtained. In each training iteration, however, if only one image sample to be processed were input, the sample amount would be too small for effective training adjustment. Therefore, the training sample set needs to contain a plurality of image samples to be processed, so that they can be used to iteratively train the image processing model.
In this way, the acquired image samples to be processed can be annotated to obtain the first label category and second label category corresponding one-to-one to the plurality of image samples, and a training sample set containing the plurality of image samples to be processed can then be obtained, which facilitates subsequent model training.
And secondly, training a preset image processing model by using the to-be-processed image samples in the training sample set and the first label class and the second label class corresponding to each to-be-processed image sample to obtain the trained image processing model.
As shown in fig. 2, the step may specifically include the following steps:
s210, extracting reference image features in the image sample to be processed based on a first network of a preset image processing model, and determining first reference position information of the object reference boundary frame and second reference position information of the ship row reference boundary frame based on the reference image features.
The first network may be a YOLOv5 network; the specific network used is not limited herein. In addition, the YOLOv5 network in the embodiment of the present application may use feature maps downsampled by factors of 8, 16, and 32.
The reference image features may include features of the object and features of the ship row in the image sample to be processed. Accordingly, the region framed by the object reference bounding box may contain the object image sample, and the first reference position information may be the position coordinates of the top-left and bottom-right pixel vertices of the object reference bounding box in the image to be processed, which is not limited herein. Similarly, the region framed by the ship row reference bounding box may contain the ship row image sample, and the second reference position information may be the position coordinates of the top-left and bottom-right pixel vertices of the ship row reference bounding box in the image to be processed, which is likewise not limited herein.
Specifically, after the training sample set is obtained, the image processing apparatus may input an image sample to be processed from the training sample set into the first network of the preset image processing model, and determine the first reference position information of the object reference bounding box and the second reference position information of the ship row reference bounding box based on the features of the object and of the ship row extracted by the first network.
In one example, after the image sample to be processed is input into the first network of the preset image model, the first network may output the first reference position information of object reference bounding boxes whose first confidence is greater than a first confidence threshold, and the second reference position information of ship row reference bounding boxes whose second confidence is greater than a second confidence threshold. The first confidence can be regarded as a probability characterizing the reference category of the object reference bounding box; for example, it may be the probability that the reference category of the object reference bounding box is "the object wears a life jacket". The second confidence is analogous to the first and is not further limited herein. The first and second confidence thresholds may be empirically preset and may be the same or different.
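The confidence-threshold filtering described in this example can be sketched as follows. The detection tensor layout (x1, y1, x2, y2, confidence, class id) and the class-id assignment are assumptions for illustration; YOLOv5-style detectors emit per-box confidences in a comparable form.

```python
import torch

def filter_detections(pred: torch.Tensor, person_thresh: float = 0.5,
                      shiprow_thresh: float = 0.5):
    """Split raw detections into person/ship-row boxes above their thresholds.

    Assumes each row of `pred` is (x1, y1, x2, y2, confidence, class_id),
    with class_id 0 = target object (person) and 1 = ship row; the layout
    and the two thresholds are illustrative, not fixed by the patent.
    """
    boxes, conf, cls = pred[:, :4], pred[:, 4], pred[:, 5]
    person_boxes = boxes[(cls == 0) & (conf > person_thresh)]
    shiprow_boxes = boxes[(cls == 1) & (conf > shiprow_thresh)]
    return person_boxes, shiprow_boxes
```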
On this basis, it should be noted that the feature pyramid is a method commonly used in the field of target detection; its structure may be as shown in fig. 3. In this structure, low-level features carry less semantic information but localize targets accurately, while high-level features are semantically rich but localize targets only coarsely. The feature pyramid fuses features of different scales and handles objects of different sizes independently at different feature levels.
On this basis, the present application uses a path aggregation feature pyramid structure, which may be as shown in fig. 4, to exploit multi-scale image features and thereby improve detection performance, especially for small targets. First, bottom-up path enhancement is introduced to shorten the information propagation path while exploiting the accurate localization information of low-level features. Second, dynamic feature pooling is used so that each proposal region draws on features from all pyramid levels, avoiding an arbitrary assignment of proposal regions to levels. Third, fully connected layer fusion is used to improve the model's ability to capture information at different scales.
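As a concrete illustration of the bottom-up path enhancement just described, the following PyTorch sketch fuses top-down feature-pyramid outputs P3 (highest resolution) through P5 into augmented maps N3 through N5. The channel width, layer choices, and naming are illustrative assumptions, not taken from the patent.

```python
import torch.nn as nn

class BottomUpPath(nn.Module):
    """Bottom-up path augmentation over FPN outputs (PANet-style sketch).

    Propagates accurate low-level localization cues upward along a short
    path: N3 = P3, N4 = conv(down(N3) + P4), N5 = conv(down(N4) + P5).
    """
    def __init__(self, channels: int = 256):
        super().__init__()
        self.down = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, stride=2, padding=1)
            for _ in range(2)
        )
        self.fuse = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1) for _ in range(2)
        )

    def forward(self, p3, p4, p5):
        n3 = p3
        n4 = self.fuse[0](self.down[0](n3) + p4)  # shortened info path
        n5 = self.fuse[1](self.down[1](n4) + p5)
        return n3, n4, n5
```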
And S220, performing classification calculation through the second network of the preset image processing model based on the first reference position information and the second reference position information respectively, and determining the reference category of the object reference bounding box and the reference category of the ship row reference bounding box.
The second network may be MobileNetV2 or another classification network; the specific network used is not limited herein. The reference category of the object reference bounding box may be one of the two categories "the object wears the life jacket" and "the object does not wear the life jacket", and the reference category of the ship row reference bounding box may be one of the two categories "ship row" and "non-ship row".
Specifically, after acquiring the first reference position information and the second reference position information, the image processing apparatus may perform classification calculation on them respectively through the second network of the preset image processing model to determine the reference category of the object reference bounding box and the reference category of the ship row reference bounding box.
It should be noted that, because scenes differ greatly, the categories are highly similar, and extremely large targets are difficult to detect, the first network has a certain weakness in classifying targets, which leads to false positives and missed detections. In view of this lack of classification capability in the first network, the present application proposes to use an additional classifier, namely MobileNetV2, to improve classification performance.
Specifically, to address the high category similarity, in the training stage a target region is cropped from the image sample to be processed according to the ground-truth label (the ground truth is the label of the training image, i.e., the ideal model output), and the crop is then transformed to a size of 224 × 224 by zero filling and fed into MobileNetV2 for training. In the inference stage, the object reference bounding box and the ship row reference bounding box are cropped from the image sample to be processed according to the first and second reference position information respectively, and each crop is transformed to 224 × 224 by zero filling to obtain the reference category of the object reference bounding box and the reference category of the ship row reference bounding box.
To address the difficulty of detecting extremely large targets and the high scene variability, the target detection problem is converted into a classification problem, decoupling the localization problem that is hard to complete, so that the scene (ship row and waterside areas) can be judged correctly. In the training stage, the acquired image sample to be processed is scaled until its short side is 224, a 224 × 224 patch is randomly cropped from the scaled sample, whether it depicts a working scene is judged, the labels "ship row" and "non-ship row" are attached, and the patch is fed into MobileNetV2 for training. In the inference stage, if the detection result contains no ship row target, the same scaling and random cropping are performed and the result is fed into MobileNetV2 for scene judgment.
The position information involved here may be the pixel positions, in the image sample to be processed, indicated by the first reference position information and the second reference position information.
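A minimal sketch of the crop-and-zero-fill transform to 224 × 224 described above, using OpenCV and NumPy. The exact procedure (scale the long side to 224, pad the remainder with zeros) is one plausible reading of the "zero filling" transform and is an assumption.

```python
import cv2
import numpy as np

def crop_and_pad_224(image: np.ndarray, box) -> np.ndarray:
    """Crop a bounding box from a color image and zero-pad it to 224x224.

    `box` is (x1, y1, x2, y2) in pixels and is assumed valid/non-empty.
    The crop is scaled so its long side is 224, and the remainder of the
    224x224 canvas is left zero-filled.
    """
    x1, y1, x2, y2 = map(int, box)
    crop = image[y1:y2, x1:x2]
    h, w = crop.shape[:2]
    scale = 224.0 / max(h, w)
    resized = cv2.resize(crop, (max(1, int(w * scale)), max(1, int(h * scale))))
    canvas = np.zeros((224, 224, 3), dtype=image.dtype)
    rh, rw = resized.shape[:2]
    canvas[:rh, :rw] = resized  # zero padding fills the remainder
    return canvas
```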
And S230, determining the loss function value of the preset image processing model according to the first label category of the target image sample to be processed and the reference category of the object reference bounding box, and the second label category of the target image sample to be processed and the reference category of the ship row reference bounding box.
The target image sample to be processed is any one of the image samples to be processed.
Specifically, the image processing apparatus may accurately determine the loss function value of the preset image processing model based on the reference category of the object reference bounding box and the reference category of the ship row reference bounding box obtained from any one of the image samples to be processed, together with the first label category and second label category corresponding to that image sample, so that the preset image processing model can be iteratively trained based on the loss function value to obtain a more accurate image processing model.
S240, training the preset image processing model by using the image sample to be processed based on the loss function value of the preset image processing model to obtain the trained image processing model.
Specifically, in order to obtain a well-trained image processing model, when the loss function value does not meet the training-stop condition, the model parameters of the preset image processing model are adjusted, and the parameter-adjusted model continues to be trained with the image samples to be processed, until the loss function value meets the training-stop condition and the trained image processing model is obtained.
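A generic training-loop skeleton matching the description above: parameters are adjusted while the loss function value does not meet the training-stop condition. The optimizer, learning rate, and `target_loss` value are illustrative assumptions; the patent specifies only the stop-on-loss behavior.

```python
import torch

def train(model, loader, loss_fn, epochs: int = 70, target_loss: float = 0.05):
    """Adjust model parameters until the loss meets the stop condition."""
    opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    for _ in range(epochs):
        running = 0.0
        for images, labels in loader:
            opt.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()   # adjust model parameters via gradients
            opt.step()
            running += loss.item()
        mean_loss = running / max(1, len(loader))
        if mean_loss < target_loss:  # training-stop condition satisfied
            break
    return model
```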
In this embodiment, after the training sample set is obtained, the image processing apparatus may input any image sample to be processed from the training sample set into the first network of the preset image processing model, extract the reference image features through the first network, determine the first reference position information of the object reference bounding box and the second reference position information of the ship row reference bounding box based on those features, and then perform classification calculation on the first and second reference position information through the second network of the preset image processing model to determine the reference category of the object reference bounding box and the reference category of the ship row reference bounding box. On this basis, the loss function value of the preset image processing model can be determined from the first and second label categories corresponding to the image sample, and the preset image processing model can then be trained with the image samples to be processed based on that loss function value until the training-stop condition is met, yielding a more accurate image processing model.
Because the field operation environment is complex (some operation scenes contain only water surfaces and no ship rows) and detection spans long periods of time, the variability and complexity of the data domain pose great challenges to the precision and generalization of a target detection algorithm, and place extremely high demands on scene judgment.
In addition, ship rows are extremely large and frequently occluded targets, and detecting extremely large or occluded targets has always been a difficult and painful point in the field of target detection. In some scenes, the bounding boxes for persons are relatively small, which is a small-target detection problem, likewise a difficult point in the field. Moreover, the two target categories "the object wears the life jacket" and "the object does not wear the life jacket" are highly similar, differing mainly in fine-grained details, so classification errors occur easily.
Based on this, in order to determine the category of the object reference bounding box and the category of the ship row reference bounding box more accurately and facilitate obtaining a more accurate image processing model subsequently, in an embodiment, the above-mentioned S220 may include the following steps:
and clipping the image sample to be processed based on the first reference position information of the object reference boundary frame to obtain an object reference image, and clipping the image sample to be processed based on the second reference position information of the ship-row reference boundary frame to obtain a ship-row reference image.
And respectively carrying out classification calculation on the object reference image and the ship row reference image through a second network of the image processing model to obtain a reference category of the object reference image and a reference category of the ship row reference image.
In this embodiment, the image processing apparatus may obtain the bank reference image by cropping the to-be-processed image sample based on the first reference position information of the object reference bounding box, and crop the to-be-processed image sample based on the second reference position information of the bank reference bounding box. And then, the object reference image and the ship-row reference image can be classified and calculated through a second network of the image processing model, so that the reference category of the object reference image and the reference category of the ship-row reference image are obtained. Therefore, the classification accuracy of the preset image processing model can be improved, and the more accurate image processing model can be obtained conveniently.
In addition, the image characteristics of the image samples acquired by the monitoring equipment change greatly with the weather. Moreover, because the operation scene is complex, it is difficult to judge from target detection alone whether an object in the scene is in danger of falling into the water. Finally, the distortion caused by the fish-eye effect of the monitoring equipment, and the image depth caused by the wide coverage of the camera, make it difficult to accurately judge the distance between a person and the ship row. A lack of diversity in image features is also detrimental to model training.
Based on this, in order to obtain a more accurate image processing model, a large number of diverse image samples to be processed need to be acquired to train the image processing model. Therefore, in an embodiment, before the training sample set is acquired, the above-mentioned image processing method may further include:
acquiring a plurality of original images;
and according to a preset data enhancement strategy, performing data enhancement processing on the plurality of original images to obtain a plurality of to-be-processed image samples corresponding to each original image.
Specifically, the image processing apparatus may obtain a plurality of original images before obtaining the training sample set, and perform data enhancement processing on each original image according to a preset data enhancement policy to obtain a plurality of to-be-processed image samples corresponding to each original image.
Wherein the original images include ship row images. The preset data enhancement policy may be preset based on actual needs or experience and is used to enhance the images. Each preset data enhancement policy may include two data enhancement operations, and each operation may carry the probability of applying it and a magnitude associated with it, so that the optimal combination of policies can be found with reinforcement learning in the search space formed by these operations. Note that a probability of 0 or a magnitude of 0 means the enhancement operation is not applied.
In addition, it should be noted that the preset data enhancement policies referred to in the embodiment of the present application may be five policies selected from enhancement operations such as ShearX/Y, TranslateX/Y, Rotate, AutoContrast, Invert, Equalize, Cutout, Sharpness, and Color, for example TranslateX_BBox paired with Equalize, TranslateY_Only_BBoxes paired with Equalize, and TranslateY_Only_BBoxes paired with Rotate_BBox.
Wherein: TranslateX_BBox: translate the ground-truth box and the original image along X. Equalize: perform histogram equalization on each channel. TranslateY_Only_BBoxes: randomly translate the ground-truth box along Y. Cutout: delete a rectangular region of the image. Sharpness: sharpen the image. ShearX_BBox: shear the image and the ground-truth box along X. Rotate_BBox: rotate the image and the ground-truth box. Color: apply a color transform to the image.
Specific examples can be shown in table 1:
TABLE 1 data enhancement strategy
[Table 1 appears as an image in the original publication; its contents are not recoverable here.]
In addition, the number of samples derived from each original image can also be increased by duplication.
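The policy structure described above, two operations per policy, each with an application probability and a magnitude, can be sketched as follows. The policy encoding and the `ops` lookup table are assumptions; the individual operation implementations (TranslateX_BBox, Equalize, and so on) are taken as given.

```python
import random

def apply_policy(image, boxes, policy, ops):
    """Apply one two-operation enhancement policy to an image and its boxes.

    `policy` is [(op_name, prob, magnitude), (op_name, prob, magnitude)];
    `ops` maps names like "TranslateX_BBox" or "Equalize" to callables
    taking (image, boxes, magnitude). A probability of 0 or a magnitude
    of 0 leaves the image unchanged, as noted above.
    """
    for name, prob, magnitude in policy:
        if magnitude == 0:
            continue  # magnitude 0: the operation is effectively unused
        if random.random() < prob:
            image, boxes = ops[name](image, boxes, magnitude)
    return image, boxes
```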
In this embodiment, the image processing apparatus may acquire a plurality of original images before acquiring the training sample set, and perform data enhancement on each original image according to the preset data enhancement policy to obtain a plurality of image samples to be processed corresponding to each original image. This avoids overly uniform image features and yields a large number of image samples to be processed, facilitating the training of a more accurate image processing model.
Based on the image processing model obtained through training in the foregoing embodiment, the embodiment of the present application further provides a specific implementation of an image processing method, which is specifically described in detail with reference to fig. 5.
And S510, acquiring the image to be processed in real time.
Specifically, the image processing apparatus may take one frame as the image to be processed every N frames of the monitoring video acquired in real time by the monitoring device, where N is a positive integer, for example 45. The monitoring device may be a camera installed on a utility pole or a street lamp near the ship and can acquire images near the ship row. The horizontal distance between the monitoring device and the ship row position may be kept within a preset distance, for example within 100 meters, which is not specifically limited herein.
S520, extracting image features of the image to be processed based on the first network of the image processing model, and determining first position information of the target object bounding box and second position information of the ship row bounding box based on the image features.
The image features include the features of the target object and the features of the ship row in the image to be processed.
The image framed by the target object bounding box includes the target object image. Accordingly, the first position information may be the position coordinates of the top-left and bottom-right pixel vertices of the target object image in the image to be processed, or of its top-right and bottom-left pixel vertices; the specific form may be determined according to the actual situation and is not limited herein.
The image framed by the ship row bounding box includes the ship row image, and the second position information is analogous to the first position information, so it is not described again here.
The image processing device may extract the feature of the target object and the ship row feature in the image to be processed based on the first network of the image processing model after acquiring the image to be processed, and may determine first position information of the target object bounding box and second position information of the ship row bounding box based on the feature of the target object and the ship row feature in the image to be processed.
S530, performing classification calculation on the first position information and the second position information through the second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box.
The category of the target object bounding box may be "the target object wears the life jacket" or "the target object does not wear the life jacket", and the category of the ship row bounding box may be "ship row" or "non-ship row".
Specifically, after obtaining the first position information and the second position information, the image processing apparatus may perform classification calculation on the first position information and the second position information through a second network of the image processing model, respectively, to obtain a category of the target object bounding box and a category of the ship row bounding box.
And S540, when the category of the target object bounding box and the category of the ship row bounding box meet the preset conditions and the distance between the first position information and the second position information is smaller than the preset threshold, sending the image to be processed to the alarm platform for the alarm platform to generate alarm information.
The preset conditions include that the category of the target object bounding box is "the target object does not wear the life jacket" and that the category of the ship row bounding box is "ship row". The preset threshold may be a distance threshold preset based on practical experience or actual needs, which is not limited herein. The alarm information is used to warn the target object of danger.
Specifically, when the category of the target object bounding box is "the target object does not wear the life jacket", the category of the ship row bounding box is "ship row", and the computed distance between the first position information and the second position information is less than the preset threshold, the image processing device may send the image to be processed to the alarm platform so that the alarm platform can generate the corresponding alarm information to warn the target object of the danger.
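The alarm decision of S540 can be sketched as follows. The patent does not fix how the distance between the first and second position information is measured, so the box-center Euclidean distance used here is an illustrative assumption, as are the category strings.

```python
import math

def should_alarm(person_cls: str, shiprow_cls: str,
                 person_box, shiprow_box, dist_thresh: float) -> bool:
    """Decide whether the image should be sent to the alarm platform.

    Boxes are (x1, y1, x2, y2) in pixels; the distance between the first
    and second position information is taken as the distance between the
    two box centers, an illustrative choice.
    """
    if person_cls != "target object not wearing life jacket":
        return False
    if shiprow_cls != "ship row":
        return False
    (x1, y1, x2, y2), (u1, v1, u2, v2) = person_box, shiprow_box
    pc = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    sc = ((u1 + u2) / 2.0, (v1 + v2) / 2.0)
    return math.dist(pc, sc) < dist_thresh
```

With such a helper, S540 reduces to evaluating each detected person and ship row pair and forwarding the frame to the alarm platform whenever the check returns true.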
In the embodiment of the present application, after the image to be processed is acquired in real time, the features of the target object and the features of the ship row in the image to be processed are extracted based on the first network of the image processing model, the first position information of the target object bounding box and the second position information of the ship row bounding box are determined from those features, and the first position information and the second position information are then classified through the second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box. The image to be processed can thus be sent to the alarm platform, for the alarm platform to generate alarm information, when the category of the target object bounding box and the category of the ship row bounding box meet the preset conditions and the distance between the first position information and the second position information is smaller than the preset threshold. In this way, the situation in which a person on board is not wearing a life jacket and is approaching the ship row can be monitored in time, thereby ensuring the safety of the people on board.
In order to obtain the category of the target object bounding box and the category of the ship row bounding box more accurately, in an embodiment, the above step S530 may include the following steps:
cropping the image to be processed based on the first position information to obtain the target object image, and cropping the image to be processed based on the second position information to obtain the ship row image;
and performing classification calculation on the target object image and the ship row image respectively through the second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box.
In the embodiment of the present application, the image processing apparatus may crop the image to be processed based on the obtained first position information to obtain the target object image, and crop the image to be processed based on the second position information to obtain the ship row image. The cropped target object image and ship row image can then be classified through the second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box. In this way, by cropping the image to be processed and processing the cropped images through the image processing model, the categories of the two bounding boxes can be obtained accurately, making it convenient to subsequently judge and monitor in time whether someone on board is near the ship row without a life jacket, thereby ensuring the safety of the people on board.
The image processing method provided by the embodiment of the present application can detect in real time whether personnel close to a ship row are not wearing life jackets; it not only meets management requirements but can also detect potential safety hazards in time and issue early warnings, thereby avoiding accidents. On this basis, a collected scene dataset shot by the user was used for training and testing on an Intel Core i7 CPU with 4 GB of memory and an NVIDIA GeForce RTX 2080 Ti discrete graphics card. Training loaded ImageNet-pretrained weights and ran for 70 iterations; testing covered one to three scenes each in the morning, at noon, in the afternoon, and in the evening, with a sufficient number of test images. The specific results are shown in table 2:
TABLE 2 statistical table of test results
Algorithm type                | Test scenarios | Video frames | Correctly recognized | Incorrectly recognized | Accuracy
Life jacket not worn on board | 1              | 6252         | 6013                 | 239                    | 96.1%
Life jacket not worn on board | 1              | 5874         | 5592                 | 282                    | 95.1%
Life jacket not worn on board | 1              | 6102         | 5846                 | 256                    | 95.8%
Life jacket not worn on board | 1              | 5691         | 5441                 | 250                    | 95.6%
Based on the same inventive concept, the embodiment of the application also provides an image processing device. The image processing apparatus provided in the embodiment of the present application is specifically described with reference to fig. 6.
Fig. 6 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application.
As shown in fig. 6, the image processing apparatus 600 may include: an acquisition module 610, a determination module 620, a classification calculation module 630, and a sending module 640.
The acquiring module 610 is configured to acquire an image to be processed in real time, where the image to be processed includes a ship row image.
And the determining module 620 is configured to extract image features of the image to be processed based on the first network of the image processing model, and determine first position information of the target object bounding box and second position information of the ship row bounding box based on the image features, where the image features include features of the target object in the image to be processed and ship row features.
The classification calculation module 630 is configured to perform classification calculation on the first location information and the second location information through a second network of the image processing model, respectively, to obtain a class of the target object bounding box and a class of the ship row bounding box.
And the sending module 640 is configured to send the image to be processed to the alarm platform, for the alarm platform to generate alarm information, when the category of the target object bounding box and the category of the ship row bounding box meet the preset conditions and the distance between the first position information and the second position information is smaller than the preset threshold.
In one embodiment, the obtaining module is further configured to crop the image to be processed based on the first position information to obtain the target object image, and to crop the image to be processed based on the second position information to obtain the ship row image.
And the classification calculation module is configured to perform classification calculation on the target object image and the ship row image through the second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box.
In one embodiment, the acquisition module is further configured to obtain a training sample set before extracting image features of the image to be processed based on the first network of the image processing model and determining the first position information of the target object bounding box and the second position information of the ship row bounding box based on the image features, where the training sample set includes a plurality of image samples to be processed and a first label category and a second label category corresponding to each image sample.
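The structure of such a training sample (one image paired with two label categories) could be represented as below; the field names, file names, and integer label convention are illustrative only:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrainingSample:
    image_path: str    # image sample to be processed
    first_label: int   # label category for the target object bounding box
    second_label: int  # label category for the ship row bounding box


# Hypothetical label conventions: first_label 1 = no life jacket worn,
# second_label 0 = ship row present.
training_set: List[TrainingSample] = [
    TrainingSample("scene_morning_0001.jpg", first_label=1, second_label=0),
    TrainingSample("scene_noon_0042.jpg", first_label=0, second_label=0),
]
```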
The image processing apparatus described above may further include a training module.
The training module is configured to train a preset image processing model by using the image samples to be processed in the training sample set and the first label category and the second label category corresponding to each image sample, to obtain a trained image processing model.
In one embodiment, the training module mentioned above is specifically configured to perform the following steps (an illustrative sketch follows this list):
extracting reference image features from an image sample to be processed based on a first network of the preset image processing model, and determining first reference position information of an object reference bounding box and second reference position information of a ship row reference bounding box based on the reference image features, where the reference image features include features of the object and of the ship row in the image sample to be processed;
performing classification calculation based on the first reference position information and the second reference position information, respectively, through a second network of the preset image processing model, and determining a reference category of the object reference bounding box and a reference category of the ship row reference bounding box;
determining a loss function value of the preset image processing model according to the first label category of a target image sample to be processed and the reference category of the object reference bounding box, and the second label category of the target image sample to be processed and the reference category of the ship row reference bounding box, where the target image sample to be processed is any one of the image samples to be processed;
and training the preset image processing model with the image samples to be processed based on the loss function value of the preset image processing model, to obtain the trained image processing model.
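One plausible reading of this loss is a classification loss for the object reference bounding box against the first label plus one for the ship row reference bounding box against the second label. The PyTorch sketch below is only that reading; the model interface, the use of cross-entropy, and the optimizer are assumptions, not details from the application:

```python
import torch.nn as nn


def training_step(model, optimizer, images, first_labels, second_labels):
    """One training iteration with a combined two-category loss.

    Assumes model(images) returns (object_logits, shiprow_logits): the
    second network's class scores for the two reference bounding boxes.
    """
    criterion = nn.CrossEntropyLoss()
    object_logits, shiprow_logits = model(images)
    # Loss = object category vs. first label category
    #      + ship row category vs. second label category.
    loss = (criterion(object_logits, first_labels)
            + criterion(shiprow_logits, second_labels))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```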
In one embodiment, the acquisition module is further configured to acquire a plurality of original images before the training sample set is obtained, where the original images include ship row images.
The image processing apparatus described above may further include a data enhancement module.
The data enhancement module is configured to perform data enhancement processing on the plurality of original images according to a preset data enhancement strategy, to obtain a plurality of image samples to be processed corresponding to each original image.
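A preset enhancement strategy of this kind (several to-be-processed samples derived from each original image) might look like the following sketch; the particular transforms, a horizontal flip and two brightness shifts, are examples rather than the strategy actually used:

```python
import numpy as np


def enhance(image: np.ndarray) -> list[np.ndarray]:
    """Derive several augmented samples from one original H x W x C image."""
    samples = [image]
    samples.append(image[:, ::-1].copy())  # horizontal flip
    for shift in (30, -30):                # brightness up and down
        shifted = np.clip(image.astype(np.int16) + shift, 0, 255)
        samples.append(shifted.astype(np.uint8))
    return samples
```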
In the embodiment of the present application, the image processing apparatus may crop the image to be processed based on the obtained first position information to obtain the target object image, and may crop the image to be processed based on the second position information to obtain the ship row image. The cropped target object image and ship row image can then be classified by the second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box. In this way, the cropped images are processed by the image processing model so that the two categories are obtained accurately, which makes it convenient to subsequently determine in time whether personnel on the ship who are not wearing life jackets are close to the ship row, thereby safeguarding the personnel on board.
Each module in the image processing apparatus provided in the embodiment of the present application may implement the method steps in the embodiments shown in fig. 1, fig. 2, or fig. 5, and achieve the corresponding technical effects, and for brevity, no further description is given here.
Fig. 7 shows a hardware structure diagram of an electronic device provided in an embodiment of the present application.
The electronic device may include a processor 701 and a memory 702 in which computer program instructions are stored.
Specifically, the processor 701 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
The memory 702 may include mass storage for data or instructions. By way of example and not limitation, the memory 702 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. The memory 702 may include removable or non-removable (or fixed) media, where appropriate. The memory 702 may be internal or external to the electronic device, where appropriate. In a particular embodiment, the memory 702 is non-volatile solid-state memory.
The memory may include Read Only Memory (ROM), Random Access Memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, electrical, optical, or other physical/tangible memory storage devices. Thus, in general, the memory includes one or more tangible (non-transitory) computer-readable storage media (e.g., memory devices) encoded with software comprising computer-executable instructions and when the software is executed (e.g., by one or more processors), it is operable to perform operations described with reference to the methods according to an aspect of the present disclosure.
The processor 701 implements any one of the image processing methods in the above embodiments by reading and executing the computer program instructions stored in the memory 702.
In one example, the electronic device may also include a communication interface 703 and a bus 710. As shown in fig. 7, the processor 701, the memory 702, and the communication interface 703 are connected by a bus 710 to complete mutual communication.
The communication interface 703 is mainly used for implementing communication between modules, apparatuses, units and/or devices in this embodiment of the application.
Bus 710 includes hardware, software, or both, coupling the components of the electronic device to each other. By way of example and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a low pin count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), another suitable bus, or a combination of two or more of these. Bus 710 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
In addition, in combination with the image processing method in the foregoing embodiments, an embodiment of the present application may be implemented by providing a computer storage medium. The computer storage medium has computer program instructions stored thereon; when the computer program instructions are executed by a processor, they implement the image processing method provided by the embodiments of the present application.
An embodiment of the present application further provides a computer program product. When instructions in the computer program product are executed by a processor of an electronic device, the electronic device performs the image processing method provided in the embodiments of the present application.
It is to be understood that the present application is not limited to the particular arrangements and instrumentality described above and shown in the attached drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present application are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications, and additions or change the order between the steps after comprehending the spirit of the present application.
The functional blocks shown in the above structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the present application are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include electronic circuits, semiconductor memory devices, ROM, flash memory, Erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.
It should also be noted that the exemplary embodiments mentioned in this application describe some methods or systems based on a series of steps or devices. However, the present application is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
Aspects of the present disclosure are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable image processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable image processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware for performing the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As will be apparent to those skilled in the art, for convenience and brevity of description, the specific working processes of the systems, modules and units described above may refer to corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions should be covered within the scope of the present application.

Claims (10)

1. An image processing method, characterized in that the method comprises:
acquiring an image to be processed in real time, wherein the image to be processed comprises a ship row image;
extracting image features of the image to be processed based on a first network of an image processing model, and determining first position information of a target object bounding box and second position information of a ship row bounding box based on the image features, wherein the image features comprise features of a target object in the image to be processed and features of a ship row;
respectively carrying out classification calculation on the first position information and the second position information through a second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box;
and sending the image to be processed to an alarm platform for the alarm platform to generate alarm information under the condition that the category of the target object bounding box and the category of the ship row bounding box meet preset conditions and the distance between the first position information and the second position information is smaller than a preset threshold value.
2. The method of claim 1, wherein the performing classification calculation on the first position information and the second position information through the second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box comprises:
cutting the image to be processed based on the first position information to obtain a target object image, and cutting the image to be processed based on the second position information to obtain a ship row image;
and respectively carrying out classification calculation on the target object image and the ship row image through the second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box.
3. The method of claim 1, wherein before the extracting image features of the image to be processed based on the first network of the image processing model and determining the first position information of the target object bounding box and the second position information of the ship row bounding box based on the image features, the method further comprises:
acquiring a training sample set, wherein the training sample set comprises a plurality of image samples to be processed and a first label category and a second label category corresponding to each image sample;
and training a preset image processing model by using the image samples to be processed in the training sample set and the first label category and the second label category corresponding to each image sample to be processed, to obtain a trained image processing model.
4. The method according to claim 3, wherein the training of a preset image processing model by using the image samples to be processed in the training sample set and the first label category and the second label category corresponding to each image sample, to obtain the trained image processing model, comprises:
extracting reference image features in the image sample to be processed based on a first network of a preset image processing model, and determining first reference position information of an object reference bounding box and second reference position information of a ship row reference bounding box based on the reference image features, wherein the reference image features comprise features of an object in the image sample to be processed and features of a ship row;
performing classification calculation based on the first reference position information and the second reference position information respectively through a second network of the preset image processing model, and determining a reference category of the object reference bounding box and a reference category of the ship row reference bounding box;
determining a loss function value of a preset image processing model according to a first label category of a target image sample to be processed and a reference category of the object reference bounding box, and a second label category of the target image sample to be processed and a reference category of the ship row reference bounding box, wherein the target image sample to be processed is any one of the image samples to be processed;
and training the preset image processing model by using the image sample to be processed based on the loss function value of the preset image processing model to obtain the trained image processing model.
5. The method of claim 3 or 4, wherein prior to obtaining the training sample set, the method further comprises:
acquiring a plurality of original images, wherein the original images comprise ship row images;
and according to a preset data enhancement strategy, performing data enhancement processing on the plurality of original images to obtain a plurality of to-be-processed image samples corresponding to each original image.
6. An image processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring images to be processed in real time, wherein the images to be processed comprise ship row images;
the determining module is used for extracting image features of the image to be processed based on a first network of an image processing model, and determining first position information of a target object bounding box and second position information of a ship row bounding box based on the image features, wherein the image features comprise features of a target object in the image to be processed and features of a ship row;
the classification calculation module is used for performing classification calculation on the first position information and the second position information through a second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box;
and the sending module is used for sending the image to be processed to an alarm platform under the condition that the category of the target object bounding box and the category of the ship row bounding box meet preset conditions and the distance between the first position information and the second position information is smaller than a preset threshold value, so that the alarm platform generates alarm information.
7. The apparatus of claim 6,
the acquisition module is used for cutting the image to be processed based on the first position information to acquire a target object image, and cutting the image to be processed based on the second position information to acquire a ship row image;
and the classification calculation module is used for performing classification calculation on the target object image and the ship row image through the second network of the image processing model to obtain the category of the target object bounding box and the category of the ship row bounding box.
8. An electronic device, characterized in that the device comprises: a processor and a memory storing computer program instructions;
the processor reads and executes the computer program instructions to implement the image processing method of any one of claims 1 to 5.
9. A computer storage medium having computer program instructions stored thereon which, when executed by a processor, implement the image processing method of any one of claims 1 to 5.
10. A computer program product, wherein instructions in the computer program product, when executed by a processor of an electronic device, cause the electronic device to perform the image processing method of any one of claims 1-5.
CN202210655684.3A 2022-06-10 2022-06-10 Image processing method, apparatus, device, medium, and product Pending CN114913233A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210655684.3A CN114913233A (en) 2022-06-10 2022-06-10 Image processing method, apparatus, device, medium, and product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210655684.3A CN114913233A (en) 2022-06-10 2022-06-10 Image processing method, apparatus, device, medium, and product

Publications (1)

Publication Number Publication Date
CN114913233A true CN114913233A (en) 2022-08-16

Family

ID=82770476

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210655684.3A Pending CN114913233A (en) 2022-06-10 2022-06-10 Image processing method, apparatus, device, medium, and product

Country Status (1)

Country Link
CN (1) CN114913233A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116363688A (en) * 2023-03-23 2023-06-30 嘉洋智慧安全科技(北京)股份有限公司 Image processing method, device, equipment, medium and product

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110063163A1 (en) * 2009-09-11 2011-03-17 Furuno Electric Co., Ltd. Image processing device, radar apparatus equipped with the same, method of processing image, and image processing program
US20150243032A1 (en) * 2014-02-26 2015-08-27 Raytheon Company False alarm rejection for boat detection candidates
CN105095869A (en) * 2015-07-24 2015-11-25 深圳市佳信捷技术股份有限公司 Pedestrian detection method and apparatus
CN109902624A (en) * 2019-02-27 2019-06-18 百度在线网络技术(北京)有限公司 The method and apparatus of information for rendering
CN111540173A (en) * 2020-04-22 2020-08-14 上海振华重工电气有限公司 Drowning alarm system and method based on intelligent picture recognition
US20200307661A1 (en) * 2016-10-20 2020-10-01 Rail Vision Ltd System and method for object and obstacle detection and classification in collision avoidance of railway applications
CN114494348A (en) * 2022-01-26 2022-05-13 上海大学 Autonomous target detection and tracking method for marine life-saving device
CN114529864A (en) * 2021-12-30 2022-05-24 东莞先知大数据有限公司 Method and device for detecting shoreside smuggling behavior and storage medium

Similar Documents

Publication Publication Date Title
Siriborvornratanakul An automatic road distress visual inspection system using an onboard in‐car camera
CN109460754B (en) A kind of water surface foreign matter detecting method, device, equipment and storage medium
CN107742099A (en) A kind of crowd density estimation based on full convolutional network, the method for demographics
CN105260749B (en) Real-time target detection method based on direction gradient binary pattern and soft cascade SVM
CN110689043A (en) Vehicle fine granularity identification method and device based on multiple attention mechanism
CN113642474A (en) Hazardous area personnel monitoring method based on YOLOV5
CN110688893A (en) Detection method for wearing safety helmet, model training method and related device
CN103077407A (en) Car logo positioning and recognition method and car logo positioning and recognition system
CN115880536B (en) Data processing method, training method, target object detection method and device
CN114782897A (en) Dangerous behavior detection method and system based on machine vision and deep learning
Azad et al. New method for optimization of license plate recognition system with use of edge detection and connected component
CN105303153A (en) Vehicle license plate identification method and apparatus
CN112801227B (en) Typhoon identification model generation method, device, equipment and storage medium
CN112364778A (en) Power plant safety behavior information automatic detection method based on deep learning
CN111008574A (en) Key person track analysis method based on body shape recognition technology
CN114913233A (en) Image processing method, apparatus, device, medium, and product
CN106548195A (en) A kind of object detection method based on modified model HOG ULBP feature operators
CN117037081A (en) Traffic monitoring method, device, equipment and medium based on machine learning
Ballinas-Hernández et al. Marked and unmarked speed bump detection for autonomous vehicles using stereo vision
CN114708544A (en) Intelligent violation monitoring helmet based on edge calculation and monitoring method thereof
CN107993446A (en) A kind of traffic prohibition parking area domain parking offense monitoring device
Jiang et al. Fast Traffic Accident Identification Method Based on SSD Model
CN114913232B (en) Image processing method, device, equipment, medium and product
CN113936300A (en) Construction site personnel identification method, readable storage medium and electronic device
CN113111732B (en) High-speed service area dense pedestrian detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 1707, Building 2, East Ring Road, Yanqingyuan, Zhongguancun, Yanqing District, Beijing, 102199

Applicant after: Jiayang Smart Security Technology (Beijing) Co.,Ltd.

Address before: Room 1707, Building 2, East Ring Road, Yanqingyuan, Zhongguancun, Yanqing District, Beijing, 102199

Applicant before: PETROMENTOR INTERNATIONAL EDUCATION (BEIJING) CO.,LTD.
