CN115482458A - Image processing method, device, equipment and storage medium


Info

Publication number
CN115482458A
Authority
CN
China
Prior art keywords
image
image block
preset
analyzed
target image
Prior art date
Legal status
Pending
Application number
CN202110594742.1A
Other languages
Chinese (zh)
Inventor
白杰
张光跃
瞿德清
邹帅帅
Current Assignee
Changxin Memory Technologies Inc
Original Assignee
Changxin Memory Technologies Inc
Priority date
Filing date
Publication date
Application filed by Changxin Memory Technologies Inc filed Critical Changxin Memory Technologies Inc
Priority to CN202110594742.1A
Publication of CN115482458A
Status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038Image mosaicing, e.g. composing plane images from plane sub-images

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image processing method, apparatus, device, and storage medium. The method includes: segmenting a captured image frame using grids of various sizes to obtain an image block set; performing object recognition on at least one image block in the image block set to obtain a target image block set, where each target image block in the target image block set contains at least a part of a preset object; splicing at least one target image block based on the position information of each target image block in the image frame to obtain an image to be analyzed containing the preset object; and performing behavior recognition on the image to be analyzed to obtain a recognition result. In embodiments of the application, segmenting the captured image frame with grids of multiple sizes before performing object recognition on the resulting image blocks enables finer and more accurate feature detection while maintaining processing performance, so that the recognition requirements for objects of various sizes are met and the accuracy of behavior recognition is improved.

Description

Image processing method, device, equipment and storage medium
Technical Field
The present application relates to, but is not limited to, the field of computer vision, and in particular to an image processing method, apparatus, device, and storage medium.
Background
With the development of computer vision technology, behavior recognition based on computer vision has come into increasingly wide use. For example, performing behavior recognition on videos or images collected in daily-life scenes (such as traffic intersections, residential communities, and subway lines) and production scenes (such as workshops, construction sites, and high-risk work sites) enables intelligent monitoring of those scenes, safeguarding people's lives, property, and production safety. However, behavior recognition methods in the related art still suffer from insufficient recognition accuracy.
Disclosure of Invention
In view of this, embodiments of the present application provide an image processing method, apparatus, device, and storage medium.
The technical solutions of the embodiments of the present application are implemented as follows:
In one aspect, an embodiment of the present application provides an image processing method, the method including:
segmenting a captured image frame using grids of various sizes to obtain an image block set;
performing object recognition on at least one image block in the image block set to obtain a target image block set, where each target image block in the target image block set contains at least a part of a preset object;
splicing at least one target image block based on the position information of each target image block in the image frame to obtain an image to be analyzed containing the preset object; and
performing behavior recognition on the image to be analyzed to obtain a recognition result.
In another aspect, an embodiment of the present application provides an image processing apparatus, including:
a segmentation module, configured to segment a captured image frame using grids of various sizes to obtain an image block set;
a first recognition module, configured to perform object recognition on at least one image block in the image block set to obtain a target image block set, where each target image block in the target image block set contains at least a part of a preset object;
a splicing module, configured to splice at least one target image block based on the position information of each target image block in the image frame to obtain an image to be analyzed containing the preset object; and
a second recognition module, configured to perform behavior recognition on the image to be analyzed to obtain a recognition result.
In another aspect, an embodiment of the present application provides a computer device, including a memory and a processor, the memory storing a computer program executable on the processor, where the processor implements some or all of the steps of the above method when executing the computer program.
In yet another aspect, an embodiment of the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements some or all of the steps of the above method.
In the embodiments of the present application, the captured image frame is segmented using grids of various sizes to obtain an image block set; object recognition is performed on at least one image block in the image block set to obtain a target image block set, where each target image block contains at least a part of a preset object; at least one target image block is spliced based on the position information of each target image block in the image frame to obtain an image to be analyzed containing the preset object; and behavior recognition is performed on the image to be analyzed to obtain a recognition result. Thus, on the one hand, segmenting the captured image frame with multi-size grids before performing object recognition on the resulting image blocks enables finer and more accurate feature detection while maintaining processing performance, meeting the recognition requirements for objects of various sizes and improving the accuracy of behavior recognition; on the other hand, splicing the at least one target image block based on its position in the image frame yields an image to be analyzed that contains more of, or all of, the preset object, so that more features of the preset object can be exploited during behavior recognition, further improving its accuracy.
Drawings
Fig. 1 is a schematic flow chart illustrating an implementation of an image processing method according to an embodiment of the present application;
fig. 2 is a schematic flow chart illustrating an implementation of an image processing method according to an embodiment of the present application;
fig. 3 is a schematic flow chart illustrating an implementation of an image processing method according to an embodiment of the present application;
fig. 4 is a schematic flow chart illustrating an implementation of an image processing method according to an embodiment of the present application;
fig. 5 is a schematic flow chart illustrating an implementation of an image processing method according to an embodiment of the present application;
fig. 6 is a schematic flow chart illustrating an implementation of an image processing method according to an embodiment of the present application;
fig. 7 is a schematic flow chart illustrating an implementation of an image processing method according to an embodiment of the present application;
fig. 8 is a schematic flow chart illustrating an implementation of an image processing method according to an embodiment of the present application;
fig. 9A is a schematic view of a chemical tank car filling operation area in a semiconductor manufacturing process according to an embodiment of the present disclosure;
fig. 9B is a schematic view of a scenario in which a fully-enclosed fence is disposed outside the catch trench according to an embodiment of the present disclosure;
fig. 9C is a schematic view of a setting scenario of a fully closed fence according to an embodiment of the present application;
fig. 9D is a schematic view of a scene in which a person in an area surrounded by a fence wears a level C chemical protective suit according to an embodiment of the present application;
fig. 9E is a schematic view of a scene where an operator wears a safety belt or a safety helmet according to an embodiment of the present application;
FIG. 9F is a schematic view of a scenario in which a wheel chock is provided for a driving wheel of a tank car according to an embodiment of the present application;
fig. 9G is a schematic view of a scenario where monitoring of personnel is performed in a filling operation according to an embodiment of the present application;
FIG. 9H is a block diagram illustrating an architecture of an image processing system according to an embodiment of the present disclosure;
fig. 9I is a schematic view of a scenario in which a plurality of high-definition webcams are installed in a chemical filling area according to an embodiment of the present disclosure;
fig. 9J is a schematic view of an implementation flow of recognizing a preset object in an image processing method according to an embodiment of the present application;
fig. 9K is a schematic view of an implementation flow of an image processing method according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure;
fig. 11 is a hardware entity diagram of a computer device according to an embodiment of the present disclosure.
Detailed Description
In order to make the purposes, technical solutions, and advantages of the present application clearer, the technical solutions of the present application are described in further detail below with reference to the drawings and embodiments. The described embodiments should not be regarded as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present application.
In the following description, reference is made to "some embodiments", which describe a subset of all possible embodiments; it should be understood that "some embodiments" may refer to the same subset or different subsets of all possible embodiments, and that these embodiments may be combined with one another where no conflict arises.
Where the terms "first", "second", and "third" appear in the following description, they are used merely to distinguish similar objects and do not imply any particular ordering of those objects; it should be understood that, where permissible, "first", "second", and "third" may be interchanged so that the embodiments of the application described herein can be practiced in an order other than the one illustrated or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the present application.
Embodiments of the present application provide an image processing method, which may be executed by a processor of a computer device. The computer device refers to a device with data processing capability, such as a server, a notebook computer, a tablet computer, a desktop computer, a smart television, a set-top box, a mobile device (e.g., a mobile phone, a portable video player, a personal digital assistant, a dedicated messaging device, and a portable game device). Fig. 1 is a schematic implementation flow diagram of an image processing method according to an embodiment of the present application, and as shown in fig. 1, the method includes the following steps S101 to S104:
step S101, carrying out segmentation processing on the collected image frames by adopting grids of various sizes to obtain an image block set.
Here, the image frame is an image on which behavior recognition is to be performed, and may be, for example, an image frame in a video stream or image sequence captured in a daily-life scene (such as a traffic intersection, a residential community, or a subway line) or a production scene (such as a workshop, a construction site, or a high-risk work site). In practice, a person skilled in the art may acquire suitable image frames according to actual requirements, which is not limited herein.
The grids of various sizes may be preset, or may be determined dynamically according to the size, resolution, and the like of the image frame, which is not limited herein.
The image frame is segmented into a plurality of image blocks, and the image block set includes at least one segmented image block. In implementation, the segmentation may be performed multiple times on the captured image frame using grids of different sizes, and the manner of segmentation is not limited herein.
In some embodiments, a first segmentation may be performed on the captured image frame using a grid of a first size to obtain a plurality of image blocks, a second segmentation may then be performed on those image blocks using a grid of a second size, and so on, until grids of all sizes have been applied, yielding the image block set. In implementation, the grids may be applied in descending order of size, or grids of various sizes may be applied multiple times in a random order, which is not limited herein. In some embodiments, after each segmentation, the resulting image blocks may be recognized and screened to obtain at least one image block satisfying a set condition, and only the image blocks satisfying the set condition are segmented in the next round.
In some embodiments, grids of multiple sizes may each be used independently to segment the captured image frame, and the image block set is obtained from the image blocks produced by each grid size. In some embodiments, the image blocks produced by each grid size may be recognized and screened to obtain at least one image block satisfying a set condition, and the image blocks satisfying the set condition are added to the image block set.
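As a rough illustration of the independent-grid variant just described, the following sketch (Python with NumPy; the helper name and the dictionary block representation are illustrative assumptions, not from the original disclosure) segments one frame with each grid size and records each block's position in the frame for later splicing:

```python
import numpy as np

def segment_with_grids(frame: np.ndarray, grid_sizes: list[int]):
    """Segment a frame with several grid sizes; each block keeps its
    (x, y, w, h) position in the original frame for later splicing."""
    blocks = []
    h, w = frame.shape[:2]
    for size in grid_sizes:          # e.g. [512, 256, 128] pixels per cell
        for y in range(0, h, size):
            for x in range(0, w, size):
                block = frame[y:y + size, x:x + size]
                blocks.append({"image": block,
                               "position": (x, y, block.shape[1], block.shape[0]),
                               "grid": size})
    return blocks

# Example: a 1080p frame segmented with three grid sizes.
image_block_set = segment_with_grids(np.zeros((1080, 1920, 3), np.uint8),
                                     [512, 256, 128])
```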
Step S102, performing object recognition on at least one image block in the image block set to obtain a target image block set; each target image block in the target image block set contains at least a part of a preset object.
Here, the preset object is an object, set in advance, that carries features relevant to behavior recognition, and a person skilled in the art may choose appropriate preset objects according to the actual situation, which is not limited herein. For example, when identifying abnormal operation behaviors in the loading and unloading of hazardous chemicals, the preset objects may include any suitable objects with which a violation may occur on the work site, including but not limited to one or more of personnel, chemical protective clothing, safety helmets, safety belts, vehicles, isolation belts, and the like. The target image block contains at least a part of the preset object, that is, the whole or a part of the preset object. For example, where the preset object includes a person, the target image block may contain a complete human body or a part of one (such as a head, shoulder, arm, or lower leg); where the preset object includes chemical protective clothing, the target image block may contain the whole garment or a part of it (such as a sleeve, collar, zipper, pocket, or trouser leg).
By performing object recognition on the image blocks in the image block set, whether the recognized image blocks contain part or whole of the preset object or not can be determined, so that at least one target image block can be determined from the image block set to obtain a target image block set. In practice, any suitable image recognition algorithm may be used to perform object recognition on the image block, and is not limited herein. For example, a target detection algorithm may be used to detect a preset object in an image block to determine whether the image block includes a part or an entirety of the preset object, or a classification algorithm may be used to classify the image block and determine whether the image block is a target image block according to a classification result.
Step S103, based on the position information of each target image block in the image frame, splicing at least one target image block to obtain an image to be analyzed containing the preset object.
Here, since the target image blocks are divided from the captured image frame, each target image block has corresponding position information in the image frame. In practice, the position information corresponding to each image block may be determined during the segmentation process.
The position information of the image block in the image frame may be any suitable information that may characterize the position of the image block in the image frame. In practice, those skilled in the art may use appropriate position information to describe the position of the image block in the image frame according to practical situations, and the description is not limited herein. For example, the position information of the image block in the image frame may be coordinate information of four corners of the image block in the image frame, or may be coordinate information of a center point of the image block.
Based on the position information of each target image block in the image frame, at least one target image block can be spliced, so that adjacent target image blocks can be spliced together to obtain at least one image to be analyzed. In this way, a plurality of parts containing the same preset object can be spliced together, and target image blocks positioned adjacent to each other in the image frame are spliced together, so that the obtained image to be analyzed can contain more parts of the preset object or all the preset object. For example, in the case that the preset object is a person, a plurality of target image blocks including a human body part may be stitched based on the position information in the image frame to obtain an image to be analyzed including a more complete person.
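One plausible reading of this splicing step, sketched below with the same illustrative block representation as above (the grouping of mutually adjacent blocks into separate clusters is omitted for brevity), is to paste each target image block back at its original coordinates on a blank canvas and crop the bounding box that covers them:

```python
import numpy as np

def splice_blocks(target_blocks, frame_shape):
    """Paste target blocks at their original coordinates and crop the
    bounding box covering all of them, yielding one image to be analyzed."""
    canvas = np.zeros(frame_shape, dtype=np.uint8)
    xs, ys, xe, ye = [], [], [], []
    for block in target_blocks:
        x, y, w, h = block["position"]
        canvas[y:y + h, x:x + w] = block["image"]
        xs.append(x); ys.append(y); xe.append(x + w); ye.append(y + h)
    return canvas[min(ys):max(ye), min(xs):max(xe)]
```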
Step S104, performing behavior recognition on the image to be analyzed to obtain a recognition result.
Here, the algorithm for performing behavior recognition on the image to be analyzed and the recognition result of the algorithm may be determined according to an actual application scenario, which is not limited herein. For example, for an abnormal operation behavior monitoring scenario, any suitable behavior recognition algorithm may be used to recognize an abnormal operation behavior in an image to be analyzed, and the obtained recognition result may include, but is not limited to, whether an abnormal operation behavior occurs in the current operation scenario, the type of the abnormal operation behavior, and the like. For another example, in a scene where the dangerous traffic behavior at the intersection is monitored, any suitable behavior recognition algorithm may be used to monitor the dangerous traffic behavior in the image to be analyzed, and the obtained recognition result may include, but is not limited to, whether the dangerous traffic behavior occurs in the current traffic scene, the type of the dangerous traffic behavior, the license plate number of the involved vehicle, and the like.
In some embodiments, a pre-trained behavior recognition model may be used to perform behavior recognition on an image to be analyzed, so as to obtain a recognition result.
In the embodiments of the present application, the captured image frame is segmented using grids of various sizes to obtain an image block set; object recognition is performed on at least one image block in the image block set to obtain a target image block set, where each target image block contains at least a part of a preset object; at least one target image block is spliced based on the position information of each target image block in the image frame to obtain an image to be analyzed containing the preset object; and behavior recognition is performed on the image to be analyzed to obtain a recognition result. Thus, on the one hand, segmenting the captured image frame with multi-size grids before performing object recognition on the resulting image blocks enables finer and more accurate feature detection while maintaining processing performance, meeting the recognition requirements for objects of various sizes and improving the accuracy of behavior recognition; on the other hand, splicing the at least one target image block based on its position in the image frame yields an image to be analyzed that contains more of, or all of, the preset object, so that more features of the preset object can be exploited during behavior recognition, further improving its accuracy.
Embodiments of the present application provide an image processing method, which may be executed by a processor of a computer device. As shown in fig. 2, the method includes:
step S201, acquiring an image frame of the acquired region to be identified.
Here, the region to be recognized may be a region where behavior recognition is required. In practice, the area to be identified may be a work area of a preset work. For example, in the case where the preset operation is a chemical filling operation, the region to be identified may be a chemical filling region; in the case where the preset operation is a chemical unloading operation, the area to be identified may be a chemical unloading area; in the case where the predetermined operation is a workshop production operation, the area to be identified may be a production workshop.
The embodiments of the present application do not limit the manner of obtaining the captured image frame of the region to be identified. In some embodiments, the computer device may capture images of the region to be identified through an image acquisition module to obtain image frames of the region. In some embodiments, at least one camera may be disposed in the region to be identified, and after a camera captures an image frame of the region, it may transmit the image frame to the computer device. In some embodiments, the computer device may also obtain stored image frames of the region to be identified from the Internet, a database, or the like.
Step S202, determining whether a preset job exists in the image frame.
Here, the preset job is a preset job process to be subjected to behavior recognition, and a person skilled in the art may determine the preset job according to an actual scene, which is not limited herein. For example, the preset operation may include a chemical filling operation, a chemical unloading operation, or a shop production operation, etc.
In implementation, whether the preset job exists in the image frame may be determined in an appropriate manner according to the actual scene and the characteristics of the preset job, which is not limited herein. For example, for a chemical filling operation, whether the operation is present in the image frame may be determined by identifying whether a transport vehicle feature is present: if the transport vehicle feature is determined to be present, a chemical filling operation is determined to be present in the image frame; if it is determined to be absent, the operation is determined to be absent. For another example, for a workshop production job, whether the job is present may be determined by identifying whether a production equipment operation feature is present in the image frame: if such a feature is present, the production job is determined to be present, and otherwise absent. For another example, where a signboard indicating the job is set up in the area to be identified while the preset job is performed, whether the preset job exists in the image frame may be determined by identifying whether the corresponding signboard feature is present: if the signboard feature is present, the preset job is determined to be present, and otherwise absent.
Step S203, in a case where it is determined that a preset operation exists in the image frame, segmenting the captured image frame using grids of various sizes to obtain an image block set.
Step S204, performing object recognition on at least one image block in the image block set to obtain a target image block set; each target image block in the target image block set contains at least a part of a preset object.
Step S205, based on the position information of each target image block in the image frame, stitching at least one target image block to obtain an image to be analyzed including the preset object.
Step S206, performing behavior recognition on the image to be analyzed to obtain a recognition result.
Here, steps S204 to S206 correspond to steps S102 to S104; for details, reference may be made to the specific implementations of steps S102 to S104.
In some embodiments, the step S202 may include at least one of the following steps S221 and S222:
step S221, identifying a feature of the operation device in the image frame, and determining that a preset operation exists in the image frame when a preset operation device feature exists in the image frame.
Here, the job device refers to a device required to perform a preset job. For example, for a chemical filling operation or a chemical unloading operation, the working equipment may include a transport vehicle for transporting chemicals, such as a tank car, a tank truck, or the like, and for a shop production operation, the working equipment may include production equipment for performing production of products, such as a machine tool, a cutter, or the like.
The operation device features are features related to the operation device, and the preset operation device features are features that can indicate the presence of the preset job in the image frame; by identifying the preset operation device features, it can be determined whether a corresponding job behavior exists in the area to be identified. For example, where the operation device includes a transport vehicle and the preset job includes a chemical filling operation, the presence of the transport vehicle in the image frame can indicate that a chemical filling operation is present, so the preset operation device features may include the structural features, size, color, motion profile, and the like of the transport vehicle. For another example, where the operation device includes production equipment and the preset job includes a workshop production job, production equipment in the running state can indicate that a workshop production job is present in the image frame, so the preset operation device features may include features that indicate the production equipment is running, such as a running indicator of the equipment or the output rate of products.
In practice, any suitable recognition algorithm may be used to recognize the characteristics of the working device in the image frame and determine whether the preset working device characteristics exist in the image frame, which is not limited herein.
Step S222, identifying the marking features in the image frame, and determining that a preset job exists in the image frame when a preset marking feature exists in the image frame.
Here, the marking feature refers to a related feature of a marking board, a marking banner, an electronic screen, or the like for marking. The preset indicating feature may be any suitable indicating feature capable of indicating that a preset operation is started or is being performed, and is not limited herein. For example, before starting the preset operation, an indicator (such as a signboard, an indicator banner, or the like) indicating that the preset operation is about to be performed or is being performed may be set in the area to be identified, and the preset indicator feature may be a feature indicating that the indicator is present in the image frame.
In practice, any suitable recognition algorithm may be used to recognize the landmark features in the image frame and determine whether the preset landmark features exist in the image frame, which is not limited herein.
In some embodiments, the step S221 may include the following steps S221a to S221d:
step S221a, acquiring a video stream, and capturing an image frame sequence including a plurality of image frames from the video stream.
Here, the video stream may be acquired by an image acquisition module of the computer device in the area to be identified, may also be transmitted to the computer device after being acquired by a camera arranged in the area to be identified, and may also be downloaded from the internet or read from a database by the computer device, which is not limited herein. In practice, the skilled person can obtain the video stream in a suitable manner according to the actual situation.
The image frame sequence may include a plurality of image frames of the region to be identified, and the plurality of image frames may be consecutive image frames in the video stream, or a plurality of image frames with a specific frame interval, which is not limited herein.
And step S221b, identifying the characteristics of the operation equipment in the image frame sequence.
Here, the operating device feature may be recognized in a plurality of image frames of the area to be recognized in the image frame sequence to determine whether there is an image frame including a preset operating device feature in the image frame sequence.
Step S221c, in a case that it is determined that an image frame including a preset operation device feature exists in the image frame sequence, determining a duration of existence of the preset operation device feature in the video stream.
Here, in a case where it is determined that an image frame including a preset work device feature exists in the image frame sequence, a length of time during which the preset work device feature exists in the video stream may be determined by recognizing the work device feature in a plurality of image frames before and after the image frame in the video stream.
Step S221d, determining that a preset job exists in the image frame sequence when the existence duration exceeds a time threshold.
Here, the time threshold may be set by a user, may be a default of the system, and may be dynamically determined according to a type of a preset job, which is not limited herein. For example, an expected operation time period of the preset operation may be determined according to the type of the preset operation, and a time threshold value proportional to the expected operation time period may be determined based on the expected operation time period.
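A minimal sketch of steps S221a to S221d, where `detect_equipment` is a hypothetical stand-in for whatever algorithm recognizes the preset operation device feature in a single frame, and `fps` is the sampling rate of the captured frame sequence:

```python
def job_present(frames, fps, detect_equipment, time_threshold_s=10.0):
    """Return True once the preset operation device feature has persisted
    in the stream longer than the time threshold (steps S221c and S221d)."""
    consecutive = 0                       # consecutive frames with the feature
    for frame in frames:
        consecutive = consecutive + 1 if detect_equipment(frame) else 0
        if consecutive / fps >= time_threshold_s:
            return True
    return False
```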
In the embodiment of the application, under the condition that preset operation exists in the collected image frames of the to-be-identified area, the collected image frames are segmented by adopting grids of various sizes to obtain an image block set. Therefore, only behaviors in the process of the preset operation can be identified, so that unnecessary workload can be reduced, and the utilization rate of computing resources can be improved. In some embodiments, the presence of the preset job in the image frame may be determined simply and quickly by identifying the presence of the preset job device feature and/or the preset logo feature in the image frame. In some embodiments, an image frame sequence including a plurality of image frames may be captured from a video stream, and when it is determined that an image frame including a preset operation device feature exists in the image frame sequence and the existence duration of the preset operation device feature in the video stream exceeds a time threshold, it is determined that a preset operation exists in the image frame sequence, so that the accuracy and the discovery rate of the preset operation identification may be improved, the accuracy and the discovery rate of the behavior identification may be improved, and the utilization rate of the computing resource may be further improved.
Embodiments of the present application provide an image processing method, which may be executed by a processor of a computer device. As shown in fig. 3, the method includes:
step S301, sequentially adopting an ith grid to carry out segmentation processing on at least one (i-1) th target image block according to the sequence of grid sizes from large to small to obtain a plurality of ith image blocks, and carrying out object identification on each ith image block to obtain at least one ith target image block at least comprising a part of a preset object; wherein i is a positive integer smaller than N, the 0 th target image block is an acquired image frame, and each i-th target image block at least comprises a part of the preset object;
here, N is an integer greater than 1, and the acquired image frame may be divided by using N grids of decreasing sizes to obtain an image block set. The ith grid is the ith grid in the N grids with the sizes from large to small.
Step S302, under the condition that i is determined to be equal to N-1, at least one N-1 target image block is subjected to division processing by adopting an Nth grid to obtain a plurality of Nth image blocks.
Step S303, determining the image block set based on a plurality of the nth image blocks.
Here, the plurality of nth image blocks may be directly added to the image block set, or a predetermined number of nth image blocks may be selected from the plurality of nth image blocks and added to the image block set according to actual conditions, which is not limited herein.
Step S304, performing object recognition on at least one image block in the image block set to obtain a target image block set; each target image block in the target image block set contains at least a part of a preset object.
Step S305, based on the position information of each target image block in the image frame, stitching at least one target image block to obtain an image to be analyzed including the preset object.
Step S306, performing behavior recognition on the image to be analyzed to obtain a recognition result.
Here, steps S304 to S306 correspond to steps S102 to S104; for details, reference may be made to the specific implementations of steps S102 to S104.
In some embodiments, N is 3, and the N grids of different sizes include a first grid, a second grid, and a third grid, where the size of the second grid is smaller than the size of the first grid, and the size of the third grid is smaller than the size of the second grid.
In the embodiments of the present application, in descending order of grid size, at least one (i-1)-th target image block is sequentially segmented using the i-th grid to obtain a plurality of i-th image blocks, and object recognition is performed on each i-th image block to obtain at least one i-th target image block containing at least a part of a preset object; when it is determined that i equals N-1, at least one (N-1)-th target image block is segmented using the N-th grid to obtain a plurality of N-th image blocks, and the image block set is determined based on these N-th image blocks. In this way, as grids of successively smaller sizes are applied, a target image block obtained after segmentation by a larger grid can be further segmented by a smaller grid, so that more accurate and detailed object recognition can be performed on the further segmented image blocks, yielding a more accurate target image block set and further improving the accuracy of behavior recognition. In addition, each smaller grid only further segments the target image blocks produced by the larger grid, and finer object recognition is performed only on the image blocks segmented by the smaller grid, which reduces the workload of object recognition and improves image processing performance.
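The coarse-to-fine pipeline of steps S301 to S303 could be sketched as follows; `contains_object` stands in for the object recognition of step S301, and, per step S302, the blocks from the smallest grid are returned unfiltered as the image block set:

```python
import numpy as np

def coarse_to_fine(frame, grid_sizes, contains_object):
    """Only blocks judged to contain (part of) a preset object are
    re-segmented with the next, smaller grid."""
    targets = [{"image": frame, "position": (0, 0, frame.shape[1], frame.shape[0])}]
    for size in sorted(grid_sizes, reverse=True):        # largest grid first
        next_blocks = []
        for t in targets:
            x0, y0, _, _ = t["position"]
            img = t["image"]
            for y in range(0, img.shape[0], size):
                for x in range(0, img.shape[1], size):
                    sub = img[y:y + size, x:x + size]
                    next_blocks.append({"image": sub,
                                        "position": (x0 + x, y0 + y,
                                                     sub.shape[1], sub.shape[0])})
        if size == min(grid_sizes):                      # N-th grid: no filtering
            return next_blocks
        targets = [b for b in next_blocks if contains_object(b["image"])]

# Example with a dummy recognizer that keeps every block.
blocks = coarse_to_fine(np.zeros((1080, 1920, 3), np.uint8), [512, 256, 128],
                        contains_object=lambda img: True)
```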
Embodiments of the present application provide an image processing method, which may be executed by a processor of a computer device. As shown in fig. 4, the method includes:
step S401, the collected image frames are segmented by grids of various sizes to obtain an image block set.
Here, step S401 corresponds to step S101; for details, reference may be made to the specific implementation of step S101.
Step S402, for each image block in the image block set, using a convolutional neural network to identify a preset object in the image block and obtain the probability that the image block contains the preset object.
Here, the convolutional neural network may be pre-trained in any suitable manner. In practice, those skilled in the art can implement the convolutional neural network by using a suitable network structure according to practical situations, and the implementation is not limited herein.
Step S403, determining at least one target image block containing the preset object based on the probability that each image block contains the preset object; each target image block in the target image block set contains at least a part of a preset object.
Here, whether the image block includes the preset object may be determined according to a probability size of the image block including the preset object. In implementation, the image blocks including the preset object and having a probability greater than a set probability threshold may be determined as the target image blocks, or each image block in the image block set may be sorted according to the probability including the preset object from large to small, and the preset number of image blocks sorted in the front may be determined as the target image blocks. A person skilled in the art may determine at least one target image block based on the probability that each image block includes the preset object in a suitable manner according to an actual situation, which is not limited in the embodiment of the present application.
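Both selection strategies mentioned here fit in a few lines; the threshold and top-k values below are placeholders:

```python
def select_target_blocks(blocks, probs, threshold=0.5, top_k=None):
    """Select target blocks by a probability threshold, or take the top-k
    blocks ranked by their probability of containing a preset object."""
    ranked = sorted(zip(blocks, probs), key=lambda bp: bp[1], reverse=True)
    if top_k is not None:
        return [b for b, _ in ranked[:top_k]]
    return [b for b, p in ranked if p > threshold]
```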
Step S404, determining a target image block set based on the at least one target image block.
Here, the at least one target image block may be directly added to the target image block set, or a set number of target image blocks may be selected from the at least one target image block and added to the target image block set according to actual conditions, which is not limited herein.
Step S405, based on the position information of each target image block in the image frame, splicing at least one target image block to obtain an image to be analyzed, wherein the image to be analyzed comprises the preset object.
Step S406, performing behavior recognition on the image to be analyzed to obtain a recognition result.
Here, steps S405 to S406 correspond to steps S103 to S104; for details, reference may be made to the specific implementations of steps S103 to S104.
In some embodiments, there are a plurality of preset objects, and step S402 may include the following steps S421 to S423:
Step S421, performing convolution processing on the image block to obtain an intermediate feature data set.
Here, the image block may be convolved with any suitable convolution layers according to the actual situation to perform feature extraction, so as to obtain an intermediate feature data set. Which intermediate feature data the set contains depends on the convolution layers actually used, which is not limited herein.
Step S422, performing data sampling and convolution processing on the intermediate feature data set to obtain a fully connected feature data set.
Here, any suitable fully connected layers may be used to sample and convolve the features in the intermediate feature data set, which is not limited herein.
Step S423, classifying the features in the fully connected feature data set using an activation function to obtain the probabilities that the image block contains different preset objects.
Here, the activation function may include, but is not limited to, one or more of a sigmoid function, a hyperbolic tangent (tanh) function, a rectified linear unit (ReLU) function, a softmax function, and the like. In implementation, a person skilled in the art may use any suitable activation function according to the actual situation to classify the features in the fully connected feature data set and obtain the probabilities that the image block contains different preset objects, which is not limited herein.
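A minimal PyTorch sketch of steps S421 to S423 follows; the layer sizes and the set of preset object classes are assumptions for illustration, and softmax is used as the activation as in this step (a sigmoid would instead suit blocks that may contain several preset objects at once):

```python
import torch
import torch.nn as nn

class BlockClassifier(nn.Module):
    """Convolutions extract intermediate features; pooling and a fully
    connected layer sample and combine them; softmax yields per-object
    probabilities (steps S421 to S423)."""
    def __init__(self, num_objects: int = 6):   # assumed number of preset objects
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(32, num_objects)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(self.features(x)).flatten(1)   # fully connected feature set
        return torch.softmax(self.fc(x), dim=1)      # activation -> probabilities

# Example: probabilities for one 128x128 RGB image block.
probs = BlockClassifier()(torch.rand(1, 3, 128, 128))
```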
In the embodiment of the application, for each image block in the image block set, a convolutional neural network is used for identifying a preset object in the image block to obtain the probability that the image block contains the preset object, then at least one target image block containing the preset object is determined based on the probability that each image block contains the preset object, and then the target image block set is determined based on the at least one target image block. Therefore, the object identification can be simply and quickly carried out on at least one image block in the image block set to obtain the target image block set.
Embodiments of the present application provide an image processing method, which may be executed by a processor of a computer device. As shown in fig. 5, the method includes:
step S501, a grid with various sizes is adopted to divide collected image frames to obtain an image block set.
Step S502, performing object recognition on at least one image block in the image block set to obtain a target image block set; each target image block in the target image block set contains at least a part of a preset object.
Here, steps S501 to S502 correspond to steps S101 to S102; for details, reference may be made to the specific implementations of steps S101 to S102.
Step S503, determining a type of each target image block based on a type of a preset object included in each target image block.
Here, the type of the preset object included in the target image block may be determined by performing object recognition on the image block. The type of the target image block may represent a type of a preset object included in the target image block, and a plurality of target image blocks including different types of preset objects may have the same type or different types, which is not limited herein. In some embodiments, target image blocks containing preset objects of the same or related type may be determined to be of the same type.
In implementation, a person skilled in the art may perform type division on the preset object and the target image block in a suitable manner according to actual conditions, and determine the type of each target image block based on the type of the preset object included in each target image block. For example, for an abnormal operation behavior monitoring scene, the type of the preset object may include one or more of a person, a chemical defense suit, a safety helmet, a safety belt, a transportation tool, an isolation belt, and the like, and accordingly, the target image blocks including the person, the chemical defense suit, the safety helmet, the safety belt, the transportation tool, and the isolation belt may be respectively determined to be of different types.
Step S504, based on the position information of each target image block in the image frame, at least one target image block belonging to the same type is spliced to obtain at least one image to be analyzed.
Here, based on the position information of each target image block in the image frame, at least one target image block belonging to the same type is spliced, a plurality of target image blocks adjacent to each other and of the same or related type of preset object may be spliced together, and thus each obtained image to be analyzed may include a more complete preset object of the same or related type. For example, under the condition that the types of the preset objects may include people, a transportation tool and an isolation zone, at least one target image block including the people may be spliced to obtain an image to be analyzed including more complete people, at least one target image block including the transportation tool may be spliced to obtain an image to be analyzed including more complete transportation tool, and at least one target image block including the isolation zone may be spliced to obtain an image to be analyzed including more complete isolation zone.
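A sketch of this per-type splicing, assuming each target block carries a `type` label assigned during object recognition and reusing a splice helper such as `splice_blocks` sketched earlier:

```python
from collections import defaultdict

def splice_by_type(target_blocks, frame_shape, splice_fn):
    """Group target blocks by the type of preset object they contain, then
    splice each group into its own image to be analyzed."""
    groups = defaultdict(list)
    for block in target_blocks:
        groups[block["type"]].append(block)   # e.g. "person", "vehicle", "belt"
    return {t: splice_fn(blocks, frame_shape) for t, blocks in groups.items()}
```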
Step S505, performing behavior recognition on the image to be analyzed to obtain a recognition result.
Here, step S505 corresponds to step S104; for details, reference may be made to the specific implementation of step S104.
In the embodiment of the application, the type of each target image block is determined based on the type of a preset object contained in each target image block, and at least one target image block belonging to the same type is spliced based on the position information of each target image block in an image frame to obtain at least one image to be analyzed. Therefore, different preset objects in the image frame can be respectively identified, so that more dimensionality behavior identification can be realized, and the accuracy rate and the discovery rate of the behavior identification are further improved.
Embodiments of the present application provide an image processing method, which may be executed by a processor of a computer device. As shown in fig. 6, the method includes:
step S601, carrying out segmentation processing on the collected image frame by adopting grids of various sizes to obtain an image block set.
Step S602, performing object recognition on at least one image block in the image block set to obtain a target image block set; each target image block in the target image block set contains at least a part of a preset object, and the preset object includes a preset transportation means feature.
Step S603, based on the position information of each target image block in the image frame, stitching at least one target image block to obtain an image to be analyzed that includes the preset object.
Here, steps S601 to S603 correspond to steps S101 to S103; for details, reference may be made to the specific implementations of steps S101 to S103.
Step S604, identifying a preset transport means feature in the image to be analyzed, and determining whether the preset transport means feature exists in the image to be analyzed.
Here, the preset transportation means may be any suitable transportation means determined according to an actual application scenario, and may include, but is not limited to, a tank car, a tank wagon, a mixer wagon, and the like. The predetermined vehicle characteristic may be any suitable characteristic that can be used to identify the predetermined vehicle, such as, but not limited to, structural characteristics, size, color, motion profile, etc. of the vehicle.
In implementation, a person skilled in the art may identify the preset vehicle feature in the image to be analyzed by using any suitable identification algorithm according to an actual situation, and determine whether the preset vehicle feature exists in the image to be analyzed, which is not limited herein.
Step S605, under the condition that the preset transport tool characteristics exist in the image to be analyzed, performing behavior recognition on the image to be analyzed to obtain a recognition result.
In some embodiments, the preset object further includes an isolation belt feature, the recognition result includes a first recognition result, and the number of images to be analyzed is at least one; the performing behavior recognition on the image to be analyzed in step S605 to obtain a recognition result includes the following steps S611 to S612:
Step S611, identifying the isolation belt feature in each image to be analyzed, and determining whether the isolation belt feature exists in each image to be analyzed.
Here, the isolation belt may be any suitable object that can enclose the preset transportation means within the area to be identified, such as red-and-white traffic cones or fences, which is not limited herein. The isolation belt feature may be any suitable feature that can be used to identify the isolation belt, and may be determined according to the objects actually used as the isolation belt, which is not limited herein.
In implementation, a person skilled in the art may identify the isolation belt feature in the image to be analyzed using any suitable recognition algorithm according to the actual situation, and determine whether the isolation belt feature exists in the image to be analyzed, which is not limited herein.
Step S612, generating the first recognition result when it is determined that the isolation belt feature does not exist in any of the images to be analyzed; the first recognition result indicates an abnormal operation behavior in the area to be identified of operating without an isolation belt.
In some embodiments, the recognition result further includes a second recognition result, and the performing behavior recognition on the image to be analyzed in the step S605 to obtain the recognition result further includes the following steps S621 to S622:
step S621, in a case that it is determined that the median feature exists in at least one of the images to be analyzed, identifying the median feature in the at least one of the images to be analyzed, and determining whether a target image block including the median feature in the area to be identified surrounds a target image block including the preset vehicle feature.
Here, each image to be analyzed is formed by splicing at least one target image block, and for the image to be analyzed containing the characteristics of the isolation zone, the target image block containing the characteristics of the isolation zone in the area to be analyzed can be determined by identifying the characteristics of the isolation zone in the image to be analyzed. For an image to be analyzed containing preset transport tool features, the target image block containing the preset transport tool features in the area to be analyzed can be determined by identifying the preset transport tool features in the image to be analyzed. According to the position information of each target image block containing the characteristics of the isolation belt in the image frame and the position information of each target image block containing the characteristics of the preset transport vehicle in the image frame, whether the target image block containing the characteristics of the isolation belt in the area to be identified surrounds the target image block containing the characteristics of the preset transport vehicle can be determined.
It should be noted that the image frames may be captured by at least one camera disposed around the area to be identified and may include an omnidirectional view of the area. In some embodiments, the area to be identified may be captured from a specific shooting angle by a single camera chosen so that, when the preset transportation means and the isolation belt are both present, the transportation means does not block the isolation belt in the captured image frame; reducing such occlusion makes it possible to accurately determine whether the target image blocks containing the isolation belt feature surround the target image block containing the preset transportation means feature. In other embodiments, the area to be identified may be captured from different shooting angles by multiple cameras to obtain image frames containing a panoramic view of the area, which likewise allows this surrounding relationship to be judged accurately.
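One crude geometric test for this surround judgment, under the (x, y, w, h) position convention of the earlier sketches, is to require isolation belt blocks on all four sides of the bounding box of the transportation means blocks:

```python
def vehicle_enclosed(belt_boxes, vehicle_boxes):
    """Crude surround test: the vehicle bounding box must have isolation
    belt blocks entirely to its left, right, top, and bottom."""
    vx0 = min(x for x, y, w, h in vehicle_boxes)
    vy0 = min(y for x, y, w, h in vehicle_boxes)
    vx1 = max(x + w for x, y, w, h in vehicle_boxes)
    vy1 = max(y + h for x, y, w, h in vehicle_boxes)
    left = any(x + w <= vx0 for x, y, w, h in belt_boxes)
    right = any(x >= vx1 for x, y, w, h in belt_boxes)
    above = any(y + h <= vy0 for x, y, w, h in belt_boxes)
    below = any(y >= vy1 for x, y, w, h in belt_boxes)
    return left and right and above and below
```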
Step S622, generating the second recognition result when it is determined that the target image blocks containing the isolation belt feature in the area to be identified do not surround the target image block containing the preset transportation means feature; the second recognition result indicates an abnormal operation behavior in which the isolation belt does not fully enclose the transportation means in the area to be identified.
In some embodiments, the preset object further includes a person feature, the recognition result further includes a third recognition result, and performing behavior recognition on the image to be analyzed in step S605 to obtain the recognition result further includes the following steps S631 to S632:
Step S631, in a case where it is determined that the isolation zone feature exists in at least one image to be analyzed, identifying the isolation zone feature and the person feature in the at least one image to be analyzed, and determining whether a person violation exists in the area to be identified.
Here, the person feature may be any suitable feature, determined according to the actual situation, that can be used to identify a person violation, including but not limited to the posture, clothing, and position of the human body, which is not limited herein.
A person violation may be any behavior that does not comply with the personnel rules or requirements of the area to be identified. In implementation, a person skilled in the art may determine the personnel rules or requirements of the area to be identified according to the actual situation, and thereby determine the applicable person violations, which are not limited herein. For example, where the area to be identified is a hazardous-chemical loading and unloading area, workers inside the isolation zone must wear specific chemical protective suits, so a person violation may include a worker inside the isolation zone not wearing a chemical protective suit. As another example, where the area to be identified is a construction site, everyone on the site may be required to wear a safety helmet, so a person violation may include a person on the site not wearing a safety helmet.
Step S632, generating a third recognition result when it is determined that a person violation exists in the area to be identified; the third recognition result represents that a person violation exists in the area to be identified.
In some embodiments, the preset object further includes a chemical protective suit feature, a safety helmet feature, and a safety belt feature, and the person violation includes at least one of the following: a person in the area enclosed by the isolation zone is not wearing a chemical protective suit; a person in the area to be identified is not wearing a safety helmet; a person on the preset vehicle is not wearing a safety belt; there is no operation supervisor in the area to be identified.
In implementation, for the violation of a person inside the area enclosed by the isolation zone not wearing a chemical protective suit, the isolation zone feature, the person feature, and the chemical protective suit feature in at least one image to be analyzed can be identified; when it is determined that a target image block containing the person feature, located within the area enclosed by the target image blocks corresponding to the isolation zone feature, does not contain the chemical protective suit feature, a recognition result is generated representing that a person in the enclosed area is not wearing a chemical protective suit. Here, the chemical protective suit may be determined according to the actual application scenario of the area to be identified, which is not limited herein. For example, where the area to be identified is a hazardous-chemical handling area, the chemical protective suit may be a level-C suit that meets the requirements of hazardous-chemical work.
For the violation of a person in the area to be identified not wearing a safety helmet, the safety helmet feature and the person feature in at least one image to be analyzed can be identified; when a target image block containing a person's head feature in the area to be identified does not contain the safety helmet feature, a recognition result is generated representing that a person in the area is not wearing a safety helmet. Here, the safety helmet feature may be any suitable feature that can be used to identify a safety helmet, which is not limited in the embodiments of the present application.
For the violation of a person on the preset vehicle not wearing a safety belt, the safety belt feature, the person feature, and the preset vehicle feature in at least one image to be analyzed can be identified; when a target image block containing the person feature and connected to a target image block containing the top of the preset vehicle does not contain the safety belt feature, a recognition result is generated representing that a person on the preset vehicle is not wearing a safety belt. Here, the safety belt feature may be any suitable feature that can be used to identify a safety belt, which is not limited in the embodiments of the present application.
For the violation of there being no operation supervisor in the area to be identified, the person feature and the supervisor uniform feature in at least one image to be analyzed can be identified; when none of the target image blocks containing the person feature in the area to be identified contains the supervisor uniform feature, a recognition result is generated representing that no operation supervisor is present in the area. Here, the supervisor is a person who oversees the operation behavior in the area to be identified and may wear a specific supervisor uniform. The supervisor uniform feature may be any suitable feature that can be used to identify the uniform, such as its color or style.
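As a hedged sketch of one of the checks above, the following Python fragment tests whether each head block overlaps some safety helmet block; the box format, the overlap threshold, and the function names are assumptions made for illustration only.

```python
Box = tuple  # (x0, y0, x1, y1), as in the earlier sketch

def overlap(a: Box, b: Box) -> int:
    """Intersection area of two axis-aligned boxes, 0 if they are disjoint."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

def heads_without_helmets(head_boxes, helmet_boxes, min_overlap=1):
    """Head blocks that intersect no helmet block; each such head
    corresponds to a person not wearing a safety helmet."""
    return [head for head in head_boxes
            if all(overlap(head, helmet) < min_overlap
                   for helmet in helmet_boxes)]
```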
In some embodiments, the preset object further includes a drive wheel feature and a wheel chock feature, the recognition result includes a fourth recognition result, and performing behavior recognition on the image to be analyzed in step S605 to obtain the recognition result includes the following steps S641 to S642:
Step S641, identifying the drive wheel feature and the wheel chock feature in each image to be analyzed, and determining whether the target image block containing the drive wheel feature is connected to a target image block containing the wheel chock feature.
Here, the drive wheel feature may be any suitable feature for identifying a drive wheel of a vehicle. Since the drive wheel is typically a front wheel of the vehicle, in some embodiments the drive wheel feature may be a feature of the front wheel, such as its position.
The wheel chock feature may be any suitable feature for identifying a wheel chock, such as its shape or material.
In implementation, any suitable recognition algorithm may be used to identify the drive wheel feature and the wheel chock feature in each image to be analyzed and to determine whether the target image block containing the drive wheel feature is connected to a target image block containing the wheel chock feature, which is not limited in the embodiments of the present application.
Step S642, generating a fourth recognition result when it is determined that the target image block containing the drive wheel feature is not connected to any target image block containing the wheel chock feature; the fourth recognition result represents an abnormal operation behavior in which a drive wheel in the area to be identified has no wheel chock.
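The following is a minimal sketch of the connectivity test in steps S641 to S642, assuming bounding boxes as before and a small pixel tolerance; the tolerance value is an illustrative assumption.

```python
def connected(wheel, chock, tol=5):
    """True if the two (x0, y0, x1, y1) boxes overlap or lie within
    `tol` pixels of each other in both axes."""
    return (wheel[0] - tol <= chock[2] and chock[0] - tol <= wheel[2] and
            wheel[1] - tol <= chock[3] and chock[1] - tol <= wheel[3])

def wheels_without_chocks(wheel_boxes, chock_boxes):
    """Drive-wheel blocks connected to no chock block; a non-empty result
    triggers the fourth recognition result."""
    return [wheel for wheel in wheel_boxes
            if not any(connected(wheel, chock) for chock in chock_boxes)]
```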
In the embodiments of the present application, the preset vehicle feature in the image to be analyzed is identified to determine whether it is present, and behavior recognition is performed on the image only when the preset vehicle feature exists. In this way, behavior recognition is restricted to the loading and unloading operations of the preset vehicle, which improves the utilization of computing resources. In some embodiments, the recognition result may include a first recognition result representing that no isolation zone is set up in the area to be identified, a second recognition result representing that the vehicle is not fully enclosed by an isolation zone, a third recognition result representing that a person violation exists, and/or a fourth recognition result representing that a drive wheel has no wheel chock, so that abnormal operation behaviors in the area to be identified can be recognized from multiple dimensions and the recognition accuracy can be improved.
Embodiments of the present application provide an image processing method, which may be executed by a processor of a computer device. As shown in fig. 7, the method includes:
step S701, carrying out segmentation processing on the collected image frames by adopting grids of various sizes to obtain an image block set.
Step S702, carrying out object identification on at least one image block in the image block set to obtain a target image block set; each target image block in the target image block set at least comprises a part of a preset object, and the preset object comprises preset transport means characteristics.
Step S703, based on the position information of each target image block in the image frame, splicing at least one target image block to obtain an image to be analyzed that includes the preset object.
Here, steps S701 to S703 correspond to steps S101 to S103 described above, and the specific embodiments of steps S101 to S103 may be referred to for implementation.
Step S704, identifying the preset object in the image to be analyzed to obtain the position information of the key point corresponding to the preset object in the image to be analyzed.
Here, the position information of the key points corresponding to the preset object refers to the position, within the image frame, of the preset object recognized in the image to be analyzed. In implementation, since the image to be analyzed is stitched from target image blocks containing the preset object, and each of these blocks was segmented from the image frame, the position of every block in the frame is already known from the segmentation process. Identifying the preset object in the image to be analyzed therefore determines which target image blocks contain it, and from their positions the position information of the preset object can be determined.
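One possible way to recover the key-point positions in step S704 is sketched below, assuming each target image block remembers its grid cell (row, col) and the cell size used during segmentation; these bookkeeping details are assumptions, since the present application does not prescribe them.

```python
def block_to_frame_box(row, col, cell_w, cell_h):
    """Frame-coordinate bounding box of the block at grid cell (row, col)."""
    return (col * cell_w, row * cell_h,
            (col + 1) * cell_w, (row + 1) * cell_h)

def object_keypoints(object_cells, cell_w, cell_h):
    """Use the centers of the blocks containing the preset object as its
    key points in image-frame coordinates."""
    points = []
    for row, col in object_cells:
        x0, y0, x1, y1 = block_to_frame_box(row, col, cell_w, cell_h)
        points.append(((x0 + x1) // 2, (y0 + y1) // 2))
    return points
```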
Step S705, performing behavior recognition on the image to be analyzed by using the trained behavior recognition model based on the position information of the key point to obtain a recognition result.
Here, a person skilled in the art may use any suitable behavior recognition model according to actual situations, and perform behavior recognition on the image to be analyzed based on the position information of the key point, which is not limited in the embodiment of the present application.
For the training of the behavior recognition model, any suitable model training method may be adopted, which is not limited in the embodiment of the present application.
In some embodiments, before the step S705, the method further comprises:
Step S711, acquiring an abnormal operation behavior sample set, where each sample image in the set carries a category label and a position label; the category label represents the category of the abnormal operation behavior corresponding to the sample image, and the position label represents the position information of the preset object in the sample image;
Here, the sample image may be any suitable image or image block capable of representing an abnormal operation behavior. The category of the abnormal operation behavior corresponding to the sample image refers to the abnormal operation behavior present in the scene depicted by the sample image; the same sample image may correspond to one or more categories.
The categories of abnormal operation behavior may include, but are not limited to, one or more of the following: no isolation zone is set up in the area to be identified; the vehicle is not fully enclosed by an isolation zone; a person in the area enclosed by the isolation zone is not wearing a chemical protective suit; a person in the area to be identified is not wearing a safety helmet; a person on the preset vehicle is not wearing a safety belt; there is no operation supervisor in the area to be identified; a drive wheel of the preset vehicle has no wheel chock.
In implementation, the category label and the location label of each sample image in the abnormal operation behavior sample set may be labeled manually in advance or automatically by the system, which is not limited herein.
Step S712, training the behavior recognition model by using the abnormal operation behavior sample set, so as to obtain the trained behavior recognition model.
Here, a person skilled in the art may train the behavior recognition model by using the abnormal operation behavior sample set in any suitable manner according to the actual situation to obtain a trained behavior recognition model, which is not limited herein.
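A minimal PyTorch-style training sketch of steps S711 to S712 follows. The model interface (returning class logits and predicted positions), the loader fields, and all hyper-parameters are assumptions; the present application does not prescribe a framework or a loss function.

```python
import torch
import torch.nn as nn

def train_behavior_model(model, loader, epochs=10, lr=1e-3):
    """Supervise the model with both the category label and the position
    label of each sample image in the abnormal-operation-behavior set."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    cls_loss = nn.CrossEntropyLoss()  # category of the abnormal behavior
    pos_loss = nn.SmoothL1Loss()      # position of the preset object
    model.train()
    for _ in range(epochs):
        for images, categories, positions in loader:
            pred_cls, pred_pos = model(images)  # assumed model interface
            loss = (cls_loss(pred_cls, categories)
                    + pos_loss(pred_pos, positions))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```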
In the embodiments of the present application, the preset object in the image to be analyzed is identified to obtain the position information of its corresponding key points, and the trained behavior recognition model performs behavior recognition on the image based on that position information to obtain the recognition result. Because the position of the preset object within the image frame is taken into account during behavior recognition, the accuracy of behavior recognition can be further improved.
Embodiments of the present application provide an image processing method, which may be executed by a processor of a computer device. As shown in fig. 8, the method includes:
step S801, carrying out segmentation processing on the collected image frames by adopting grids of various sizes to obtain an image block set.
Step S802, performing object identification on at least one image block in the image block set to obtain a target image block set; each target image block in the target image block set at least comprises a part of a preset object.
Step S803, based on the position information of each target image block in the image frame, stitching at least one target image block to obtain an image to be analyzed that includes the preset object.
Step S804, the image to be analyzed is subjected to behavior recognition, and a recognition result is obtained.
Here, steps S801 to S804 correspond to steps S101 to S104 described above, and the specific embodiments of steps S101 to S104 may be referred to.
Step S805, determining whether there is an abnormal job behavior in the area to be identified based on the identification result.
Here, the recognition result may represent, in any suitable manner, whether an abnormal operation behavior exists in the area to be identified, which is not limited herein. Based on the recognition result, it can therefore be determined whether an abnormal operation behavior exists in the area to be identified.
And step S806, generating and sending alarm information under the condition that the abnormal operation behavior exists in the area to be identified.
Here, the alarm information is used to report an abnormal operation behavior found in the area to be identified and may include, but is not limited to, one or more of voice alarms, alarm indicator lights, alarm phone calls, alarm e-mails, and instant-messaging messages, which is not limited herein.
In some embodiments, generating and sending the alarm information in step S806 includes: step S811, determining the type of the abnormal operation behavior based on the recognition result; and step S812, generating and sending alarm information based on that type. Here, different types of abnormal operation behavior may trigger different or identical alarm information, and the alarm information may be sent in different or identical manners, which is not limited herein.
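As an illustration of steps S811 to S812, the sketch below maps each abnormal-behavior type to an alarm channel and message; the type names, channels, and the print stand-in for a real notification API are assumptions made for illustration.

```python
ALARMS = {
    "no_isolation_zone": ("voice", "No isolation zone is set up in the area."),
    "zone_not_enclosed": ("voice", "The vehicle is not fully enclosed."),
    "person_violation":  ("email", "A person violation exists in the area."),
    "no_wheel_chock":    ("email", "A drive wheel has no wheel chock."),
}

def send_alarm(behavior_type: str) -> None:
    """Generate and send alarm information based on the behavior type."""
    channel, message = ALARMS.get(behavior_type,
                                  ("email", "Abnormal operation behavior."))
    print(f"[{channel}] {message}")  # stand-in for a real notification API
```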
In the embodiments of the present application, whether an abnormal operation behavior exists in the area to be identified is determined based on the recognition result, and alarm information is generated and sent when one exists. By sending the alarm information, abnormal operation behaviors in the area to be identified can be discovered and corrected in a timely manner.
The image processing method provided by the embodiments of the present application is further described below using, as an example, the analysis of supplier behavior and environmental safety during the filling of chemical tank trucks in a semiconductor manufacturing process.
In semiconductor manufacturing, different chemicals must routinely be stored, transported, and filled into tanks by specialized suppliers. For high-risk chemicals such as tetramethylammonium hydroxide (TMAH), alcohol, or liquefied gas, a serious production-safety accident may occur if an operator fails to follow the required procedure during tank truck filling. In the related art, the filling operation is supervised only manually, so oversights occur easily and it is difficult to supervise chemical filling effectively.
In view of the above, the present disclosure provides an image processing system capable of analyzing behaviors and environmental safety of a supplier during a filling operation of a chemical tank truck in a semiconductor manufacturing process based on deep learning.
Fig. 9A is a schematic view of a chemical tank truck filling area in a semiconductor manufacturing process according to an embodiment of the present application. As shown in fig. 9A, during the filling operation an operator 11 fills the tank truck 12 with chemicals. During the operation, a fully enclosed fence 13 must be set up around the tank truck 12 to control who enters the work area and to isolate non-operators from the chemical filling area. In addition, a monitoring person 14 must monitor the entire filling operation in the chemical filling area. Both the operator 11 and the monitoring person 14 wear safety helmets, and the operator 11 on the tank truck also wears a safety belt.
The image processing system provided by the embodiments of the present application can intelligently monitor the following six important violation situations in the semiconductor chemical filling process:
1) The chemical filling area has no fully enclosed fence. Referring to fig. 9B, a fully enclosed fence 13 may be set up outside the catch trench 15; the absence of the fully enclosed fence 13 is a violation.
2) A person not wearing a level-C chemical protective suit enters the area enclosed by the fence. Referring to fig. 9C, a person not wearing a level-C suit must not cross the fence 13 into the enclosed area; such a person entering the enclosed area is a violation.
3) A person inside the area enclosed by the fence is not wearing a level-C chemical protective suit. Referring to fig. 9D, a person 11 inside the enclosed area must wear a level-C suit; a person inside the enclosed area without one is a violation.
4) An operator in the chemical filling area is not wearing a safety helmet, or an operator on the tank truck is not wearing a safety belt. Referring to fig. 9E, the operator 11 in the chemical filling area must wear a safety helmet 16, and the operator 11 on the tank truck 12 must also wear a safety belt 17; either omission is a violation.
5) The drive wheel of the tank truck has no wheel chock. Here, the front wheels of the tank truck are generally regarded as the drive wheels; in practice, where the chocks are placed can also take the road gradient of the chemical filling area into account, so as to prevent the parked tank truck from rolling forward or backward. Referring to fig. 9F, during the filling operation a wheel chock 18 must be set against the drive wheel 121 of the tank truck; a drive wheel without a chock is a violation.
6) No one is monitoring the filling operation. Referring to fig. 9G, a monitoring person 14 must monitor the entire filling operation in the chemical filling area; a filling operation with no one monitoring it is a violation.
In the image processing system provided by the embodiments of the present application, once at least one of the above violation situations occurs in the chemical filling area, its type can be automatically recognized and alarm information can be sent.
The embodiments of the present application provide an image processing system that can analyze supplier behavior and environmental safety during the filling of chemical tank trucks in a semiconductor manufacturing process. Fig. 9H is a schematic diagram of the architecture of an image processing system according to an embodiment of the present application. As shown in fig. 9H, the system includes a high-definition network camera 21, a switch 22, a Network Video Recorder (NVR) 23, a system server 24, and a user terminal device 25. In implementation, the high-definition network camera 21 captures the chemical filling area to obtain a video stream. The video stream is transmitted through the switch 22 to the NVR 23 and the system server 24; the NVR 23 stores and backs up the video stream, while the system server 24 uses a deep model (such as a convolutional neural network) to identify the features of preset objects (such as persons, level-C chemical protective suits, safety helmets, safety belts, and fences) in each image frame of the video stream and to judge whether a violation situation has occurred in the chemical filling area. Once a violation is found, the server generates alarm information specific to the violation type and sends it to the user terminal device 25 (such as an office computer, personal computer, or mobile phone) by e-mail or instant-messaging software. The high-definition network camera 21, the switch 22, the NVR 23, and the system server 24 may communicate over a dedicated video transmission network 31; the system server and the user terminal device may communicate over an office network 32; and both networks may be wired or wireless, which is not limited herein.
In the image processing system provided by the embodiments of the present application, multiple high-definition network cameras 21 may be installed in the chemical filling area to monitor it from multiple angles around the clock (7 × 24 hours). Fig. 9I is a schematic view of a scene in which multiple high-definition network cameras are installed in a chemical filling area according to an embodiment of the present application; as shown in fig. 9I, multiple cameras 21 may be installed on both sides of the road 41 to capture the chemical filling area 42 from all directions.
The embodiments of the present application provide an image processing method that may be executed by the system server of the image processing system described above. Referring to fig. 9J, the method may divide an image frame 51 obtained from the video stream using a grid to obtain multiple segmented image blocks 52. The image blocks are recognized with a deep convolutional neural network to obtain the class probability of each block (i.e., the probability that the block contains each of the different preset objects); target image blocks containing preset objects are screened out according to these class probabilities, and the target image blocks containing preset objects of the same category are stitched in place according to their positions in the image frame to obtain an image 53 to be analyzed containing preset objects such as a person, a safety helmet, or a safety belt. Multiple images to be analyzed may result from the stitching (for example, different workers, the fence, and the wheel chocks may each correspond to one sub-image), and images containing different categories of preset objects may be judged together to determine whether a violation exists in the image frame. For example, target image blocks containing a person, a safety helmet, and a level-C chemical protective suit can be stitched into an image showing the whole person; a behavior recognition model then judges that person's behavior and determines whether the operator is wearing a safety helmet or a safety belt. Likewise, the image to be analyzed containing the fence can be recognized to determine whether persons inside the fenced area are wearing level-C chemical protective suits.
Fig. 9K is a schematic flowchart of an implementation flow of an image processing method according to an embodiment of the present application, where the method may be implemented by a system server. As shown in fig. 9K, the method includes the following steps S901 to S905:
Step S901, obtaining an image frame of the video stream and dividing it into multiple image blocks using a grid.
Here, the image frame may be divided into multiple image blocks of different sizes using grids of multiple sizes. Based on image blocks of different sizes, more accurate and fine-grained target detection can be achieved while maintaining performance, thereby meeting the recognition requirements of preset objects of various sizes.
In some embodiments, acquiring image frames of the video stream requires at least one of the following conditions to hold: 1) a tank truck is found in the picture of the video stream. Since a certain interval elapses between the tank truck's arrival and the start of the actual filling operation, a time threshold may be preset; image frames are acquired and violation recognition starts only when the tank truck appears in the picture and has been present longer than the threshold. For example, once the tank truck has been present beyond the threshold, if person features are found in the image frame but no fence features, a violation is determined and alarm information is sent. 2) A signboard is set up when the operator starts the filling operation; when the signboard appears in the picture of the video stream, image frames are acquired and violation recognition starts.
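Condition 1) can be realized with a simple persistence counter, as in the sketch below; the frame rate and the time threshold are assumed values.

```python
def should_start_recognition(frames_with_tank_truck: int,
                             fps: float = 25.0,
                             threshold_s: float = 60.0) -> bool:
    """True once the tank truck has been present in the video stream for
    longer than the preset time threshold."""
    return frames_with_tank_truck / fps > threshold_s
```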
Step S902, performing object identification on each image block by using a convolutional neural network to obtain a class probability corresponding to each image block.
Here, the class probability corresponding to the image block includes a probability that the image block includes different preset objects.
Step S903, screening out the target image blocks containing preset objects according to the class probability of each image block, and stitching them in place according to their positions in the image frame to obtain the image to be analyzed.
Here, multiple target image blocks containing preset objects of the same category may be stitched in place according to the category of the preset object they contain and their positions in the original image frame, so as to obtain the image to be analyzed.
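A NumPy sketch of the screening and in-place stitching in step S903 follows, assuming probs[r, c, k] holds the class-k probability of the block at grid cell (r, c), the frame dimensions are multiples of the cell size, and the probability threshold is 0.5; all of these are illustrative assumptions.

```python
import numpy as np

def stitch_by_class(frame, probs, cls, cell, thresh=0.5):
    """Paste every block whose class-`cls` probability passes `thresh`
    back at its original position; all other pixels stay black."""
    canvas = np.zeros_like(frame)
    rows, cols = probs.shape[:2]
    for r in range(rows):
        for c in range(cols):
            if probs[r, c, cls] >= thresh:
                y, x = r * cell, c * cell
                canvas[y:y + cell, x:x + cell] = frame[y:y + cell, x:x + cell]
    return canvas  # the image to be analyzed for class `cls`
```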
Step S904, establishing a behavior recognition model based on a convolutional neural network, and training it with the violation training set to obtain a trained behavior recognition model.
Here, the violation training set includes a certain number of sample images, each of which may be any suitable image or image block capable of representing an abnormal operation behavior. In implementation, the position information of the preset objects in each sample image and the violation situations present may be labeled in advance.
Step S905, obtaining the position information of the key points corresponding to the preset objects in the image to be analyzed, and inputting it into the trained behavior recognition model to obtain a behavior recognition result.
In some embodiments, identifying the image block by using the convolutional neural network to obtain the class probability corresponding to the image block may be implemented by the following steps S921 to S923:
and step S921, performing convolution processing on the image blocks by using the depth separable convolution to perform feature extraction, so as to form an intermediate feature data set.
Step S922, performing data sampling and convolution on the intermediate feature data set multiple times to obtain a fully connected feature data set.
Step S923, performing feature computation on the fully connected feature data set and classifying its features using a ReLU activation function to obtain the class probability of the image block. In implementation, the features in the fully connected feature data set may be classified by a pre-trained classification model to obtain the class probability of the image block.
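The following PyTorch sketch mirrors steps S921 to S923: a depthwise separable convolution for feature extraction, pooling as the data sampling, and a fully connected head. The layer sizes are illustrative assumptions, and while the text above attributes the classification to a ReLU activation, this sketch uses ReLU between layers and a softmax head to turn the fully connected features into class probabilities, which is an implementation choice rather than the prescribed design.

```python
import torch.nn as nn

class BlockClassifier(nn.Module):
    """Per-image-block classifier producing class probabilities."""

    def __init__(self, num_classes: int, in_ch: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch),  # depthwise
            nn.Conv2d(in_ch, 32, 1),                              # pointwise
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                                      # data sampling
            nn.Conv2d(32, 64, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64, num_classes),
            nn.Softmax(dim=1),  # class probabilities for the image block
        )

    def forward(self, x):
        return self.head(self.features(x))
```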
In some embodiments, the behavior recognition model is trained by using the violation training set, so as to obtain a trained behavior recognition model, which can be implemented by the following steps S941 to S942:
Step S941, obtaining the violation training set and classifying the sample images in it by violation category. In implementation, the violation category of each sample image may be labeled manually; the part of the sample image containing a preset object is marked with a rectangular box, and the coordinates of the box's four corners are extracted to fix the position of the preset object in the sample image.
Step S942, inputting each sample image, its violation category, and the four corner coordinates of the rectangular box enclosing the preset object into the behavior recognition model for training, so as to obtain the trained behavior recognition model.
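One possible annotation record for a sample image in the violation training set is sketched below; the field names and values are hypothetical and serve only to show how the category label and the four corner coordinates might be stored.

```python
sample_annotation = {
    "image": "samples/filling_0001.jpg",        # hypothetical path
    "violation": "no_fully_enclosed_fence",     # violation category label
    "object": "fence",                          # labeled preset object
    "corners": [(120, 80), (560, 80),           # four corner coordinates of
                (560, 400), (120, 400)],        # the labeling rectangle
}
```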
In some embodiments, the step S905 may include the following steps S951 to S952:
Step S951, performing object recognition on the image to be analyzed with a target detection algorithm to obtain the position information of the key points corresponding to the preset objects in the image.
Step S952, using the trained behavior recognition model to convert the key-point position information of each preset object in the image to be analyzed into feature vectors, and classifying the feature vectors to obtain the behavior recognition result.
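Steps S951 to S952 can be sketched as below: the key-point coordinates are flattened into a fixed-length feature vector and passed to a classifier. The padding length, the zero-padding scheme, and the model interface are assumptions for illustration.

```python
import torch

def keypoints_to_vector(points, max_points=17):
    """Flatten (x, y) key points into a fixed-length float tensor,
    zero-padding when fewer than `max_points` points were found."""
    flat = [float(v) for p in points[:max_points] for v in p]
    flat += [0.0] * (2 * max_points - len(flat))
    return torch.tensor(flat)

def recognize_behavior(model, points):
    """Classify the feature vector; the result indexes a behavior class."""
    logits = model(keypoints_to_vector(points).unsqueeze(0))
    return int(logits.argmax(dim=1))
```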
In some embodiments, the image frame may be segmented using a first grid, a second grid, and a third grid of successively smaller sizes. First, the image frame is segmented with the first grid and each segmented image block is recognized to obtain at least one first target image block containing a preset object; each first target image block is then segmented with the second grid and each resulting block is recognized to obtain at least one second target image block containing the preset object; finally, each second target image block is segmented with the third grid and each resulting block is recognized to obtain at least one third target image block containing the preset object, and the third target image blocks are stitched in place according to their positions in the image frame to obtain the image to be analyzed. This avoids the low recognition speed caused by segmenting the whole image directly with a small grid, as well as the low recognition accuracy caused by using only a large grid.
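The coarse-to-fine segmentation described above can be written as a short recursion, sketched below; `detect` stands in for the convolutional-network object recognition, and the region and grid representations are illustrative assumptions.

```python
def split(region, cell):
    """Tile an (x0, y0, x1, y1) region into cell-by-cell sub-regions."""
    x0, y0, x1, y1 = region
    return [(x, y, min(x + cell, x1), min(y + cell, y1))
            for y in range(y0, y1, cell) for x in range(x0, x1, cell)]

def coarse_to_fine(region, grid_sizes, detect):
    """Keep only blocks where `detect` finds the preset object, then refine
    those blocks with the next, smaller grid; the returned smallest-grid
    blocks are stitched in place to form the image to be analyzed."""
    if not grid_sizes:
        return [region]
    hits = [block for block in split(region, grid_sizes[0]) if detect(block)]
    return [leaf for hit in hits
            for leaf in coarse_to_fine(hit, grid_sizes[1:], detect)]
```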
In some embodiments, when recognizing objects in the image to be analyzed, different preset objects may be recognized in a specific order. For example, whether a tank truck exists in the image is recognized first; if it does, whether a fence exists is recognized next; if the fence exists, whether an operator exists is recognized; and person violations and other violation situations are then determined by combining the positional relationships among the operator, the tank truck, and the fence.
It should be noted that, in implementation, the violation situations in this embodiment correspond to the abnormal operation behaviors in the foregoing embodiments, and the violation training set corresponds to the abnormal operation behavior sample set in the foregoing embodiments.
In the embodiments of the present application, analyzing supplier behavior and environmental safety during the filling of chemical tank trucks in a semiconductor manufacturing process makes it possible to supervise the chemicals effectively, improving the safety and intelligence level of the semiconductor fab. Moreover, since several important violation situations can be recognized, the tank truck filling operation can be monitored from multiple dimensions, further improving the safety and intelligence level of the fab.
Fig. 10 is a schematic diagram of a composition structure of an image processing apparatus according to an embodiment of the present application, and as shown in fig. 10, an image processing apparatus 1000 includes: a segmentation module 1010, a first identification module 1020, a concatenation module 1030, and a second identification module 1040, wherein:
a segmentation module 1010, configured to perform segmentation processing on an acquired image frame by using grids of multiple sizes to obtain an image block set;
a first identifying module 1020, configured to perform object identification on at least one image block in the image block set to obtain a target image block set; each target image block in the target image block set at least comprises a part of a preset object;
a splicing module 1030, configured to splice at least one target image block based on position information of each target image block in the image frame, so as to obtain an image to be analyzed, where the image includes the preset object;
and the second identification module 1040 is configured to perform behavior identification on the image to be analyzed to obtain an identification result.
In some embodiments, the apparatus further comprises: the first acquisition module is used for acquiring an acquired image frame of the area to be identified; the first determination module is used for determining whether a preset job exists in the image frame; the segmentation module is further configured to: and under the condition that the preset operation exists in the image frame, adopting grids of various sizes to segment the acquired image frame.
In some embodiments, the first determining module is further configured to perform at least one of the following: identifying the operation equipment features in the image frame, and determining that a preset operation exists in the image frame when preset operation equipment features are present; identifying the marking features in the image frame, and determining that a preset operation exists in the image frame when preset marking features are present.
In some embodiments, the first determining module is further configured to: acquiring a video stream, and intercepting an image frame sequence containing a plurality of image frames from the video stream; identifying a job device feature in the sequence of image frames; determining the existence duration of a preset operation equipment characteristic in the video stream under the condition that an image frame containing the preset operation equipment characteristic exists in the image frame sequence; and under the condition that the existence duration exceeds a time threshold, determining that a preset job exists in the image frame sequence.
In some embodiments, the meshes of the plurality of sizes include meshes of N different sizes, where N is an integer greater than 1, and the segmentation module is further configured to: sequentially adopting an ith grid to carry out segmentation processing on at least one ith-1 target image block according to the sequence of grid sizes from large to small to obtain a plurality of ith image blocks, and carrying out object identification on each ith image block to obtain at least one ith target image block at least comprising the part of the preset object; wherein i is a positive integer smaller than N, the 0 th target image block is an acquired image frame, and each i-th target image block at least comprises a part of the preset object; under the condition that i is determined to be equal to N-1, performing segmentation processing on at least one (N-1) th target image block by adopting an Nth grid to obtain a plurality of Nth image blocks; determining the set of image blocks based on a plurality of the Nth image blocks.
In some embodiments, the N is 3, and the N different sizes of meshes include a first mesh, a second mesh, and a third mesh, wherein the size of the second mesh is smaller than the size of the first mesh, and the size of the third mesh is smaller than the size of the second mesh.
In some embodiments, the first identification module is further configured to: for each image block in the image block set, identifying a preset object in the image block by using a convolutional neural network to obtain the probability that the preset object is contained in the image block; determining at least one target image block containing the preset object based on the probability that each image block contains the preset object; based on the at least one target image block, a target set of image blocks is determined.
In some embodiments, the number of the preset objects is multiple, and the first identification module is further configured to: performing convolution processing on the image block to obtain an intermediate characteristic data set; carrying out data sampling and convolution processing on the intermediate characteristic data set to obtain a fully-connected characteristic data set; and classifying the features in the fully connected feature data set by using an activation function to obtain the probability that the image block contains different preset objects.
In some embodiments, the stitching module is further configured to: determining the type of each target image block based on the type of a preset object contained in each target image block; and splicing at least one target image block belonging to the same type based on the position information of each target image block in the image frame to obtain at least one image to be analyzed.
In some embodiments, the preset object includes a preset vehicle feature, and the second identification module is further configured to: identify the preset vehicle feature in the image to be analyzed and determine whether it exists; and, when the preset vehicle feature exists in the image to be analyzed, perform behavior recognition on the image to obtain the recognition result.
In some embodiments, the preset object further includes an isolation zone feature, the recognition result includes a first recognition result, the number of images to be analyzed is at least one, and the second identification module is further configured to: identify the isolation zone feature in each image to be analyzed and determine whether it exists; and, when the isolation zone feature exists in none of the images to be analyzed, generate a first recognition result, the first recognition result representing an abnormal operation behavior in which no isolation zone is set up in the area to be identified.
In some embodiments, the recognition result further includes a second recognition result, and the second identification module is further configured to: when the isolation zone feature exists in at least one image to be analyzed, identify the isolation zone feature in the at least one image and determine whether the target image blocks containing the isolation zone feature in the area to be identified surround the target image block containing the preset vehicle feature; and, when they do not, generate a second recognition result, the second recognition result representing an abnormal operation behavior in which the vehicle in the area to be identified is not fully enclosed by an isolation zone.
In some embodiments, the preset object further includes a person feature, the recognition result further includes a third recognition result, and the second identification module is further configured to: when the isolation zone feature exists in at least one image to be analyzed, identify the isolation zone feature and the person feature in the at least one image and determine whether a person violation exists in the area to be identified; and, when one exists, generate a third recognition result, the third recognition result representing that a person violation exists in the area to be identified.
In some embodiments, the preset object further includes a chemical protective suit feature, a safety helmet feature, and a safety belt feature, and the person violation includes at least one of the following: a person in the area enclosed by the isolation zone is not wearing a chemical protective suit; a person in the area to be identified is not wearing a safety helmet; a person on the preset vehicle is not wearing a safety belt; there is no operation supervisor in the area to be identified.
In some embodiments, the preset object further includes a drive wheel feature and a wheel chock feature, the recognition result includes a fourth recognition result, and the second identification module is further configured to: identify the drive wheel feature and the wheel chock feature in each image to be analyzed and determine whether the target image block containing the drive wheel feature is connected to a target image block containing the wheel chock feature; and, when it is not, generate a fourth recognition result, the fourth recognition result representing an abnormal operation behavior in which a drive wheel in the area to be identified has no wheel chock.
In some embodiments, the second identification module is further configured to: identifying the preset object in the image to be analyzed to obtain the position information of the key point corresponding to the preset object in the image to be analyzed; and performing behavior recognition on the image to be analyzed by using the trained behavior recognition model based on the position information of the key point to obtain a recognition result.
In some embodiments, the apparatus further comprises: the second acquisition module is used for acquiring an abnormal operation behavior sample set, each sample image in the abnormal operation behavior sample set is provided with a category label and a position label, the category label is used for representing the category of the abnormal operation behavior corresponding to the sample image, and the position label is used for representing the position information of the preset object in the sample image; and the training module is used for training the behavior recognition model by utilizing the abnormal operation behavior sample set to obtain the trained behavior recognition model.
In some embodiments, the apparatus further comprises: the second determining module is used for determining whether abnormal operation behaviors exist in the area to be identified or not based on the identification result; and the sending module is used for generating and sending alarm information under the condition that the abnormal operation behaviors exist in the area to be identified.
In some embodiments, the sending module is further configured to: determining the type of the abnormal operation behavior based on the identification result; and generating and sending alarm information based on the type of the abnormal operation behavior.
The above description of the apparatus embodiments, similar to the above description of the method embodiments, has similar beneficial effects as the method embodiments. For technical details not disclosed in the embodiments of the apparatus of the present application, reference is made to the description of the embodiments of the method of the present application for understanding.
In the embodiment of the present application, if the image processing method is implemented in the form of a software functional module and sold or used as a standalone product, the image processing method may also be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially implemented or a part contributing to the related art may be embodied in the form of a software product stored in a storage medium, and including several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read Only Memory (ROM), a magnetic disk, or an optical disk. Thus, embodiments of the present application are not limited to any specific combination of hardware and software.
Correspondingly, an embodiment of the present application provides a computer device, which includes a memory and a processor, where the memory stores a computer program that is executable on the processor, and the processor implements the steps in the method when executing the program.
Correspondingly, the embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, and the computer program realizes the steps of the above method when being executed by a processor.
Correspondingly, the embodiment of the present application provides a computer program product, which includes a non-transitory computer readable storage medium storing a computer program, and when the computer program is read and executed by a computer, the computer program implements part or all of the steps of the above method. The computer program product may be embodied in hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium, and in another alternative embodiment, the computer program product is embodied in a Software product, such as a Software Development Kit (SDK), or the like.
Here, it should be noted that: the above description of the storage medium, the computer program product and the device embodiments is similar to the description of the method embodiments described above, with similar advantageous effects as the method embodiments. For technical details not disclosed in the embodiments of the storage medium, the computer program product and the device of the present application, reference is made to the description of the embodiments of the method of the present application for understanding.
It should be noted that fig. 11 is a schematic diagram of a hardware entity of a computer device in an embodiment of the present application, and as shown in fig. 11, the hardware entity of the computer device 1100 includes: a processor 1101, a communication interface 1102, and a memory 1103, wherein:
the processor 1101 generally controls the overall operation of the computer device 1100.
The communication interface 1102 may enable the computer device to communicate with other terminals or servers via a network.
The Memory 1103 is configured to store instructions and applications executable by the processor 1101, and may also buffer data (e.g., image data, audio data, voice communication data, and video communication data) to be processed or already processed by the processor 1101 and modules in the computer device 1100, and may be implemented by a FLASH Memory (FLASH) or a Random Access Memory (RAM).
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application. The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; can be located in one place or distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware, or in the form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps of implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer-readable storage medium, and when executed, executes the steps including the method embodiments; and the aforementioned storage medium includes: various media that can store program codes, such as a removable Memory device, a Read Only Memory (ROM), a magnetic disk, or an optical disk.
Alternatively, the integrated unit described above may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a separate product. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a removable storage device, a ROM, a magnetic or optical disk, or other various media that can store program code.
The above description is only for the embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (22)

1. An image processing method, characterized in that the method comprises:
performing segmentation processing on an acquired image frame by using grids of a plurality of sizes to obtain an image block set;
carrying out object recognition on at least one image block in the image block set to obtain a target image block set; each target image block in the target image block set at least comprises a part of a preset object;
splicing at least one target image block based on the position information of each target image block in the image frame to obtain an image to be analyzed containing the preset object;
and performing behavior recognition on the image to be analyzed to obtain a recognition result.
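By way of illustration only, the method of claim 1 can be read as a four-stage pipeline: segment, recognize, splice, classify. The minimal Python sketch below shows that control flow under toy assumptions; the brightness-based detector and mean-based behavior classifier are hypothetical placeholders, not the models disclosed in this application.

```python
from typing import List, Tuple
import numpy as np

# A block is its pixel data plus its (x, y, w, h) position in the source frame.
Block = Tuple[np.ndarray, Tuple[int, int, int, int]]

def segment_with_grid(frame: np.ndarray, cell: int) -> List[Block]:
    """Cut the frame into cell-by-cell tiles, remembering each tile's position."""
    h, w = frame.shape[:2]
    return [(frame[y:y + cell, x:x + cell], (x, y, cell, cell))
            for y in range(0, h, cell) for x in range(0, w, cell)]

def detect_object(tile: np.ndarray) -> bool:
    """Hypothetical detector: 'object present' means any sufficiently bright pixel."""
    return bool(tile.max() > 200)

def stitch_blocks(targets: List[Block], shape: Tuple[int, int]) -> np.ndarray:
    """Paste the kept tiles back onto a blank canvas at their original positions."""
    canvas = np.zeros(shape, dtype=np.uint8)
    for tile, (x, y, _, _) in targets:
        canvas[y:y + tile.shape[0], x:x + tile.shape[1]] = tile
    return canvas

def recognize_behavior(image: np.ndarray) -> str:
    """Hypothetical behavior classifier over the stitched image."""
    return "abnormal" if image.mean() > 16 else "normal"

def process_frame(frame: np.ndarray, grid_sizes: List[int]) -> str:
    blocks: List[Block] = []
    for cell in grid_sizes:                                # grids of several sizes
        blocks.extend(segment_with_grid(frame, cell))
    targets = [b for b in blocks if detect_object(b[0])]   # object recognition
    stitched = stitch_blocks(targets, frame.shape)         # position-based splicing
    return recognize_behavior(stitched)                    # behavior recognition

frame = np.random.randint(0, 255, (256, 256), dtype=np.uint8)
print(process_frame(frame, grid_sizes=[128, 64, 32]))
```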
2. The method of claim 1, further comprising:
acquiring an image frame of an area to be identified;
determining whether a preset operation exists in the image frame;
wherein the performing segmentation processing on the acquired image frame by using the grids of the plurality of sizes comprises:
performing segmentation processing on the acquired image frame by using the grids of the plurality of sizes under the condition that the preset operation exists in the image frame.
3. The method of claim 2, wherein the determining whether a preset operation exists in the image frame comprises at least one of:
identifying a work device feature in the image frame, and determining that the preset operation exists in the image frame under the condition that a preset work device feature exists in the image frame;
identifying a marking feature in the image frame, and determining that the preset operation exists in the image frame under the condition that a preset marking feature exists in the image frame.
4. The method of claim 3, wherein the identifying a work device feature in the image frame, and determining that the preset operation exists in the image frame under the condition that a preset work device feature exists in the image frame, comprises:
acquiring a video stream, and intercepting an image frame sequence containing a plurality of image frames from the video stream;
identifying a work device feature in the image frame sequence;
determining an existence duration of a preset work device feature in the video stream under the condition that an image frame containing the preset work device feature exists in the image frame sequence;
and determining that a preset operation exists in the image frame sequence under the condition that the existence duration exceeds a time threshold.
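Claim 4 gates the decision on persistence: a preset operation is recognized only if the work device feature remains in the video stream longer than a time threshold. A minimal sketch, assuming a per-frame boolean detector and a fixed frame rate; all names here are illustrative.

```python
from typing import Callable, Sequence
import numpy as np

def operation_present(frames: Sequence[np.ndarray],
                      has_device_feature: Callable[[np.ndarray], bool],
                      fps: float,
                      time_threshold_s: float) -> bool:
    """Claim-4 style check: a preset operation exists only if the work device
    feature persists in the stream for longer than a time threshold."""
    hits = sum(1 for f in frames if has_device_feature(f))
    duration_s = hits / fps  # existence duration of the feature in the stream
    return duration_s > time_threshold_s

# Toy usage: a 'feature' is any frame whose mean brightness exceeds 100.
frames = [np.full((8, 8), v, dtype=np.uint8) for v in (120,) * 90 + (10,) * 30]
print(operation_present(frames, lambda f: f.mean() > 100,
                        fps=30.0, time_threshold_s=2.0))  # 3 s of hits > 2 s
```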
5. The method according to any one of claims 1 to 4, wherein the grids of the plurality of sizes comprise grids of N different sizes, where N is an integer greater than 1, and the performing segmentation processing on the acquired image frame by using the grids of the plurality of sizes to obtain the image block set comprises:
sequentially adopting an ith grid, in descending order of grid size, to perform segmentation processing on at least one (i-1)th target image block to obtain a plurality of ith image blocks, and carrying out object recognition on each ith image block to obtain at least one ith target image block at least comprising a part of a preset object; wherein i is a positive integer smaller than N, the 0th target image block is the acquired image frame, and each ith target image block at least comprises a part of the preset object;
performing segmentation processing on at least one (N-1)th target image block by adopting an Nth grid under the condition that i is determined to be equal to N-1, so as to obtain a plurality of Nth image blocks;
determining the image block set based on a plurality of the Nth image blocks.
6. The method of claim 5, wherein N is 3, and the grids of N different sizes comprise a first grid, a second grid, and a third grid, wherein the second grid is smaller in size than the first grid, and the third grid is smaller in size than the second grid.
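Claims 5 and 6 describe a coarse-to-fine recursion: only blocks already found to contain part of a preset object are re-segmented with the next, smaller grid, and the final (Nth) grid is applied without a further recognition pass. A sketch with N = 3, again using a hypothetical brightness-based recognizer:

```python
from typing import List, Tuple
import numpy as np

Block = Tuple[np.ndarray, Tuple[int, int]]  # (pixels, (x, y) offset in the frame)

def contains_object(tile: np.ndarray) -> bool:
    """Stand-in for the per-block object recognizer of claim 5."""
    return bool(tile.max() > 200)

def split(tile: np.ndarray, off: Tuple[int, int], cell: int) -> List[Block]:
    """Cut one block into cell-by-cell sub-blocks, tracking absolute offsets."""
    ox, oy = off
    h, w = tile.shape[:2]
    return [(tile[y:y + cell, x:x + cell], (ox + x, oy + y))
            for y in range(0, h, cell) for x in range(0, w, cell)]

def coarse_to_fine(frame: np.ndarray, grids: List[int]) -> List[Block]:
    """Apply grids largest to smallest; only target blocks survive to the next
    level. The 0th 'target block' is the whole acquired frame (claim 5)."""
    targets: List[Block] = [(frame, (0, 0))]
    for i, cell in enumerate(sorted(grids, reverse=True)):
        blocks = [b for t, off in targets for b in split(t, off, cell)]
        if i == len(grids) - 1:       # i == N-1: Nth grid, no recognition pass
            return blocks
        targets = [b for b in blocks if contains_object(b[0])]
    return targets

frame = np.random.randint(0, 255, (256, 256), dtype=np.uint8)
print(len(coarse_to_fine(frame, grids=[128, 64, 32])))
```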
7. The method according to any one of claims 1 to 4, wherein the performing object recognition on at least one image block in the image block set to obtain a target image block set comprises:
for each image block in the image block set, identifying a preset object in the image block by using a convolutional neural network to obtain a probability that the image block contains the preset object;
determining at least one target image block containing the preset object based on the probability that each image block contains the preset object;
based on the at least one target image block, a target set of image blocks is determined.
8. The method according to claim 7, wherein there are a plurality of preset objects, and the identifying the preset object in the image block by using the convolutional neural network to obtain the probability that the image block contains the preset object comprises:
performing convolution processing on the image block to obtain an intermediate feature data set;
carrying out data sampling and convolution processing on the intermediate feature data set to obtain a fully connected feature data set;
and classifying the features in the fully connected feature data set by using an activation function to obtain the probability that the image block contains different preset objects.
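The claim-8 sequence (convolution, data sampling plus further convolution, fully connected features, activation-function classification) maps naturally onto a small convolutional network. The PyTorch sketch below is one illustrative instantiation; the 64x64 RGB input size, channel counts, and four-class softmax head are assumptions, not the disclosed architecture.

```python
import torch
import torch.nn as nn

class TileClassifier(nn.Module):
    """Illustrative claim-8 shape: conv -> (pool + conv) -> flatten -> softmax."""
    def __init__(self, num_preset_objects: int = 4):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)   # convolution -> intermediate features
        self.pool = nn.MaxPool2d(2)                               # data sampling
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)  # further convolution
        self.fc = nn.Linear(32 * 16 * 16, num_preset_objects)     # fully connected features

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.relu(self.conv1(x))
        x = self.pool(x)              # 64 -> 32
        x = torch.relu(self.conv2(x))
        x = self.pool(x)              # 32 -> 16
        x = x.flatten(1)
        # The softmax activation yields per-class probabilities that the
        # image block contains each of the preset objects.
        return torch.softmax(self.fc(x), dim=1)

tile = torch.rand(1, 3, 64, 64)
probs = TileClassifier()(tile)
print(probs.shape, float(probs.sum()))  # torch.Size([1, 4]) 1.0
```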
9. The method according to any one of claims 1 to 4, wherein the splicing at least one target image block based on the position information of each target image block in the image frame to obtain an image to be analyzed containing the preset object comprises:
determining the type of each target image block based on the type of a preset object contained in each target image block;
and splicing at least one target image block belonging to the same type based on the position information of each target image block in the image frame to obtain at least one image to be analyzed.
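Claim 9 groups target image blocks by the type of preset object they contain and splices each group separately, yielding one image to be analyzed per type. A minimal sketch, with the block records and type labels assumed for illustration:

```python
from collections import defaultdict
from typing import Dict, List, Tuple
import numpy as np

# Each target block: (pixels, (x, y) position in the frame, object type label).
TargetBlock = Tuple[np.ndarray, Tuple[int, int], str]

def splice_by_type(blocks: List[TargetBlock], frame_shape) -> Dict[str, np.ndarray]:
    """One spliced image to be analyzed per preset-object type (claim 9)."""
    groups: Dict[str, List[TargetBlock]] = defaultdict(list)
    for blk in blocks:
        groups[blk[2]].append(blk)            # group blocks by object type
    images = {}
    for kind, members in groups.items():
        canvas = np.zeros(frame_shape, dtype=np.uint8)
        for tile, (x, y), _ in members:       # paste at original frame positions
            canvas[y:y + tile.shape[0], x:x + tile.shape[1]] = tile
        images[kind] = canvas
    return images

tile = np.full((32, 32), 255, dtype=np.uint8)
blocks = [(tile, (0, 0), "isolation_zone"), (tile, (32, 32), "isolation_zone"),
          (tile, (64, 0), "transport")]
for kind, img in splice_by_type(blocks, (128, 128)).items():
    print(kind, int(img.sum()))
```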
10. The method according to any one of claims 1 to 4, wherein the preset object comprises a preset transport means feature, and the performing behavior recognition on the image to be analyzed to obtain a recognition result comprises:
identifying the preset transport means feature in the image to be analyzed, and determining whether the preset transport means feature exists in the image to be analyzed;
and performing behavior recognition on the image to be analyzed to obtain the recognition result under the condition that the preset transport means feature exists in the image to be analyzed.
11. The method according to claim 10, wherein the preset object further comprises an isolation zone feature, the recognition result comprises a first recognition result, there is at least one image to be analyzed, and the performing behavior recognition on the image to be analyzed to obtain a recognition result comprises:
identifying an isolation zone feature in each image to be analyzed, and determining whether the isolation zone feature exists in each image to be analyzed;
and generating a first recognition result under the condition that the isolation zone feature exists in none of the images to be analyzed; the first recognition result represents that an abnormal operation behavior without an isolation zone exists in the area to be identified.
12. The method of claim 11, wherein the recognition result further comprises a second recognition result, and performing behavior recognition on the image to be analyzed to obtain a recognition result further comprises:
under the condition that the isolation zone feature exists in at least one image to be analyzed, identifying the isolation zone feature in the at least one image to be analyzed, and determining whether the target image blocks containing the isolation zone feature in the area to be identified surround the target image block containing the preset transport means feature;
and generating a second recognition result under the condition that the target image blocks corresponding to the isolation zone feature in the area to be identified do not surround the target image block corresponding to the preset transport means feature; the second recognition result represents an abnormal operation behavior in which the isolation zone does not fully enclose the transport means in the area to be identified.
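The "surrounds" test of claim 12 is geometric. One simple proxy, assumed here purely for illustration, is to require every transport-means block to lie strictly inside the union bounding box of the isolation zone blocks:

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1)

def union(boxes: List[Box]) -> Box:
    """Smallest box covering all given boxes."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))

def encloses(zone_boxes: List[Box], transport_boxes: List[Box]) -> bool:
    """Proxy for claim 12's 'surrounds' test: every transport block must lie
    strictly inside the union bounding box of the isolation zone blocks."""
    if not zone_boxes or not transport_boxes:
        return False
    zx0, zy0, zx1, zy1 = union(zone_boxes)
    return all(zx0 < x0 and zy0 < y0 and x1 < zx1 and y1 < zy1
               for x0, y0, x1, y1 in transport_boxes)

zone = [(0, 0, 50, 10), (0, 90, 50, 100), (0, 0, 10, 100), (40, 0, 50, 100)]
truck = [(20, 40, 35, 60)]
if not encloses(zone, truck):
    print("second recognition result: isolation zone does not fully enclose")
else:
    print("isolation zone fully encloses the transport means")
```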
13. The method according to claim 11, wherein the preset object further includes a person feature, the recognition result further includes a third recognition result, and the performing behavior recognition on the image to be analyzed to obtain a recognition result further includes:
under the condition that the isolation zone feature exists in at least one image to be analyzed, identifying the isolation zone feature and a person feature in the at least one image to be analyzed, and determining whether a personnel violation behavior exists in the area to be identified;
and generating a third recognition result under the condition that the personnel violation behavior exists in the area to be identified; the third recognition result represents that the personnel violation behavior exists in the area to be identified.
14. The method of claim 13, wherein the preset objects further comprise a chemical protective clothing feature, a safety helmet feature, and a safety belt feature, and the personnel violation behavior comprises at least one of the following:
a person in the area enclosed by the isolation zone does not wear chemical protective clothing;
a person located in the area to be identified does not wear a safety helmet;
a person on the preset transport means does not wear a safety belt;
no operation supervision personnel are present in the area to be identified.
15. The method according to claim 10, wherein the preset object further includes a driving wheel feature and a wheel block feature, the recognition result includes a fourth recognition result, and the performing behavior recognition on the image to be analyzed to obtain a recognition result further includes:
identifying the driving wheel feature and the wheel block feature in each image to be analyzed, and determining whether a target image block containing the driving wheel feature is connected with a target image block containing the wheel block feature;
and generating a fourth recognition result under the condition that the target image block containing the driving wheel feature is determined not to be connected with the target image block containing the wheel block feature; the fourth recognition result represents that an abnormal operation behavior of a driving wheel without a wheel block exists in the area to be identified.
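The "connected" test of claim 15 can likewise be sketched as bounding-box adjacency: a driving-wheel block counts as chocked only if some wheel-block box touches or overlaps it. This reading is an assumption for illustration:

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1)

def touching(a: Box, b: Box) -> bool:
    """Boxes touch or overlap (no gap on either axis)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def wheel_unchocked(wheels: List[Box], chocks: List[Box]) -> bool:
    """Claim-15 style check: any driving wheel with no adjacent wheel block
    triggers the fourth recognition result."""
    return any(not any(touching(w, c) for c in chocks) for w in wheels)

wheels = [(10, 80, 30, 100), (60, 80, 80, 100)]
chocks = [(28, 95, 40, 105)]  # touches the first wheel only
if wheel_unchocked(wheels, chocks):
    print("fourth recognition result: a driving wheel without a wheel block")
```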
16. The method according to any one of claims 1 to 4, wherein the performing behavior recognition on the image to be analyzed to obtain a recognition result comprises:
identifying the preset object in the image to be analyzed to obtain the position information of the key point corresponding to the preset object in the image to be analyzed;
and performing behavior recognition on the image to be analyzed based on the position information of the key point by using the trained behavior recognition model to obtain a recognition result.
17. The method according to claim 16, wherein before the performing behavior recognition on the image to be analyzed based on the position information of the key points by using the trained behavior recognition model to obtain a recognition result, the method further comprises:
acquiring an abnormal operation behavior sample set, wherein each sample image in the abnormal operation behavior sample set is provided with a category label and a position label, the category label is used for representing the category of the abnormal operation behavior corresponding to the sample image, and the position label is used for representing the position information of the preset object in the sample image;
and training the behavior recognition model by using the abnormal operation behavior sample set to obtain the trained behavior recognition model.
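Claim 17 trains the behavior recognition model on samples that each carry a category label (the type of abnormal operation behavior) and a position label (keypoint positions of the preset object). The compressed PyTorch sketch below uses random stand-in data, and the small network that maps keypoint coordinates to a behavior class is an assumed architecture, since the claims do not fix one:

```python
import torch
import torch.nn as nn

NUM_KEYPOINTS, NUM_BEHAVIORS = 8, 5

# Stand-in sample set: flattened keypoint (x, y) positions (position labels)
# paired with an abnormal-behavior category index (category labels).
positions = torch.rand(64, NUM_KEYPOINTS * 2)
categories = torch.randint(0, NUM_BEHAVIORS, (64,))

model = nn.Sequential(                 # keypoint coordinates -> behavior class
    nn.Linear(NUM_KEYPOINTS * 2, 32), nn.ReLU(),
    nn.Linear(32, NUM_BEHAVIORS),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(positions), categories)
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```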
18. The method according to any one of claims 1 to 4, further comprising:
determining, based on the recognition result, whether an abnormal operation behavior exists in the area to be identified;
and generating and sending alarm information under the condition that the abnormal operation behavior exists in the area to be identified.
19. The method of claim 18, wherein generating and sending alarm information comprises:
determining the type of the abnormal operation behavior based on the recognition result;
and generating and sending alarm information based on the type of the abnormal operation behavior.
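Claims 18 and 19 route the recognition result to an alarm whose content depends on the type of abnormal operation behavior. A minimal dispatch sketch with illustrative alarm types and a print statement standing in for the real alert channel:

```python
ALARM_TEXT = {
    "no_isolation_zone": "Abnormal operation: no isolation zone deployed",
    "zone_not_enclosing": "Abnormal operation: isolation zone does not fully enclose the transport means",
    "person_violation": "Abnormal operation: personnel violation detected",
    "wheel_unchocked": "Abnormal operation: driving wheel without a wheel block",
}

def send_alarm(behavior_type: str) -> None:
    """Claim-19 style dispatch: pick the alarm by abnormal-behavior type."""
    message = ALARM_TEXT.get(behavior_type, "Abnormal operation: unclassified")
    print(f"[ALARM] {message}")  # stand-in for the real alert channel

send_alarm("wheel_unchocked")
```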
20. An image processing apparatus characterized by comprising:
the segmentation module is used for segmenting the acquired image frame by adopting grids of various sizes to obtain an image block set;
the first recognition module is used for carrying out object recognition on at least one image block in the image block set to obtain a target image block set; each target image block in the target image block set at least comprises a part of a preset object;
the splicing module is used for splicing at least one target image block based on the position information of each target image block in the image frame to obtain an image to be analyzed containing the preset object;
and the second recognition module is used for performing behavior recognition on the image to be analyzed to obtain a recognition result.
21. A computer device comprising a memory and a processor, the memory storing a computer program operable on the processor, the processor implementing the steps of the method of any one of claims 1 to 19 when executing the program.
22. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 19.
CN202110594742.1A 2021-05-28 2021-05-28 Image processing method, device, equipment and storage medium Pending CN115482458A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110594742.1A CN115482458A (en) 2021-05-28 2021-05-28 Image processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110594742.1A CN115482458A (en) 2021-05-28 2021-05-28 Image processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115482458A 2022-12-16

Family

ID=84419105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110594742.1A Pending CN115482458A (en) 2021-05-28 2021-05-28 Image processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115482458A (en)

Similar Documents

Publication Publication Date Title
CN106372662B (en) Detection method and device for wearing of safety helmet, camera and server
CN112216049B (en) Construction warning area monitoring and early warning system and method based on image recognition
CN111914819B (en) Multi-camera fusion crowd density prediction method and device, storage medium and terminal
CN109766779B (en) Loitering person identification method and related product
CN109409238B (en) Obstacle detection method and device and terminal equipment
CN111815577A (en) Method, device, equipment and storage medium for processing safety helmet wearing detection model
CN110659391A (en) Video detection method and device
CN116152863B (en) Personnel information identification method and device, electronic equipment and storage medium
CN116109954B (en) House potential safety hazard identification method, device and storage medium
CN110598596A (en) Dangerous behavior monitoring method and device and electronic equipment
CN110557628A (en) Method and device for detecting shielding of camera and electronic equipment
CN114241370A (en) Intrusion identification method and device based on digital twin transformer substation and computer equipment
CN114140745A (en) Method, system, device and medium for detecting personnel attributes of construction site
CN114885119A (en) Intelligent monitoring alarm system and method based on computer vision
CN113989858A (en) Work clothes identification method and system
CN113963162A (en) Helmet wearing identification method and device, computer equipment and storage medium
CN107301373B (en) Data processing method, device and storage medium
CN114005140A (en) Personnel identification method, device, equipment, pedestrian monitoring system and storage medium
CN112464765B (en) Safety helmet detection method based on single-pixel characteristic amplification and application thereof
CN113283296A (en) Helmet wearing detection method, electronic device and storage medium
CN116403162B (en) Airport scene target behavior recognition method and system and electronic equipment
CN112528825A (en) Station passenger recruitment service method based on image recognition
CN115482458A (en) Image processing method, device, equipment and storage medium
CN113076781A (en) Method, device and equipment for detecting falling and storage medium
CN114997279A (en) Construction worker dangerous area intrusion detection method based on improved Yolov5 model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination