CN114241363A - Process identification method, process identification device, electronic device, and storage medium - Google Patents

Process identification method, process identification device, electronic device, and storage medium

Info

Publication number
CN114241363A
Authority
CN
China
Prior art keywords
image
image frame
sample
process identification
state image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111436226.2A
Other languages
Chinese (zh)
Inventor
张梓良 (Zhang Ziliang)
沈飞 (Shen Fei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shengjing Intelligent Technology Jiaxing Co ltd
Original Assignee
Shengjing Intelligent Technology Jiaxing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shengjing Intelligent Technology Jiaxing Co ltd filed Critical Shengjing Intelligent Technology Jiaxing Co ltd
Priority to CN202111436226.2A priority Critical patent/CN114241363A/en
Publication of CN114241363A publication Critical patent/CN114241363A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4038 Image mosaicing, e.g. composing plane images from plane sub-images

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a process identification method, a process identification device, an electronic device and a storage medium. The method comprises the following steps: determining an image frame to be identified from an image frame set; extracting paired image frames from a target image frame set according to a preset time interval, and combining each paired image frame and the image frame to be identified into an image pair, where the target image frame set is composed of the image frames in the image frame set whose time is earlier than that of the image frame to be identified; and splicing each image pair with its difference image to obtain a time sequence state image, inputting the time sequence state image into a process identification model, and obtaining the process identification result output by the model. The method can accurately capture the key feature changes before and after a process whose duration is not fixed and fluctuates widely, and thus obtain an accurate process identification result.

Description

Process identification method, process identification device, electronic device, and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for identifying a process, an electronic device, and a storage medium.
Background
Processes and working procedures are the basic constituent units of modern industrial production. Across the whole manufacturing chain, from the production, welding, grinding, polishing and electroplating of parts to the molding, assembly and debugging of large finished products, the production cycle of a product necessarily comprises dozens or even hundreds of procedures. With the development of artificial intelligence applications, intelligent identification and real-time monitoring of production processes not only greatly help process engineers analyze procedure durations and equipment energy consumption, but also actively promote the standardization of manufacturing.
At present, two types of visual algorithms are mostly used to identify processes: one detects or otherwise extracts key features from an image and then classifies them; the other identifies processes with video classification or detection methods based on a video stream. However, both are suited to recognizing and analyzing events of short duration and small time fluctuation, such as video recognition of human actions and gestures or automatic marking of film titles and trailers, and their accuracy is low when facing process identification problems with large duration variation and complex visual flow.
Disclosure of Invention
The invention provides a process identification method, a process identification device, an electronic device and a storage medium, to overcome the low process identification accuracy of the prior art.
The invention provides a process identification method, which comprises the following steps:
determining an image frame to be identified from an image frame set;
extracting paired image frames from a target image frame set according to a preset time interval, and combining each paired image frame and the image frame to be identified into an image pair; the target image frame set is composed of the image frames in the image frame set whose time is earlier than that of the image frame to be identified;
splicing each image pair and the difference image of each image pair to obtain a time sequence state image, inputting the time sequence state image into a process identification model, and obtaining a process identification result output by the process identification model;
the process identification model is obtained by training based on sample time sequence state images and sample process class labels, where each sample time sequence state image is obtained by splicing a sample image pair with the sample difference image of that pair; the process identification model is used for performing attention calculation on the time sequence state image to obtain an attention mask, and for obtaining the process identification result based on the attention mask.
According to the process identification method provided by the present invention, inputting the time sequence state image into a process identification model and obtaining a process identification result output by the process identification model includes:
inputting the time sequence state image into a feature extraction layer of the process identification model, and performing feature extraction on each image pair by the feature extraction layer to obtain image pair features output by the feature extraction layer;
inputting the time-series state image into an attention layer of the process identification model, and performing attention calculation on a difference image of each image pair by the attention layer to obtain an attention mask output by the attention layer;
inputting the image pair features and the attention mask into a feature fusion layer of the process identification model to obtain fusion features output by the feature fusion layer;
and inputting the fusion features into a process identification layer of the process identification model to obtain the process identification result output by the process identification layer.
According to one aspect of the present invention, the process identification method further comprises, after obtaining the process identification result output by the process identification model, the following step:
if the process identification results of a preset number of consecutive image frames to be identified are all the target process, taking the average time of these consecutive image frames as the process time of the target process.
According to a process identification method provided by the invention, the sample time sequence state image comprises a positive sample time sequence state image; the positive sample timing state image is determined based on the steps of:
determining a front state image frame and a rear state image frame of a sample process node from a sample video;
determining a positive sample differential image based on the front state image frame and the back state image frame;
and splicing the front state image frame, the rear state image frame and the positive sample differential image to obtain the positive sample time sequence state image.
According to the process identification method provided by the invention, the image frame set is determined based on the following steps:
determining a video to be identified, and carrying out optical flow detection on each image frame in the video to be identified according to a preset image frame interval to obtain a change coefficient of each image frame;
adding a corresponding image frame to the set of image frames when the change coefficient is greater than a threshold.
According to a process identification method provided by the invention, the sample time sequence state image comprises a negative sample time sequence state image; the negative sample timing state image is determined based on the steps of:
filtering a front state image frame and a rear state image frame of a sample process node from a sample video to obtain a sample video frame set;
randomly extracting two images from the sample video frame set to serve as a first image frame and a second image frame;
determining a negative sample differential image based on the first image frame and the second image frame;
and splicing the first image frame, the second image frame and the negative sample differential image to obtain the negative sample time sequence state image.
According to the process identification method provided by the invention, the sample video comprises the image frames of the sample process nodes, or the sample video comprises the image frames of the sample process nodes and the image frames of non-sample process nodes.
The present invention also provides a process identifying apparatus, comprising:
the determining unit is used for determining an image frame to be identified from the image frame set;
the pairing unit is used for extracting paired image frames from the target image frame set according to a preset time interval and combining each paired image frame and the image frame to be identified into an image pair; the target image frame set is composed of the image frames in the image frame set whose time is earlier than that of the image frame to be identified;
the identification unit is used for splicing each image pair and the difference image of each image pair to obtain a time sequence state image, inputting the time sequence state image into a process identification model and obtaining a process identification result output by the process identification model;
the process identification model is obtained by training based on sample time sequence state images and sample process class labels, where each sample time sequence state image is obtained by splicing a sample image pair with the sample difference image of that pair; the process identification model is used for performing attention calculation on the time sequence state image to obtain an attention mask, and for obtaining the process identification result based on the attention mask.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the computer program to realize the steps of the process identification method.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the process identification method as described in any one of the above.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of the process identification method as described in any one of the above.
With the process identification method, device, electronic equipment and storage medium of the present invention, the attention calculation that the process identification model performs on the time sequence state image yields an attention mask representing the key feature changes of the image pair. This mask can serve as prior information for identifying the difference information between the images of each pair in the time sequence state image, so that the key feature changes before and after a process of unfixed and widely fluctuating duration are captured accurately and an accurate process identification result is obtained.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a process identification method provided by the present invention;
FIG. 2 is a second schematic flow chart of the process identification method of the present invention;
FIG. 3 is a schematic diagram of a positive sample timing state image acquisition method according to the present invention;
FIG. 4 is a schematic structural diagram of a process identification device provided in the present invention;
fig. 5 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
At present, two types of visual algorithms are mostly used to identify processes: one detects or otherwise extracts key features from an image and then classifies them; the other identifies processes with video classification or detection methods based on a video stream. However, both are suited to recognizing and analyzing events of short duration and small time fluctuation, such as video recognition of human actions and gestures or automatic marking of film titles and trailers, and their accuracy is low when facing process identification problems with large duration variation and complex visual flow.
In view of this, the present invention provides a process identification method. Fig. 1 is a schematic flow chart of a process identification method provided by the present invention, and as shown in fig. 1, the method includes the following steps:
step 110, determining an image frame to be identified from the image frame set.
Here, the image frame to be identified is an image whose process type is to be identified, and the image frame set is determined from the video to be identified such that adjacent frames in the set differ from each other. For example, optical flow detection may be performed on the image frames of the video to be identified; when the change coefficient of an image frame exceeds a threshold, the frame differs from the previous frame and may be added to the image frame set.
It can be understood that, after the image frame to be identified is determined, noise reduction may be applied to it, so that noise does not interfere with the subsequent process identification.
Step 120, extracting paired image frames from the target image frame set according to a preset time interval, and combining each paired image frame and the image frame to be identified into an image pair; the target image frame set is composed of the image frames in the image frame set whose time is earlier than that of the image frame to be identified.
Specifically, after the image frame to be identified is determined, it can be regarded as the post-state image of the process. Taking the time of the image frame to be identified as a reference, the target image frame set, consisting of the frames in the image frame set earlier than the image frame to be identified, is selected. Paired image frames are then extracted from the target image frame set according to a preset time interval, and each paired image frame is combined with the image frame to be identified into an image pair.
Step 130, splicing each image pair and the difference image of each image pair to obtain a time sequence state image, inputting the time sequence state image into the process identification model, and obtaining a process identification result output by the process identification model;
the process identification model is obtained based on sample time sequence state images and sample process class label training, and the sample time sequence state images are obtained by splicing sample image pairs and sample difference images of the sample image pairs; the process identification model is used for performing attention calculation on the time-series state image to obtain an attention mask, and obtaining the process identification result based on the attention mask.
Specifically, after each image pair is obtained, the image frame to be identified in the pair may be regarded as the post-state image of the process and the other image frame as the pre-state image; a differential image between the pre-state image and the post-state image, that is, the differential image of the image pair, is then computed using a first-order difference. The pre-state image is the image corresponding to the state before the process, and the post-state image is the image corresponding to the state after it.
After the difference image of each image pair is determined, the image pair and its difference image can be spliced in the depth direction to obtain a 9-channel time sequence state image, which is then input into the process identification model for attention calculation. Because the time sequence state image contains the difference image, and the difference image represents the difference information before and after the process, the attention calculation can focus on the changes of key features before and after the process, yielding an attention mask that represents those key feature changes; the process identification result is then obtained based on the attention mask.
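The depth-direction splicing described above can be sketched as follows (a minimal numpy illustration, not the patented implementation; frames are assumed to be H x W x 3 uint8 RGB arrays of equal size, and the function name is hypothetical):

```python
import numpy as np

def build_state_image(pre_frame, post_frame):
    """Splice an image pair and its first-order difference image into a
    9-channel time sequence state image (sketch; both frames are H x W x 3)."""
    # First-order difference: per-pixel change between pre- and post-state images.
    diff = np.abs(post_frame.astype(np.int16) - pre_frame.astype(np.int16)).astype(np.uint8)
    # Splice along the depth (channel) direction: 3 + 3 + 3 = 9 channels.
    return np.concatenate([pre_frame, post_frame, diff], axis=-1)

pre = np.zeros((4, 4, 3), dtype=np.uint8)
post = np.full((4, 4, 3), 10, dtype=np.uint8)
state = build_state_image(pre, post)
print(state.shape)  # (4, 4, 9)
```

The first three channels hold the pre-state image, the next three the post-state image, and the last three the difference image, so the downstream model can separate the pair from its difference by channel index alone.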
Therefore, by performing attention calculation on the time sequence state image, the process identification model obtains an attention mask representing the key feature changes before and after the process, accurately captures those changes even for processes of unfixed and widely fluctuating duration, and accurately identifies the process category. In addition, because the model takes the time sequence state image rather than raw video data as input, the model size can be effectively reduced, its identification speed increased, and the resources occupied by model deployment reduced.
Optionally, assume the image frame set is a first-in-first-out image queue Q with capacity C. The set is built as follows: one image sample is captured every s frames from the video to be identified and submitted to optical flow detection; when its change relative to the previous image frame exceeds a threshold, it is put into Q. Denote the image frame to be identified as I(t). When I(t) is the C-th sample in Q, I(t) is taken as the post-state image, and image frames in Q are selected at an interval g and combined with I(t) into image pairs, such as [I(t - g*s), I(t)], [I(t - 2*g*s), I(t)], and so on, giving k = C/g image pairs. The 9-channel time sequence state image corresponding to each image pair is then obtained and input into the process identification model, which determines the probability of each process category for the image frame to be identified; the category with the maximum probability is taken as the process identification result of the image frame to be identified, and the corresponding confidence is recorded.
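The queue-and-pairing scheme can be illustrated as follows (a hedged sketch: the queue is assumed to already hold the filtered frames, and the function name is hypothetical; with a full queue of capacity C and stride g this yields on the order of C/g pairs, as in the example above):

```python
from collections import deque

def extract_image_pairs(queue, g):
    """Pair the newest frame I(t) (the post-state image) with earlier frames
    in the FIFO queue at stride g: [I(t-g*s), I(t)], [I(t-2*g*s), I(t)], ..."""
    frames = list(queue)
    current = frames[-1]                  # I(t), the frame to be identified
    pairs = []
    idx = len(frames) - 1 - g
    while idx >= 0:
        pairs.append((frames[idx], current))
        idx -= g
    return pairs

# Capacity C = 8; frames are labelled by arrival order; interval g = 2.
Q = deque(range(8), maxlen=8)
print(extract_image_pairs(Q, g=2))  # [(5, 7), (3, 7), (1, 7)]
```

Using `deque(maxlen=C)` gives the first-in-first-out behaviour of Q for free: once the queue is full, appending a new filtered frame silently evicts the oldest one.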
Before the time sequence state image is input into the process identification model, the model can be obtained by pre-training as follows: first, a large number of sample image pairs and their sample difference images are collected and spliced to obtain sample time sequence state images, and the corresponding sample process class labels are determined by manual annotation. Then, an initial model is trained on the sample time sequence state images and the sample process class labels to obtain the process identification model.
According to the process identification method provided by the embodiment of the invention, performing attention calculation on the time sequence state image with the process identification model yields an attention mask representing the key feature changes of the image pair. The mask then serves as prior information for identifying the difference information between the images of each pair in the time sequence state image, so that the key feature changes before and after a process of unfixed and widely fluctuating duration are captured accurately and an accurate process identification result is obtained.
Based on the above embodiment, the inputting the time-series state image into the process identification model to obtain the process identification result output by the process identification model includes:
inputting the time sequence state image into a feature extraction layer of the process identification model, and performing feature extraction on each image pair by the feature extraction layer to obtain image pair features output by the feature extraction layer;
inputting the time sequence state image into an attention layer of the process identification model, and performing attention calculation on a difference image of each image pair by the attention layer to obtain an attention mask output by the attention layer;
inputting the image pair characteristics and the attention mask into a characteristic fusion layer of the process identification model to obtain fusion characteristics output by the characteristic fusion layer;
and inputting the fusion characteristics into a process identification layer of the process identification model to obtain a process identification result output by the process identification layer.
Specifically, since the time sequence state image is obtained by splicing each image pair with its difference image, when it is input into the feature extraction layer, that layer extracts features from each image pair to obtain the image pair features, while the attention layer performs attention calculation on the difference image of each image pair to obtain an attention mask representing the key feature changes of the pair.
Then, the image pair features and the attention mask are input into the feature fusion layer of the process identification model, which fuses them into fusion features, so that the process identification layer can produce a process identification result based on the fusion features.
As shown in fig. 2, the image pair (IA_pre, IA_post) is first-order differenced to obtain a difference image IA_diff; the feature extraction layer extracts an image pair feature F from the image pair; the attention layer performs attention calculation on the difference image to obtain an attention Mask; the feature fusion layer fuses F with the Mask to obtain a fusion feature; and the process identification layer performs identification based on the fusion feature to obtain the process identification result.
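A toy numpy forward pass mirroring the four layers of fig. 2 might look like this (illustrative assumptions throughout, not the patented model: the real layers would be learned convolutional networks, whereas here the feature extractor is an identity map, the attention mask is a normalized channel sum of the difference image, and the classifier weights are random):

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(state_image):
    """Sketch of the four-layer pipeline on one 9-channel H x W x 9 input."""
    pair, diff = state_image[..., :6], state_image[..., 6:]
    # Feature extraction layer: per-pixel features of the image pair
    # (identity here; a CNN in practice).
    feats = pair.astype(np.float64) / 255.0                # H x W x 6
    # Attention layer: spatial mask from the difference image, scaled to [0, 1]
    # so regions of large change receive large weights.
    mag = diff.astype(np.float64).sum(axis=-1)
    mask = mag / (mag.max() + 1e-9)                        # H x W attention mask
    # Feature fusion layer: re-weight features by the mask and pool spatially.
    fused = (feats * mask[..., None]).mean(axis=(0, 1))    # shape (6,)
    # Process identification layer: linear classifier + softmax over k = 3 classes.
    W = rng.normal(size=(3, fused.size))
    logits = W @ fused
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(probs.argmax()), probs

state = rng.integers(0, 255, size=(8, 8, 9)).astype(np.uint8)
pred, probs = forward(state)
print(pred, probs.round(3))
```

The key point the sketch preserves is the data flow of fig. 2: the pair and the difference image travel through separate branches, and the difference-derived mask gates the pair features before classification.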
Based on any embodiment, after obtaining the process identification result output by the process identification model, the method further comprises:
and if the process identification results of a preset number of continuous image frames to be identified are all target processes, taking the average time of all the continuous image frames to be identified as the process time of the target processes.
Specifically, the same feature information may appear in certain image frames of different processes, that is, different processes may overlap, which can cause false detections and inaccurate predicted times. In the embodiment of the present invention, therefore, when the process identification results of a preset number of consecutive image frames to be identified are all the target process, that is, when the consecutive results agree, the confidence of the result is considered high: the target process is taken as the final process identification result, and the average time of these consecutive image frames is taken as the process time of the target process.
For example, when m consecutive image frames are predicted as type pred_i, the voting mechanism takes the average of the m times, i.e. (t + (t + s) + (t + 2s) + ... + (t + (m - 1)s)) / m, as the process time of the identified pred_i process.
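The voting rule can be written out as a small helper (hypothetical function; `t` and `s` are assumed to be the time of the first of the m frames and the sampling period, as above):

```python
def process_time(results, t, s, target):
    """Return the average timestamp (t + (t+s) + ... + (t+(m-1)s)) / m when
    all m consecutive identification results equal `target`, else None."""
    if any(r != target for r in results):
        return None           # the run is not unanimous; no process time yet
    m = len(results)
    return sum(t + i * s for i in range(m)) / m

# m = 5 frames starting at t = 100 s, one every s = 2 s, all predicted class 3.
print(process_time([3, 3, 3, 3, 3], t=100, s=2, target=3))  # 104.0
```

Note the average collapses to t + (m - 1) * s / 2, i.e. the midpoint of the run, which is why it is a reasonable single timestamp for the process.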
In any of the above embodiments, the sample timing state image comprises a positive sample timing state image; the positive sample timing state image is determined based on the following steps:
determining a front state image frame and a rear state image frame of a sample process node from a sample video;
determining a positive sample differential image based on the front state image frame and the rear state image frame;
and splicing the front state image frame, the rear state image frame and the positive sample differential image to obtain a positive sample time sequence state image.
Specifically, as shown in fig. 3, assuming there are k processes to be identified, a front state image frame IA_pre1 and a rear state image frame IA_post1 of a process node A to be identified are captured from the sample video, a positive sample difference image IA_diff1 of the two frames is obtained using a first-order difference, and the three images are spliced in the depth direction to obtain a 9-channel positive sample time sequence state image IA1 = [IA_pre1, IA_post1, IA_diff1]. The front state image frame is an image before the sample process node, and the rear state image frame is an image after it.
In any of the above embodiments, the sample timing state image comprises a negative sample timing state image; the negative sample time series state image is determined based on the following steps:
filtering a front state image frame and a rear state image frame of a sample process node from a sample video to obtain a sample video frame set;
randomly extracting two images from a sample video frame set to serve as a first image frame and a second image frame;
determining a negative sample difference image based on the first image frame and the second image frame;
and splicing the first image frame, the second image frame and the negative sample differential image to obtain a negative sample time sequence state image.
Specifically, the front state image frames and rear state image frames of sample process nodes are filtered out of the sample video to obtain a sample video frame set; two images are then randomly extracted from this set as the first image frame and the second image frame, and their difference image is taken as the negative sample difference image. The first image frame, the second image frame and the negative sample difference image, each a 3-channel RGB image, are spliced along the depth direction, so the resulting negative sample time sequence state image has 9 channels.
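Negative-sample construction can be sketched as follows (illustrative only; the function name and the `node_indices` parameter, marking which frames belong to sample process nodes, are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

def make_negative_sample(frames, node_indices):
    """Build a negative 9-channel sample: drop the front/rear state frames of
    known process nodes, randomly draw two of the remaining frames, and splice
    them with their first-order difference image."""
    candidates = [f for i, f in enumerate(frames) if i not in node_indices]
    i, j = sorted(rng.choice(len(candidates), size=2, replace=False))
    first, second = candidates[i], candidates[j]
    diff = np.abs(second.astype(np.int16) - first.astype(np.int16)).astype(np.uint8)
    return np.concatenate([first, second, diff], axis=-1)

frames = [np.full((2, 2, 3), 10 * v, dtype=np.uint8) for v in range(6)]
neg = make_negative_sample(frames, node_indices={1, 4})
print(neg.shape)  # (2, 2, 9)
```

Because the two frames are drawn from outside any process node, the resulting 9-channel image teaches the model what "no process boundary" looks like, complementing the positive samples of fig. 3.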
in any of the above embodiments, the sample video includes image frames of sample process nodes, or the sample video includes image frames of sample process nodes and image frames of non-sample process nodes.
Specifically, when the sample video contains image frames of the sample process nodes, the positive sample time sequence state image is obtained by extracting a front state image frame and a rear state image frame from those frames, and the negative sample time sequence state image is obtained by filtering out the front and rear state image frames and then randomly extracting two of the remaining frames.
The sample video may also contain both image frames of sample process nodes and image frames of non-sample process nodes. In that case the positive sample time sequence state image is still obtained by extracting a front state image frame and a rear state image frame from the frames of the sample process nodes, while the negative sample time sequence state image can be obtained either by filtering out the front and rear state image frames of the sample process nodes and randomly extracting two of the remaining frames, or by randomly extracting two image frames of the non-sample process nodes. Image frames of non-sample process nodes are frames that do not require process identification.
According to any of the above embodiments, the image frame set is determined based on the following steps:
determining a video to be identified, and carrying out optical flow detection on each image frame in the video to be identified according to a preset image frame interval to obtain a change coefficient of each image frame;
when the change coefficient is greater than the threshold, the corresponding image frame is added to the set of image frames.
Specifically, the video to be identified is a video on which process identification is to be performed. Because the video to be identified contains many image frames and adjacent frames are highly similar, identifying every frame would waste computing resources. Therefore, in the embodiment of the present invention, optical flow detection is performed on the image frames of the video to be identified at the preset image frame interval to obtain a change coefficient for each image frame. When the change coefficient is greater than the threshold, the corresponding image frame differs significantly from the previous image frame and is usable for process identification, so it is added to the image frame set.
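A minimal sketch of this frame-selection step. To keep the example self-contained it uses mean absolute frame difference as a simplified stand-in for the patent's optical flow detection; the interval, threshold value, and coefficient definition are illustrative assumptions:

```python
import numpy as np

def change_coefficient(prev_frame, frame):
    """Simplified change coefficient: mean absolute pixel difference,
    normalized to [0, 1]. A real implementation would instead derive the
    coefficient from dense optical flow, as described above."""
    d = np.abs(frame.astype(np.float32) - prev_frame.astype(np.float32))
    return float(d.mean() / 255.0)

def select_frames(frames, interval=5, threshold=0.05):
    """Scan the video at a preset frame interval; keep the index of each
    frame whose change coefficient relative to the previously sampled
    frame exceeds the threshold."""
    selected = []
    prev = frames[0]
    for i in range(interval, len(frames), interval):
        if change_coefficient(prev, frames[i]) > threshold:
            selected.append(i)
        prev = frames[i]
    return selected

# Example: a static scene with one abrupt change at frame 20.
video = [np.zeros((32, 32, 3), dtype=np.uint8) for _ in range(40)]
for i in range(20, 40):
    video[i] = np.full((32, 32, 3), 200, dtype=np.uint8)
print(select_frames(video, interval=5, threshold=0.05))  # [20]
```

Only frame 20 crosses the threshold, so only it enters the image frame set; the redundant near-duplicate frames are skipped, which is the resource saving the embodiment describes.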
The process identification device provided by the present invention is described below; the process identification device described below and the process identification method described above correspond to each other and may be cross-referenced.
Based on any of the above embodiments, the present invention further provides a process identification apparatus, as shown in fig. 4, the apparatus comprising:
a determining unit 410 for determining an image frame to be identified from the set of image frames;
the pairing unit 420 is configured to extract paired image frames from a target image frame set according to a preset time interval, and to combine each paired image frame with the image frame to be identified into an image pair; the target image frame set consists of the image frames in the image frame set whose timestamps precede that of the image frame to be identified;
an identifying unit 430, configured to splice each image pair and the difference image of each image pair to obtain a time-series state image, input the time-series state image to a process identification model, and obtain a process identification result output by the process identification model;
the process identification model is obtained by training based on a sample time sequence state image and a sample process class label, wherein the sample time sequence state image is obtained by splicing a sample image pair and a sample difference image of the sample image pair.
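The pairing and splicing performed by units 410-430 can be sketched as follows. The index-based time interval, the number of pairs, and the absolute-difference image are illustrative assumptions:

```python
import numpy as np

def extract_pairs(frame_set, target_idx, time_interval=2, num_pairs=3):
    """Pair the frame to be identified with earlier frames taken at a
    preset interval, mirroring the pairing unit: only frames preceding
    the target frame are eligible."""
    pairs = []
    for k in range(1, num_pairs + 1):
        j = target_idx - k * time_interval
        if j < 0:
            break
        pairs.append((frame_set[j], frame_set[target_idx]))
    return pairs

def splice_pairs(pairs):
    """Splice each image pair with its difference image (9 channels per
    pair), then stack all pairs along the channel axis to form the time
    sequence state image fed to the process identification model."""
    blocks = []
    for a, b in pairs:
        diff = np.abs(a.astype(np.int16) - b.astype(np.int16)).astype(np.uint8)
        blocks.append(np.concatenate([a, b, diff], axis=-1))
    return np.concatenate(blocks, axis=-1)

frames = [np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8) for _ in range(10)]
pairs = extract_pairs(frames, target_idx=9, time_interval=2, num_pairs=3)
state = splice_pairs(pairs)
print(len(pairs), state.shape)  # 3 (32, 32, 27)
```

With three pairs, the model input has 27 channels; when the target frame sits near the start of the set, fewer pairs (and channels) are produced.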
Based on any of the above embodiments, the identifying unit 430 includes:
a feature extraction unit configured to input the time-series state image into a feature extraction layer of the process identification model, and perform feature extraction on each image pair by the feature extraction layer to obtain an image pair feature output by the feature extraction layer;
an attention unit configured to input the time-series state image to an attention layer of the process recognition model, and perform attention calculation on a difference image of each image pair by the attention layer to obtain an attention mask output by the attention layer;
a fusion unit configured to input the image pair features and the attention mask to a feature fusion layer of the process identification model to obtain fusion features output by the feature fusion layer;
and the process identification unit is used for inputting the fusion characteristics to a process identification layer of the process identification model and obtaining the process identification result output by the process identification layer.
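The four-layer stack described by the identifying unit (feature extraction, attention over difference images, feature fusion, process identification) can be sketched in NumPy. The per-channel global averages, the sigmoid attention mask, the elementwise fusion, and the single linear identification layer below are all illustrative assumptions, not the patent's actual network:

```python
import numpy as np

def extract_features(pair_channels):
    # Stand-in feature extraction layer: one global-average value
    # per image-pair channel.
    return pair_channels.reshape(-1, pair_channels.shape[-1]).mean(axis=0)

def attention_mask(diff_channels):
    # Stand-in attention layer: sigmoid of the mean difference response,
    # yielding one attention weight per difference channel.
    m = diff_channels.reshape(-1, diff_channels.shape[-1]).mean(axis=0)
    return 1.0 / (1.0 + np.exp(-(m / 255.0 - 0.5)))

def identify(state_image, class_weights):
    """Split a 9-channel time sequence state image into its image-pair
    channels (0..5) and difference channels (6..8), weight the pair
    features by the attention mask (feature fusion layer), and classify
    with a linear process identification layer."""
    feats = extract_features(state_image[..., :6])   # 6 pair features
    mask = attention_mask(state_image[..., 6:])      # 3 attention weights
    fused = feats * np.repeat(mask, 2)               # feature fusion
    logits = class_weights @ fused                   # identification layer
    return int(np.argmax(logits))

state = np.random.randint(0, 256, (32, 32, 9)).astype(np.float32)
W = np.random.randn(4, 6)  # 4 hypothetical process classes
print(identify(state, W))
```

The essential idea survives the simplification: motion-heavy regions of the difference image up-weight the corresponding pair features before classification.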
Based on any of the above embodiments, the apparatus further includes:
a process time determining unit configured to, after the process identification result output by the process identification model is obtained, take the average time of a preset number of consecutive image frames to be identified as the process time of a target process when the process identification results of those consecutive image frames are all the target process.
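A sketch of this process-time rule: if a preset number of consecutive frames all yield the same target process, the process time is the average of their timestamps. The window size of 3 and the tuple representation are assumptions:

```python
def process_time(results, window=3):
    """results: list of (timestamp_seconds, process_label) in frame order.
    Return (label, average_time) for the first run of `window` consecutive
    frames sharing one label, or None if no such run exists."""
    for i in range(len(results) - window + 1):
        run = results[i:i + window]
        labels = {label for _, label in run}
        if len(labels) == 1:  # all results in the run agree
            avg = sum(t for t, _ in run) / window
            return run[0][1], avg
    return None

frames = [(10.0, "welding"), (10.5, "welding"), (11.0, "welding"), (11.5, "idle")]
print(process_time(frames))  # ('welding', 10.5)
```

Requiring agreement over a window, rather than trusting a single frame, suppresses one-off misclassifications before a process time is committed.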
In any of the above embodiments, the sample timing state image comprises a positive sample timing state image; the device further comprises:
the first state image frame determining unit is used for determining a front state image frame and a rear state image frame of a sample process node from a sample video;
a positive sample difference image unit for determining a positive sample difference image based on the front state image frame and the rear state image frame;
and the positive sample time sequence state determining unit is used for splicing the front state image frame, the rear state image frame and the positive sample differential image to obtain the positive sample time sequence state image.
According to any of the above embodiments, the sample timing state image comprises a negative sample timing state image; the device further comprises:
the filtering unit is used for filtering a front state image frame and a rear state image frame of a sample process node from a sample video to obtain a sample video frame set;
a second state image frame determining unit, configured to randomly extract two images from the sample video frame set as a first image frame and a second image frame;
a negative sample difference image determining unit for determining a negative sample difference image based on the first image frame and the second image frame;
and the negative sample time sequence state image determining unit is used for splicing the first image frame, the second image frame and the negative sample difference image to obtain the negative sample time sequence state image.
In any of the above embodiments, the sample video includes image frames of the sample process node, or the sample video includes image frames of the sample process node and image frames of non-sample process nodes.
Based on any embodiment above, the apparatus further comprises:
the device comprises a change coefficient determining unit, a processing unit and a processing unit, wherein the change coefficient determining unit is used for determining a video to be identified and carrying out optical flow detection on each image frame in the video to be identified according to a preset image frame interval to obtain a change coefficient of each image frame;
an image frame set determining unit for adding a corresponding image frame to the image frame set when the variation coefficient is greater than a threshold value.
Fig. 5 is a schematic structural diagram of an electronic device provided by the present invention. As shown in fig. 5, the electronic device may include: a processor (processor) 510, a communication interface (Communications Interface) 520, a memory (memory) 530 and a communication bus 540, wherein the processor 510, the communication interface 520 and the memory 530 communicate with each other via the communication bus 540. The processor 510 may call logic instructions in the memory 530 to perform a process identification method comprising: determining an image frame to be identified from an image frame set; extracting paired image frames from a target image frame set according to a preset time interval, and combining each paired image frame with the image frame to be identified into an image pair, the target image frame set consisting of the image frames in the image frame set whose timestamps precede that of the image frame to be identified; splicing each image pair and the difference image of each image pair to obtain a time sequence state image, inputting the time sequence state image into a process identification model, and obtaining a process identification result output by the process identification model; the process identification model is obtained by training based on a sample time sequence state image and a sample process class label, wherein the sample time sequence state image is obtained by splicing a sample image pair and a sample difference image of the sample image pair; the process identification model is used for performing attention calculation on the time sequence state image to obtain an attention mask, and obtaining the process identification result based on the attention mask.
Furthermore, the logic instructions in the memory 530 may be implemented in the form of software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the process identification method provided above, the method comprising: determining an image frame to be identified from an image frame set; extracting paired image frames from a target image frame set according to a preset time interval, and combining each paired image frame with the image frame to be identified into an image pair, the target image frame set consisting of the image frames in the image frame set whose timestamps precede that of the image frame to be identified; splicing each image pair and the difference image of each image pair to obtain a time sequence state image, inputting the time sequence state image into a process identification model, and obtaining a process identification result output by the process identification model; the process identification model is obtained by training based on a sample time sequence state image and a sample process class label, wherein the sample time sequence state image is obtained by splicing a sample image pair and a sample difference image of the sample image pair; the process identification model is used for performing attention calculation on the time sequence state image to obtain an attention mask, and obtaining the process identification result based on the attention mask.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the process identification method provided above, the method comprising: determining an image frame to be identified from an image frame set; extracting paired image frames from a target image frame set according to a preset time interval, and combining each paired image frame with the image frame to be identified into an image pair, the target image frame set consisting of the image frames in the image frame set whose timestamps precede that of the image frame to be identified; splicing each image pair and the difference image of each image pair to obtain a time sequence state image, inputting the time sequence state image into a process identification model, and obtaining a process identification result output by the process identification model; the process identification model is obtained by training based on a sample time sequence state image and a sample process class label, wherein the sample time sequence state image is obtained by splicing a sample image pair and a sample difference image of the sample image pair; the process identification model is used for performing attention calculation on the time sequence state image to obtain an attention mask, and obtaining the process identification result based on the attention mask.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A process identification method, comprising:
determining an image frame to be identified from an image frame set;
extracting paired image frames from a target image frame set according to a preset time interval, and combining each paired image frame with the image frame to be identified into an image pair; wherein the target image frame set consists of the image frames in the image frame set whose timestamps precede that of the image frame to be identified;
splicing each image pair and the difference image of each image pair to obtain a time sequence state image, inputting the time sequence state image into a process identification model, and obtaining a process identification result output by the process identification model;
the process identification model is obtained by training based on a sample time sequence state image and a sample process class label, wherein the sample time sequence state image is obtained by splicing a sample image pair and a sample difference image of the sample image pair; the process identification model is used for performing attention calculation on the time sequence state image to obtain an attention mask, and obtaining the process identification result based on the attention mask.
2. The process identifying method according to claim 1, wherein the step of inputting the time-series state image into a process identifying model and obtaining a process identifying result output by the process identifying model comprises:
inputting the time sequence state image into a feature extraction layer of the process identification model, and performing feature extraction on each image pair by the feature extraction layer to obtain image pair features output by the feature extraction layer;
inputting the time-series state image into an attention layer of the process identification model, and performing attention calculation on a difference image of each image pair by the attention layer to obtain an attention mask output by the attention layer;
inputting the image pair features and the attention mask into a feature fusion layer of the process identification model to obtain fusion features output by the feature fusion layer;
and inputting the fusion features into a process identification layer of the process identification model to obtain the process identification result output by the process identification layer.
3. The process identifying method according to claim 1, further comprising, after obtaining a process identification result output by the process identification model:
and if the process identification results of a preset number of consecutive image frames to be identified are all the target process, taking the average time of the consecutive image frames to be identified as the process time of the target process.
4. The process identifying method according to claim 1, wherein the sample time series state image includes a positive sample time series state image; the positive sample timing state image is determined based on the steps of:
determining a front state image frame and a rear state image frame of a sample process node from a sample video;
determining a positive sample differential image based on the front state image frame and the back state image frame;
and splicing the front state image frame, the rear state image frame and the positive sample differential image to obtain the positive sample time sequence state image.
5. The process identification method according to claim 1, wherein the set of image frames is determined based on the steps of:
determining a video to be identified, and carrying out optical flow detection on each image frame in the video to be identified according to a preset image frame interval to obtain a change coefficient of each image frame;
adding a corresponding image frame to the set of image frames when the change coefficient is greater than a threshold.
6. The process identifying method according to any one of claims 1 to 5, wherein the sample time-series state image includes a negative sample time-series state image; the negative sample timing state image is determined based on the steps of:
filtering a front state image frame and a rear state image frame of a sample process node from a sample video to obtain a sample video frame set;
randomly extracting two images from the sample video frame set to serve as a first image frame and a second image frame;
determining a negative sample differential image based on the first image frame and the second image frame;
and splicing the first image frame, the second image frame and the negative sample differential image to obtain the negative sample time sequence state image.
7. The process identifying method according to claim 6, wherein the sample video includes image frames of the sample process node, or the sample video includes image frames of the sample process node and image frames of a non-sample process node.
8. A process identification device, comprising:
the determining unit is used for determining an image frame to be identified from the image frame set;
the pairing unit is used for extracting paired image frames from a target image frame set according to a preset time interval and combining each paired image frame with the image frame to be identified into an image pair; the target image frame set consists of the image frames in the image frame set whose timestamps precede that of the image frame to be identified;
the identification unit is used for splicing each image pair and the difference image of each image pair to obtain a time sequence state image, inputting the time sequence state image into a process identification model and obtaining a process identification result output by the process identification model;
the process identification model is obtained by training based on a sample time sequence state image and a sample process class label, wherein the sample time sequence state image is obtained by splicing a sample image pair and a sample difference image of the sample image pair; the process identification model is used for performing attention calculation on the time sequence state image to obtain an attention mask, and obtaining the process identification result based on the attention mask.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the process identification method according to any one of claims 1 to 7 are implemented when the program is executed by the processor.
10. A non-transitory computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the process identification method according to any one of claims 1 to 7.
CN202111436226.2A 2021-11-29 2021-11-29 Process identification method, process identification device, electronic device, and storage medium Pending CN114241363A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111436226.2A CN114241363A (en) 2021-11-29 2021-11-29 Process identification method, process identification device, electronic device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111436226.2A CN114241363A (en) 2021-11-29 2021-11-29 Process identification method, process identification device, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
CN114241363A true CN114241363A (en) 2022-03-25

Family

ID=80751924

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111436226.2A Pending CN114241363A (en) 2021-11-29 2021-11-29 Process identification method, process identification device, electronic device, and storage medium

Country Status (1)

Country Link
CN (1) CN114241363A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116587043A (en) * 2023-07-18 2023-08-15 太仓德纳森机电工程有限公司 Workpiece conveying system for industrial automatic production and processing
CN116587043B (en) * 2023-07-18 2023-09-15 太仓德纳森机电工程有限公司 Workpiece conveying system for industrial automatic production and processing


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination