CN117789040A - Tea bud leaf posture detection method under disturbance state

Tea bud leaf posture detection method under disturbance state

Info

Publication number
CN117789040A
CN117789040A (application CN202410217204.4A)
Authority
CN
China
Prior art keywords
tea
target
leaf
model
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410217204.4A
Other languages
Chinese (zh)
Other versions
CN117789040B (en)
Inventor
吴伟斌
陈天赐
吕金洪
李浩欣
曾治亨
韩重阳
唐婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Agricultural University
Original Assignee
South China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Agricultural University filed Critical South China Agricultural University
Priority to CN202410217204.4A (granted as CN117789040B)
Publication of CN117789040A
Application granted
Publication of CN117789040B
Legal status: Active
Anticipated expiration

Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/30 - Computing systems specially adapted for manufacturing

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a tea bud leaf posture detection method under a disturbance state. The method comprises collecting an interference video data set, the interference video data set containing disturbed tea buds and leaves; inputting the interference video data set into a target detection model for target detection to obtain tea bud leaf target information; inputting the tea bud leaf target information into a target tracking model and tracking the tea buds and leaves to obtain a detection frame set; inputting the detection frame set into a key point detection model and extracting the key points in the detection frame set to obtain a key point set; and detecting the posture of the tea buds and leaves in the disturbance state based on the key point set and a continuous space-time sequence. Under the conditions of temporal and spatial continuity, the key points of the same tea bud leaf at different moments can be extracted from the key point set; the changes of these key points reflect the changes in the tea bud leaf's posture, and therefore how it was disturbed, so the posture of the tea bud leaf in the disturbance state is detected accurately.

Description

Tea bud leaf posture detection method under disturbance state
Technical Field
The invention relates to the technical field of tea bud leaf detection, in particular to a tea bud leaf posture detection method under a disturbance state.
Background
Famous tea has high nutritional value and is currently picked mainly by hand. Against the background of an increasingly scarce labor force, developing tea-picking robots to replace manual picking is of great significance.
Famous tea is mainly made from one bud and one leaf. Tea buds and leaves are light and easily disturbed in the natural environment, for example by wind or by collisions between adjacent buds and leaves when nearby tea buds and leaves are picked. Disturbed tea buds and leaves swing, which affects the posture determination and picking-point positioning of a tea-picking robot and therefore poses a great challenge.
Therefore, a method is needed that can accurately detect the posture of tea buds and leaves when they are disturbed.
Disclosure of Invention
In order to overcome the problems in the related art, the invention aims to provide a tea bud leaf posture detection method under a disturbance state, which can accurately detect the tea bud leaf posture when the tea buds and leaves are disturbed.
A tea bud leaf posture detection method under a disturbance state comprises the following steps:
collecting an interference video data set, wherein the interference video data set comprises disturbed tea buds and leaves;
inputting the interference video data set into a target detection model to carry out target detection to obtain tea bud and leaf target information;
inputting the tea bud and leaf target information into a target tracking model, and tracking the tea bud and leaf to obtain a detection frame set;
inputting the detection frame set into a key point detection model, and extracting key points in the detection frame set to obtain a key point set;
and detecting the posture of the tea bud leaf in the disturbance state based on the key point set and the continuous space-time sequence.
In a preferred technical scheme of the invention, the detecting the posture of the tea bud leaf in the disturbance state based on the key point set and the continuous space-time sequence comprises the following steps:
screening out target key points conforming to continuous space-time sequences from the key point set; the continuous space-time sequence is a sequence which is continuous in time and continuous in space;
connecting adjacent target key points by using a first type of connecting line, and restoring the appearance of the tea bud leaves;
connecting the same target key points of adjacent image frames in the interference video by using a second type of connecting line to construct a tea bud leaf time-space diagram; the interference video is a video in the interference video data set;
and capturing the posture change of the tea bud leaves in a disturbance state by combining the appearance of the tea bud leaves and the tea bud leaf time-space diagram.
In a preferred technical scheme of the invention, the step of inputting the tea bud and leaf target information into a target tracking model and tracking the tea bud and leaf to obtain a detection frame set comprises the following steps:
splitting the interference video in the interference video data set according to frames to obtain multi-frame disturbance images;
inputting the initial position and the category of the tea bud leaves into a target tracking model, and continuously tracking the disturbance images of a first frame number to obtain a detection frame set.
In a preferred technical scheme of the present invention, the inputting the interference video data set into a target detection model for target detection to obtain tea bud and leaf target information includes:
inputting the interference video data set into a target detection model, and representing the characteristics of each image area through position coding;
capturing feature correlations between features of different ones of the image regions based on a self-attention mechanism;
identifying a target location and a target category based on the feature and the feature relevance of each of the image areas;
drawing a detection frame according to the target position; and the detection frame and the target category form tea bud leaf target information.
In a preferred technical scheme of the present invention, the inputting the detection frame set into a key point detection model, extracting key points in the detection frame set, and obtaining a key point set includes:
inputting the detection frame set into a key point detection model, and detecting key points in each detection frame;
and forming all the key points into a key point set.
In a preferred technical solution of the present invention, before the inputting the interference video data set into the target detection model for target detection, the method further includes:
constructing a target detection model to be trained, wherein the target detection model to be trained adopts a Transformer model;
and training the target detection model to be trained by adopting a first training set to obtain the target detection model.
In a preferred technical scheme of the invention, before the tea bud and leaf target information is input into the target tracking model, the method further comprises:
constructing a target tracking model to be trained, wherein the target tracking model to be trained adopts a DeepSORT model;
and training the target tracking model to be trained by adopting a second training set to obtain the target tracking model.
In a preferred technical solution of the present invention, before the inputting the detection frame set into the keypoint detection model, the method further includes:
constructing a key point detection model to be trained, wherein the key point detection model to be trained adopts an AlphaPose model;
and training the key point detection model to be trained by adopting a label data set to obtain the key point detection model.
In a preferred technical solution of the present invention, before the training of the key point detection model to be trained by using the tag data set, the method further includes:
the label data set is manufactured, and the label data set comprises a boundary box label and a key point label; the key point labels comprise a bud top label point, a bud label point, a leaf top label point, a leaf label point, a bud-leaf cross point label point and a stalk point label point.
In a preferred embodiment of the present invention, the collecting an interference video data set includes:
disturbing the tea buds and leaves by adopting disturbance equipment;
and shooting videos of the tea buds and leaves at different distances by using a mobile camera to obtain an interference video data set.
The beneficial effects of the invention are as follows:
the tea bud leaf gesture detection method under the disturbance state comprises the steps of collecting an interference video data set, wherein the interference video data set contains disturbed tea bud leaves, and the disturbance comprises wind disturbance and collision between the tea bud leaves. Inputting the interference video data set into a target detection model to carry out target detection, detecting whether tea buds and leaves exist in the interference video data set, and if so, positioning the tea buds and leaves to obtain tea bud and leaf target information, wherein the tea bud and leaf target information comprises a detection frame and a target class. And inputting the tea bud and leaf target information into a target tracking model, tracking the tea bud and leaf, and detecting the change of the same tea bud and leaf in continuous multi-frame images to obtain a detection frame set. Inputting the detection frame set into a key point detection model, and extracting key points in the detection frame set according to time sequence to obtain a key point set. Based on the key point set and the continuous space-time sequence, the posture of the tea bud leaf in the disturbance state is detected. Under the conditions of continuous time and continuous space, key points of the same tea bud leaves at different moments can be extracted from the key point set, and the change of the key points can reflect the change of the posture of the tea bud leaves, so that the disturbance condition of the tea bud leaves is reflected, and the posture of the tea bud leaves in a disturbance state is accurately detected.
Drawings
FIG. 1 is a schematic flow chart of the tea bud leaf posture detection method under a disturbance state provided by the invention;
FIG. 2 is a schematic flow chart of detecting the posture of tea buds and leaves in a disturbance state;
FIG. 3 is a schematic flow chart of model training provided by the invention;
FIG. 4 is a tea bud leaf time-space diagram provided by the invention.
Detailed Description
Preferred embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While the preferred embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Example 1
As shown in FIG. 1, the present embodiment provides a tea bud leaf posture detection method under a disturbance state, including:
s1: an interference video dataset is acquired, the interference video dataset comprising perturbed tea shoots and leaves.
Step S1 includes the following steps S11-S12:
s11: and disturbing the tea buds and leaves by adopting disturbance equipment.
S12: and shooting videos of the tea buds and leaves at different distances by using a mobile camera to obtain an interference video data set.
A disturbance device such as a blower is used to disturb the tea buds and leaves, and a handheld mobile camera sensing device is used at different distances to collect videos of tea buds and leaves in a tea garden environment, giving the interference video data set.
S2: and inputting the interference video data set into a target detection model to carry out target detection, so as to obtain tea bud and leaf target information.
Step S2 includes the following steps S21-S24:
s21: the set of interfering video data is input into a target detection model representing the characteristics of each image region by position coding.
S22: feature correlations between features of different ones of the image regions are captured based on a self-attention mechanism.
S23: a target location and a target category are identified based on the feature and the feature relevance of each of the image regions.
S24: drawing a detection frame according to the target position; and the detection frame and the target category form tea bud leaf target information.
The target detection model is based on a Transformer model and comprises a plurality of convolution layers and a plurality of fully connected layers; the convolution layers extract feature maps, and the fully connected layers convert the feature maps into detection frames and target categories.
Because self-attention is permutation-invariant, the Transformer model has no inherent sense of the order of elements in a sequence. Position coding is therefore needed so that the Transformer model can distinguish the relative positions of elements in the sequence. In image processing, position coding provides spatial position information to the Transformer model. Position coding schemes include two-dimensional position coding, absolute position coding, relative position coding, position embedding, and convolution operations.
Absolute position coding assigns a fixed code to each position in a sequence; the code indicates the exact position in the whole sequence, independent of other positions. Relative position coding does not code each position directly, but codes the relative distance between two positions.
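As an illustrative sketch (the patent does not specify its coding scheme, so this is an assumption), the widely used sinusoidal form of absolute position coding can be written as follows; the feature sizes are also assumed.

```python
import math
import torch

def sinusoidal_position_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Absolute position coding: each position receives a fixed vector that
    depends only on its index in the sequence, not on other positions."""
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                         * (-math.log(10000.0) / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)   # even feature dimensions
    pe[:, 1::2] = torch.cos(position * div_term)   # odd feature dimensions
    return pe

# Added to flattened image-region features so the model can tell regions apart by location.
features = torch.randn(1, 196, 256)   # (batch, regions, channels); sizes assumed for illustration
features = features + sinusoidal_position_encoding(196, 256)
```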
The Transformer model includes an encoder and a decoder. Each encoder layer includes a multi-head self-attention sub-layer and a feed-forward neural network sub-layer. The multi-head self-attention sub-layer correlates the vectors at each position of the input sequence to produce a new set of vector representations; it uses several attention heads, each focusing on different positions of the input sequence. The feed-forward neural network sub-layer applies a non-linear transformation to the output of the multi-head self-attention sub-layer. Each encoder layer also introduces residual connections and layer normalization: the residual connection adds the sub-layer's input directly to its output so that information propagates faster, and layer normalization normalizes each sub-layer's output so that the inputs to different layers are more uniform, accelerating the convergence of the Transformer model.
The decoder of the Transformer model is made up of multiple decoder layers, each including a multi-head self-attention sub-layer, an encoder-decoder attention sub-layer, and a feed-forward neural network sub-layer. The multi-head self-attention sub-layer correlates the sequence positions already generated in the decoder, yielding a new set of vector representations. The encoder-decoder attention sub-layer correlates the vector representations produced by the encoder with the sequence positions already generated in the decoder; its attention mechanism is similar to multi-head self-attention, but it attends to the encoder's output rather than the decoder's. The feed-forward neural network sub-layer applies a non-linear transformation to the outputs of the attention sub-layers.
Residual connections and layer normalization are likewise introduced in each decoder layer to accelerate the convergence of the model.
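The residual-plus-layer-normalization pattern described above can be sketched as a minimal PyTorch encoder layer; the layer sizes are illustrative assumptions, not values disclosed by the patent.

```python
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    """One Transformer encoder layer: multi-head self-attention and a
    feed-forward sub-layer, each wrapped in residual connection + layer norm."""
    def __init__(self, d_model=256, n_heads=8, d_ff=1024):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        attn_out, _ = self.attn(x, x, x)   # each head attends to all positions
        x = self.norm1(x + attn_out)       # residual connection, then layer norm
        x = self.norm2(x + self.ff(x))     # same pattern for the feed-forward sub-layer
        return x

layer = EncoderLayer()
out = layer(torch.randn(1, 196, 256))      # (batch, regions, channels)
```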
S3: and inputting the tea bud and leaf target information into a target tracking model, and tracking the tea bud and leaf to obtain a detection frame set.
Step S3 includes the following steps S31-S32:
s31: and splitting the interference video in the interference video data set according to frames to obtain multi-frame disturbance images.
S32: inputting the initial position and the category of the tea bud leaves into a target tracking model, and continuously tracking the disturbance images of a first frame number to obtain a detection frame set.
In this embodiment, the first frame number is set to 20 frames, and the target tracking model adopts the DeepSORT model. The disturbance state mainly concerns how the tea buds and leaves change under disturbance, and such change takes time to unfold; the target tracking model is therefore set to track 20 consecutive frames of disturbance images, and the tea bud leaf postures in those 20 frames are collected for analysis.
The detection frame set reflects the change in position of the same tea bud leaf across the 20 consecutive disturbance images; for example, the tea bud leaf numbered 1 may be located in the upper left corner of the first disturbance image, the upper right corner of the 10th, and the lower right corner of the 20th.
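A minimal sketch of steps S31-S32 follows: the video is split into frames with OpenCV, and each frame's detections are handed to a tracker for 20 consecutive frames. The `detector` and `tracker` objects and their interfaces are placeholders standing in for the trained Transformer detector and the DeepSORT tracker; they are assumptions, not a real DeepSORT API.

```python
import cv2

FIRST_FRAME_COUNT = 20  # the "first frame number" used in this embodiment

def split_video(path: str):
    """S31: split an interference video into individual disturbance images."""
    cap = cv2.VideoCapture(path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    return frames

def track_tea_buds(frames, detector, tracker):
    """S32: accumulate each tracked tea bud leaf's boxes over 20 frames.
    `detector(frame)` is assumed to return [(x1, y1, x2, y2, cls, score), ...];
    `tracker.update(...)` is assumed to return (track_id, box) pairs."""
    box_set = {}                                  # track id -> boxes in time order
    for frame in frames[:FIRST_FRAME_COUNT]:
        detections = detector(frame)
        for track_id, box in tracker.update(detections, frame):
            box_set.setdefault(track_id, []).append(box)
    return box_set
```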
S4: inputting the detection frame set into a key point detection model, and extracting key points in the detection frame set to obtain a key point set.
Step S4 includes the following steps S41-S42:
s41: and inputting the detection frame set into a key point detection model, and detecting the key point in each detection frame.
S42: and forming all the key points into a key point set.
The key point detection model in this embodiment adopts the AlphaPose model, and six key points are defined for a tea bud leaf: the bud top, bud middle, leaf top, leaf middle, bud-leaf intersection, and stalk point.
Each detection frame in the detection frame set is input into the AlphaPose model, and the key points of tea buds and leaves in different postures are extracted. Key point extraction predicts the position of each key point within each detection frame and outputs a confidence between 0 and 1 for each predicted position; the confidence correlates positively with localization accuracy, i.e., the higher the confidence, the more accurate the predicted key point position.
Detecting the frames containing tea buds and leaves first and then estimating the key point positions within each frame lets the AlphaPose model use prior information to assist pose estimation; this key point detection approach is efficient and remains stable in complex scenes.
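A hypothetical sketch of the keypoint decoding step is given below. It assumes a heatmap-style output with one channel per keypoint, which is how AlphaPose-family models typically localize keypoints, with the peak value serving as the 0-1 confidence; the channel ordering and tensor shapes are assumptions.

```python
import torch

KEYPOINT_NAMES = ["bud_top", "bud_middle", "leaf_top", "leaf_middle",
                  "bud_leaf_intersection", "stalk_point"]

def decode_heatmaps(heatmaps: torch.Tensor):
    """Turn a (6, H, W) heatmap tensor — one channel per tea bud leaf key
    point — into (name, x, y, confidence) tuples. The peak location is the
    predicted position; the peak value after a sigmoid is the confidence,
    which correlates positively with localization accuracy."""
    results = []
    for name, hm in zip(KEYPOINT_NAMES, heatmaps.sigmoid()):
        conf, flat_idx = hm.flatten().max(dim=0)
        y, x = divmod(flat_idx.item(), hm.shape[1])
        results.append((name, x, y, conf.item()))
    return results

keypoints = decode_heatmaps(torch.randn(6, 64, 48))  # dummy heatmaps for illustration
```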
Preferably, the target detection model, the target tracking model, and the key point detection model are combined into an end-to-end network. In this network, the three models share a feature extractor, but each model's head is independent so that the different tasks are handled separately.
The coupling occurs in the shared feature extractor, which lets the different tasks share the same low-level feature representation and thereby make better use of the model parameters and the training data.
Each head needs some additional network layers for its specific task. For example, the head of the target detection model includes convolution layers and fully connected layers that convert the feature map into target boxes representing tea bud leaf locations and target categories such as tea bud leaf varieties; the head of the target tracking model includes the layers required by the tracking algorithm to associate detected targets with tracker states; and the head of the key point detection model includes layers that estimate key point locations or pose parameters.
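A minimal sketch of such a shared-backbone, multi-head network is shown below; all layer shapes, the three-class assumption, and the embedding size are illustrative, not values from the patent.

```python
import torch
import torch.nn as nn

class SharedBackboneNetwork(nn.Module):
    """End-to-end network: one shared feature extractor, three task heads."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(                        # shared low-level features
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.detection_head = nn.Conv2d(128, 4 + 3, 1)   # box (4) + class scores (3 classes assumed)
        self.tracking_head = nn.Conv2d(128, 128, 1)      # appearance embedding for association
        self.keypoint_head = nn.Conv2d(128, 6, 1)        # one heatmap per tea bud leaf key point

    def forward(self, x):
        f = self.backbone(x)                              # coupling happens here
        return self.detection_head(f), self.tracking_head(f), self.keypoint_head(f)

net = SharedBackboneNetwork()
det, trk, kpt = net(torch.randn(1, 3, 256, 256))
```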
S5: and detecting the posture of the tea bud leaf in the disturbance state based on the key point set and the continuous space-time sequence.
As shown in FIG. 2, step S5 includes steps S51 to S54:
S51: Target key points conforming to the continuous space-time sequence are screened out from the key point set; the continuous space-time sequence is a sequence that is continuous in both time and space.
S52: Adjacent target key points are connected with a first type of connecting line to restore the appearance of the tea bud leaf.
S53: The same target key points of adjacent image frames in the interference video are connected with a second type of connecting line to construct the tea bud leaf time-space diagram; the interference video is a video in the interference video data set.
S54: The posture change of the tea buds and leaves in the disturbance state is captured by combining the tea bud leaf appearance and the tea bud leaf time-space diagram.
A continuous space-time sequence is determined from the shooting times of the image frames in the interference video data set, and the target key points are screened out of the key point set according to it; that is, the continuous space-time sequence constrains the key points so that the screened target key points satisfy temporal continuity and spatial continuity. For example, from the 1st minute to the 10th minute, key points within the same area are regarded as target key points.
The target key points correspond to the original labels. The first type of connecting line connects adjacent target key points to restore the appearance of the tea bud leaf. As shown in FIG. 4, the second type of connecting line connects the same target key points across adjacent image frames of the interference video to construct the tea bud leaf time-space diagram. Comparing the constructed time-space diagram with the restored tea bud leaf appearance determines the postures of the tea bud leaf at different moments. Because the target key points satisfy temporal and spatial continuity, arranging these postures in time order yields the posture change of the tea bud leaf over a period of time, which reflects how the tea bud leaf was disturbed, so its posture in the disturbance state is detected accurately.
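The screening and connecting steps can be sketched as follows. The keypoint adjacency used for the first-type lines and the pixel threshold used for the spatial-continuity check are assumptions, since the patent does not fix either.

```python
def build_spatiotemporal_graph(keypoints_per_frame, max_jump=30.0):
    """keypoints_per_frame: list (one entry per frame, in shooting-time order)
    of dicts mapping keypoint name -> (x, y). Returns the first-type edges
    (within one frame, restoring the bud leaf shape) and the second-type
    edges (linking the same keypoint across adjacent frames)."""
    SHAPE_EDGES = [("bud_top", "bud_middle"), ("bud_middle", "bud_leaf_intersection"),
                   ("leaf_top", "leaf_middle"), ("leaf_middle", "bud_leaf_intersection"),
                   ("bud_leaf_intersection", "stalk_point")]   # assumed adjacency
    first_type, second_type = [], []
    for t, kps in enumerate(keypoints_per_frame):
        for a, b in SHAPE_EDGES:                     # first type: within frame t
            if a in kps and b in kps:
                first_type.append((t, kps[a], kps[b]))
        if t > 0:                                    # second type: across adjacent frames
            prev = keypoints_per_frame[t - 1]
            for name, (x, y) in kps.items():
                if name in prev:
                    px, py = prev[name]
                    # spatial-continuity check: reject keypoints that jump too far
                    if ((x - px) ** 2 + (y - py) ** 2) ** 0.5 <= max_jump:
                        second_type.append((name, t - 1, (px, py), t, (x, y)))
    return first_type, second_type
```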
Preferably, the postures of the tea bud leaf in the disturbance state are input into a classifier, the extracted tea bud leaf postures are mapped to different motion categories, and a probability distribution over the motion categories is output.
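A hypothetical classifier head of this kind might look as follows; the motion categories, layer sizes, and the flattened-sequence input format are all assumed for illustration.

```python
import torch
import torch.nn as nn

NUM_MOTION_CLASSES = 4           # e.g. still, wind sway, collision, rebound (assumed categories)
pose_classifier = nn.Sequential(
    nn.Linear(20 * 6 * 2, 128),  # 20 frames x 6 key points x (x, y), flattened
    nn.ReLU(),
    nn.Linear(128, NUM_MOTION_CLASSES),
    nn.Softmax(dim=-1),          # probability distribution over motion categories
)

probs = pose_classifier(torch.randn(1, 240))  # dummy pose sequence
```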
The tea bud leaf posture detection method under the disturbance state provided by this embodiment comprises collecting an interference video data set, the interference video data set containing disturbed tea buds and leaves, where the disturbance includes wind and collisions between tea buds and leaves. The interference video data set is input into a target detection model for target detection, which detects whether tea buds and leaves are present in the interference video data set and, if so, locates them to obtain tea bud leaf target information comprising a detection frame and a target category. The tea bud leaf target information is input into a target tracking model, the tea buds and leaves are tracked, and the changes of the same tea bud leaf across consecutive frames are detected to obtain a detection frame set. The detection frame set is input into a key point detection model, and the key points in the detection frame set are extracted in time order to obtain a key point set. Based on the key point set and the continuous space-time sequence, the posture of the tea buds and leaves in the disturbance state is detected. Under the conditions of temporal and spatial continuity, the key points of the same tea bud leaf at different moments can be extracted from the key point set; the changes of these key points reflect the changes in the tea bud leaf's posture, and therefore how it was disturbed, so the posture of the tea bud leaf in the disturbance state is detected accurately.
Example 2
This embodiment describes only the differences from Embodiment 1. As shown in FIG. 3, before the interference video data set is input into the target detection model for target detection, the method further includes:
s11': and constructing a target detection model to be trained, wherein the target detection model to be trained adopts a transducer model.
S12': and training the target detection model to be trained by adopting a first training set to obtain the target detection model.
The first training set includes labeled detection frames and target categories; for example, it may include 1000 labeled detection frames, where the target category of the first labeled detection frame is the second type of tea bud leaf, that of the second is the first type, and that of the third is the third type. The first loss function consists of a category prediction loss, a bounding box prediction loss, a target confidence loss, and an IoU loss.
Training the target detection model to be trained by using a supervised learning mode, and stopping training when the training times are greater than or equal to a first training times threshold or the first loss function value is less than or equal to a first loss function threshold to obtain the target detection model.
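The dual stopping criterion can be sketched as a simple training loop; all names are placeholders, and the same loop applies to the other two models trained in this embodiment.

```python
def train(model, loader, loss_fn, optimizer, max_iters, loss_threshold):
    """Supervised training that stops when the iteration count reaches the
    training-times threshold or the loss falls to the loss-function
    threshold, whichever comes first (sketch; all names are assumptions)."""
    iters = 0
    while True:
        for images, targets in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), targets)
            loss.backward()
            optimizer.step()
            iters += 1
            # stop on either criterion: iteration budget reached or loss low enough
            if iters >= max_iters or loss.item() <= loss_threshold:
                return model
```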
Before the tea bud and leaf target information is input into the target tracking model, the method further comprises the following steps:
s13': and constructing a target tracking model to be trained, wherein the target tracking model to be trained adopts a deep Sort model.
S14': and training the target tracking model to be trained by adopting a second training set to obtain the target tracking model.
The second training set includes tea bud leaf target information of the same tea bud leaf in a period of time, namely the positions and target types of the same tea bud leaf at different moments. For example, the first tea bud leaf at 1 min is in the upper left corner of the image, the first tea bud leaf at 10 min is in the upper right corner of the image, and the first tea bud leaf at 20 min is in the lower right corner of the image.
The second loss function consists of a positioning loss and an association loss. The positioning loss, expressed as a squared-difference loss, compares the distance between the predicted and actual positions of the target. The association loss typically uses a matching algorithm, such as the Hungarian algorithm, to minimize the distance or difference between targets so that the tracking results are more consistent and accurate.
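As an illustration of the association step, a sketch using SciPy's Hungarian solver over a centre-distance cost matrix is shown below; using centre distance alone as the cost is an assumption, since DeepSORT combines motion and appearance cues.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_boxes, det_boxes):
    """Build a cost matrix of centre distances between existing tracks and
    new detections, then solve the minimum-cost matching with the Hungarian
    algorithm. Boxes are (x1, y1, x2, y2)."""
    def centre(b):
        return np.array([(b[0] + b[2]) / 2, (b[1] + b[3]) / 2])
    cost = np.array([[np.linalg.norm(centre(t) - centre(d)) for d in det_boxes]
                     for t in track_boxes])
    track_idx, det_idx = linear_sum_assignment(cost)   # minimizes total distance
    return list(zip(track_idx, det_idx))
```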
And stopping training when the training times of the target tracking model to be trained is greater than or equal to a second training times threshold or the second loss function is smaller than or equal to a second loss function threshold, so as to obtain the target tracking model.
Before the detection frame set is input into the key point detection model, the method further comprises the following steps:
s15': and constructing a key point detection model to be trained, wherein the key point detection model to be trained adopts an alpha Pose model.
S16': and training the key point detection model to be trained by adopting a label data set to obtain the key point detection model.
Before the training of the key point detection model to be trained by adopting the label data set, the method further comprises the following steps:
the label data set is manufactured, and the label data set comprises a boundary box label and a key point label; the key point labels comprise a bud top label point, a bud label point, a leaf top label point, a leaf label point, a bud-leaf cross point label point and a stalk point label point.
The label data set is used to train the key point detection model to be trained; the third loss function corresponding to the key point detection model to be trained is a mean squared error loss, which measures the difference between the predicted key point positions and the actual positions.
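The third loss function is then simply the mean squared error between predicted and labeled keypoint coordinates, e.g. (dummy tensors for illustration):

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()
predicted = torch.randn(8, 6, 2)     # (batch, 6 key points, (x, y)); dummy values
labeled = torch.randn(8, 6, 2)
third_loss = mse(predicted, labeled)  # mean squared difference between predicted and labeled positions
```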
And stopping training when the training times of the key point detection model to be trained are greater than or equal to a third training times threshold value or the third loss function is smaller than or equal to a third loss function threshold value, so as to obtain the key point detection model.
Preferably, the target detection model to be trained, the target tracking model to be trained and the key point detection model to be trained are combined to obtain the network to be trained, and the network to be trained is subjected to end-to-end training.
In this embodiment, a Transformer model is used as the target detection model to be trained and is trained with the first training set to obtain the target detection model. The DeepSORT model is used as the target tracking model to be trained and is trained with the second training set to obtain the target tracking model. The AlphaPose model is used as the key point detection model to be trained and is trained with the label data set to obtain the key point detection model. The label data set includes a bounding box label and key point labels, and the trained key point detection model can capture motion information in the disturbance state.
Example 3
The embodiment provides an electronic device, which comprises a memory and a processor.
The processor may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory may include various types of storage units, such as system memory, read-only memory (ROM), and persistent storage.
The memory stores executable code which, when processed by the processor, causes the processor to perform part or all of the tea bud leaf posture detection method under a disturbance state described above.
The relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present application unless it is specifically stated otherwise. In all examples shown and discussed herein, any specific values should be construed as merely illustrative, and not a limitation. Thus, other examples of the exemplary embodiments may have different values. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
It will be understood that the spatially relative terms are intended to encompass different orientations in use or operation in addition to the orientation depicted in the figures. For example, if the device in the figures is turned over, elements described as "above" or "over" other devices or structures would then be oriented "below" or "beneath" the other devices or structures. Thus, the exemplary term "above" can encompass both an orientation of "above" and one of "below". The device may also be positioned in other ways (rotated 90 degrees or at other orientations), and the spatially relative descriptors used herein are interpreted accordingly.
In addition, the terms "first", "second", etc. are used to define the components, and are merely for convenience of distinguishing the corresponding components, and unless otherwise stated, the terms have no special meaning, and thus should not be construed as limiting the scope of the present application.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. The tea bud leaf posture detection method under the disturbance state is characterized by comprising the following steps of:
collecting an interference video data set, wherein the interference video data set comprises disturbed tea buds and leaves;
inputting the interference video data set into a target detection model to carry out target detection to obtain tea bud and leaf target information;
inputting the tea bud and leaf target information into a target tracking model, and tracking the tea bud and leaf to obtain a detection frame set;
inputting the detection frame set into a key point detection model, and extracting key points in the detection frame set to obtain a key point set;
and detecting the posture of the tea bud leaf in the disturbance state based on the key point set and the continuous space-time sequence.
2. The method for detecting the posture of tea buds and leaves under the disturbance state according to claim 1, wherein the detecting the posture of the tea bud leaf in the disturbance state based on the key point set and the continuous space-time sequence comprises:
screening out target key points conforming to continuous space-time sequences from the key point set; the continuous space-time sequence is a sequence which is continuous in time and continuous in space;
connecting adjacent target key points by using a first type of connecting line, and restoring the appearance of the tea bud leaves;
connecting the same target key points of adjacent image frames in the interference video by using a second type of connecting line to construct a tea bud leaf time-space diagram; the interference video is a video in the interference video data set;
and capturing the posture change of the tea bud leaves in a disturbance state by combining the appearance of the tea bud leaves and the tea bud leaf time-space diagram.
3. The method for detecting the posture of tea buds and leaves under the disturbance state according to claim 1, wherein the step of inputting the target information of the tea buds and leaves into a target tracking model and tracking the tea buds and leaves to obtain a detection frame set comprises the steps of:
splitting the interference video in the interference video data set according to frames to obtain multi-frame disturbance images;
inputting the initial position and the category of the tea bud leaves into a target tracking model, and continuously tracking the disturbance images of a first frame number to obtain a detection frame set.
4. The method for detecting the tea bud leaf posture in the disturbance state according to claim 1, wherein the inputting the interference video data set into a target detection model for target detection to obtain tea bud leaf target information includes:
inputting the interference video data set into a target detection model, and representing the characteristics of each image area through position coding;
capturing feature correlations between features of different ones of the image regions based on a self-attention mechanism;
identifying a target location and a target category based on the feature and the feature relevance of each of the image areas;
drawing a detection frame according to the target position; and the detection frame and the target category form tea bud leaf target information.
5. The method for detecting the tea bud leaf posture in the disturbance state according to claim 1, wherein the inputting the detection frame set into a key point detection model, extracting key points in the detection frame set, and obtaining the key point set includes:
inputting the detection frame set into a key point detection model, and detecting key points in each detection frame;
and forming all the key points into a key point set.
6. The method for detecting the posture of tea buds and leaves under a disturbance state according to claim 1, wherein before inputting the interference video data set into a target detection model for target detection, the method further comprises:
constructing a target detection model to be trained, wherein the target detection model to be trained adopts a Transformer model;
and training the target detection model to be trained by adopting a first training set to obtain the target detection model.
7. The method for detecting the posture of tea buds and leaves in a disturbance state according to claim 1, wherein before the step of inputting the tea bud leaf target information into a target tracking model, the method further comprises:
constructing a target tracking model to be trained, wherein the target tracking model to be trained adopts a DeepSORT model;
and training the target tracking model to be trained by adopting a second training set to obtain the target tracking model.
8. The method for detecting the posture of tea buds and leaves under a disturbance state according to claim 1, wherein before the step of inputting the detection frame set into a key point detection model, the method further comprises:
constructing a key point detection model to be trained, wherein the key point detection model to be trained adopts an AlphaPose model;
and training the key point detection model to be trained by adopting a label data set to obtain the key point detection model.
9. The method for detecting the tea bud leaf posture in the disturbance state according to claim 8, wherein before the training of the key point detection model to be trained by using the label data set, the method further comprises:
the label data set is manufactured, and the label data set comprises a boundary box label and a key point label; the key point labels comprise a bud top label point, a bud label point, a leaf top label point, a leaf label point, a bud-leaf cross point label point and a stalk point label point.
10. The method for detecting the tea bud leaf posture in the disturbance state according to claim 1, wherein the collecting the interference video data set includes:
disturbing the tea buds and leaves by adopting disturbance equipment;
and shooting videos of the tea buds and leaves at different distances by using a mobile camera to obtain an interference video data set.
CN202410217204.4A (priority date 2024-02-28; filing date 2024-02-28) - Tea bud leaf posture detection method under disturbance state - Active - granted as CN117789040B

Priority Applications (1)

Application Number: CN202410217204.4A (granted as CN117789040B)
Priority Date: 2024-02-28
Filing Date: 2024-02-28
Title: Tea bud leaf posture detection method under disturbance state

Applications Claiming Priority (1)

Application Number: CN202410217204.4A (granted as CN117789040B)
Priority Date: 2024-02-28
Filing Date: 2024-02-28
Title: Tea bud leaf posture detection method under disturbance state

Publications (2)

Publication Number Publication Date
CN117789040A 2024-03-29
CN117789040B CN117789040B (en) 2024-05-10

Family

ID=90383782

Family Applications (1)

Application Number: CN202410217204.4A (Active; granted as CN117789040B)
Priority Date: 2024-02-28
Filing Date: 2024-02-28
Title: Tea bud leaf posture detection method under disturbance state

Country Status (1)

Country Link
CN (1) CN117789040B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021129064A1 (en) * 2019-12-24 2021-07-01 腾讯科技(深圳)有限公司 Posture acquisition method and device, and key point coordinate positioning model training method and device
WO2022134344A1 (en) * 2020-12-21 2022-06-30 苏州科达科技股份有限公司 Target detection method, system and device, and storage medium
WO2022142298A1 (en) * 2020-12-29 2022-07-07 北京市商汤科技开发有限公司 Key point detection method and apparatus, and electronic device and storage medium
US20230021591A1 (en) * 2021-07-26 2023-01-26 Toyota Jidosha Kabushiki Kaisha Model generation method, model generation apparatus, non-transitory storage medium, mobile object posture estimation method, and mobile object posture estimation apparatus
CN116343335A (en) * 2023-03-28 2023-06-27 沈阳理工大学 Motion gesture correction method based on motion recognition
US20230281864A1 (en) * 2022-03-04 2023-09-07 Robert Bosch Gmbh Semantic SLAM Framework for Improved Object Pose Estimation
CN117291951A (en) * 2023-10-13 2023-12-26 四川虹微技术有限公司 Multi-human-body posture tracking method based on human body key points

Also Published As

Publication Number: CN117789040B (en)
Publication Date: 2024-05-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant