CN115083022A - Pet behavior identification method and device and readable storage medium - Google Patents


Info

Publication number: CN115083022A
Application number: CN202211006137.9A
Authority: CN (China)
Legal status: Pending
Prior art keywords: image, pet, frame, identified, skeleton
Other languages: Chinese (zh)
Inventors: 吕钦, 凌明, 杨作兴, 艾国
Current Assignee: Shenzhen MicroBT Electronics Technology Co Ltd
Original Assignee: Shenzhen MicroBT Electronics Technology Co Ltd
Application filed by Shenzhen MicroBT Electronics Technology Co Ltd


Classifications

    • G06V40/20 Movements or behaviour, e.g. gesture recognition (under G06V40/00, Recognition of biometric, human-related or animal-related patterns in image or video data)
    • G06N3/04 Architecture, e.g. interconnection topology (under G06N3/02, Neural networks)
    • G06N3/08 Learning methods (under G06N3/02, Neural networks)
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition (under G06V10/20, Image preprocessing)
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting (under G06V10/77, Processing image or video features in feature spaces)
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V2201/07 Target detection (under G06V2201/00, Indexing scheme relating to image or video recognition or understanding)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Psychiatry (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the invention provide a pet behavior identification method and device, and a readable storage medium. The method comprises the following steps: acquiring an image frame sequence to be detected, wherein the image frame sequence comprises a plurality of frames of images of a pet to be identified; performing pet skeletal key point detection processing on the image frame sequence to obtain skeletal key point information corresponding to each frame of image, wherein the skeletal key point information comprises position information of the skeletal key points of the pet to be identified in the image; and performing behavior recognition processing according to the skeletal key point information corresponding to each frame of image in the image frame sequence to obtain a behavior recognition result for the pet to be identified. Embodiments of the invention can overcome the influence of factors such as image color, exposure, and the body size of the pet to be identified on the pet behavior recognition result, improving the accuracy of that result.

Description

Pet behavior identification method and device and readable storage medium
Technical Field
The invention relates to the technical field of image recognition, and in particular to a pet behavior recognition method and device and a readable storage medium.
Background
With the growing number of pets in cities, problems such as dogs being walked without a leash, pets being kept where they are prohibited, and pets disturbing residents have emerged, testing the intelligent management capability of cities and posing new challenges for smart-city construction.
Existing pet AI recognition algorithms already provide intelligent identification capabilities such as pet breed identification, cat and dog face detection and recognition, pet dog face + nose print identification and filing, and pet picture quality detection. However, pet behavior recognition technology is not yet mature, and the pet behavior recognition results obtained by existing deep learning-based methods are not accurate enough.
Disclosure of Invention
The embodiment of the invention provides a pet behavior recognition method, a pet behavior recognition device and a readable storage medium, which can improve the accuracy of a pet behavior recognition result.
In a first aspect, an embodiment of the present invention discloses a pet behavior identification method, including:
acquiring an image frame sequence to be detected, wherein the image frame sequence comprises a plurality of frame images of pets to be identified;
carrying out pet skeleton key point detection processing on the image frame sequence to obtain skeleton key point information corresponding to each frame of image, wherein the skeleton key point information comprises position information of skeleton key points of pets to be identified in the image;
and performing behavior recognition processing according to the skeleton key point information corresponding to each frame of image in the image frame sequence to obtain a behavior recognition result of the pet to be recognized.
In a second aspect, an embodiment of the present invention discloses a pet behavior recognition device, including:
the system comprises an image acquisition module, a recognition module and a recognition module, wherein the image acquisition module is used for acquiring an image frame sequence to be detected, and the image frame sequence contains a plurality of frame images of pets to be recognized;
the system comprises a key point detection module, a frame identification module and a frame identification module, wherein the key point detection module is used for carrying out pet skeleton key point detection processing on the image frame sequence to obtain skeleton key point information corresponding to each frame of image, and the skeleton key point information comprises position information of skeleton key points of pets to be identified in the image;
and the behavior recognition module is used for performing behavior recognition processing according to the skeleton key point information corresponding to each frame of image in the image frame sequence to obtain a behavior recognition result of the pet to be recognized.
In a third aspect, embodiments of the invention disclose a machine-readable storage medium having instructions stored thereon, which when executed by one or more processors of an apparatus, cause the apparatus to perform one or more of the pet behavior recognition methods described above.
The embodiment of the invention has the following advantages:
according to the pet behavior identification method provided by the embodiments of the invention, pet skeletal key point detection processing is performed on an image frame sequence containing the pet to be identified, and behavior identification processing is performed on the basis of the detected skeletal key point information of the pet. This overcomes the influence of factors such as image color, exposure, and the body size of the pet to be identified on the pet behavior identification result, and improves the accuracy of that result.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the present invention, and that other drawings can be obtained from them by those skilled in the art without inventive effort.
FIG. 1 is a flow chart illustrating steps of an embodiment of a pet behavior recognition method of the present invention;
FIG. 2 is an architecture diagram of an application scenario of a pet behavior recognition method according to the present invention;
FIG. 3 is a flow chart of a pet behavior recognition method according to the present invention;
fig. 4 is a block diagram illustrating a pet behavior recognition device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first", "second", and the like in the description and claims of the present invention are used to distinguish between similar elements and do not necessarily describe a particular sequence or chronological order. It should be appreciated that data so used may be interchanged under appropriate circumstances, so that embodiments of the invention can be practiced in orders other than those illustrated or described herein. The words "first", "second", etc. do not limit the quantity of an element; for example, a first element may be one element or more than one. Furthermore, the term "and/or" in the specification and claims describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the preceding and following objects. The term "plurality" in the embodiments of the present invention means two or more, and other terms are construed similarly.
Referring to fig. 1, a flow chart illustrating steps of an embodiment of a pet behavior recognition method of the present invention may include the steps of:
step 101, obtaining an image frame sequence to be detected, wherein the image frame sequence comprises a plurality of frame images of pets to be identified.
Step 102, carrying out pet bone key point detection processing on the image frame sequence to obtain bone key point information corresponding to each frame of image, wherein the bone key point information comprises position information of the bone key points of the pet to be identified in the image.
Step 103, performing behavior recognition processing according to the skeleton key point information corresponding to each frame of image in the image frame sequence to obtain a behavior recognition result of the pet to be recognized.
The pet behavior recognition method provided by the embodiment of the invention can perform behavior recognition processing on the pet to be recognized based on the skeleton key point information of the pet to be recognized in the image to obtain a behavior recognition result, thereby improving the accuracy of pet behavior recognition.
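For illustration, the three steps can be sketched in Python as follows; detect_keypoints and recognize_behavior are hypothetical stand-ins for the skeletal key point detection model of step 102 and the behavior recognition model of step 103, not components specified by this embodiment.

```python
from typing import Callable, Dict, List, Tuple

# Per-frame key point information: key point name -> (x, y) position in the image.
Keypoints = Dict[str, Tuple[float, float]]

def identify_pet_behavior(
    frames: List[object],
    detect_keypoints: Callable[[object], Keypoints],
    recognize_behavior: Callable[[List[Keypoints]], str],
) -> str:
    # Step 102: skeletal key point detection on every frame of the sequence.
    keypoint_sequence = [detect_keypoints(frame) for frame in frames]
    # Step 103: behavior recognition over the per-frame key point information.
    return recognize_behavior(keypoint_sequence)
```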
Referring to fig. 2, a diagram illustrating an application scenario architecture of a pet behavior recognition method according to an embodiment of the present invention is shown. As shown in fig. 2, the scenario provided in the embodiment of the present invention may include a terminal device 201 and a server 202, connected through a wireless or wired network. The terminal device 201 may include, but is not limited to, user equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, and the like. The server 202 may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, cloud communication, network services, middleware services, content delivery networks (CDN), and big data and artificial intelligence platforms. The terminal device 201 and the server 202 may each independently execute the pet behavior recognition method provided by the embodiment of the present invention, or may execute it cooperatively.
The image frame sequence to be detected is an image frame sequence containing a pet to be identified that is acquired from a monitoring video. The monitoring video may be video material collected by an electronic device having a video capture function; the electronic device may be a video camera, a monitoring camera, or the like, or may be the terminal device 201 itself if it has a video capture function.
In a possible application scenario, the terminal device 201 acquires an image frame sequence to be detected through the monitoring camera, and sends the image frame sequence to be detected to the server 202. The server 202 performs pet skeleton key point detection processing on the image frame sequence sent by the terminal device 201 to obtain skeleton key point information corresponding to each frame of image, then performs behavior recognition processing according to the skeleton key point information corresponding to each frame of image in the image frame sequence to obtain a behavior recognition result of the pet to be recognized, and sends the behavior recognition result to the terminal device 201. The user can inquire the behavior recognition result for the pet to be recognized through the terminal device 201.
In another possible application scenario, the terminal device 201 acquires an image frame sequence to be detected from the monitoring camera, performs pet bone key point detection processing on the acquired image frame sequence to obtain bone key point information corresponding to each frame of image, performs behavior recognition processing according to the bone key point information corresponding to each frame of image in the image frame sequence to obtain a behavior recognition result of the pet to be recognized, and displays the behavior recognition result in a terminal display interface for a user to view.
It should be noted that the architecture diagram in the embodiment of the present invention is used as an example to more clearly illustrate the technical solution in the embodiment of the present invention, and does not limit the technical solution provided in the embodiment of the present invention, and for other application scenario architectures and service applications, the technical method provided in the embodiment of the present invention is also applicable to similar problems.
According to the embodiment of the invention, the skeleton key point information of the pet to be identified in the image can be extracted by carrying out pet skeleton key point detection processing on the image frame sequence of the pet to be identified, wherein the skeleton key point information is used for describing the posture of the pet to be identified. The skeleton key points of the pet to be identified comprise limb key points, trunk key points and head key points; the limb key points comprise 4 paw key points, 4 elbow key points and 4 knee key points; the torso key points include: a buttocks keypoint and a neck keypoint; the head key points include left and right eye key points and a nose key point. Of course, the skeletal key points are different for different classes of pets to be identified. It should be noted that the skeletal key points listed in the embodiments of the present invention are only an exemplary illustration of the present invention, and do not constitute a limitation to the present invention, and in practical applications, the skeletal key points of the pet to be identified may be determined according to practical requirements.
Illustratively, the image frame sequence may be subjected to pet skeletal key point detection processing by a pre-trained skeletal key point detection model, which may be any deep learning model in the art suited to skeletal key point detection, e.g., a deep learning model based on the OpenPose or HigherHRNet algorithm. In practical applications, a dedicated training sample set can be constructed for each type of pet to be identified (cats, dogs, and the like), and the skeletal key point detection model is then iteratively trained on that sample set.
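As a sketch of the post-processing such heatmap-based detectors (for example, HigherHRNet-style models) typically require, the function below decodes one heatmap per key point into pixel coordinates by taking each heatmap's argmax; the PyTorch usage and the (K, H, W) tensor shape are assumptions for illustration.

```python
import torch

def decode_heatmaps(heatmaps: torch.Tensor) -> torch.Tensor:
    """Decode per-keypoint heatmaps of shape (K, H, W) into (K, 2) pixel
    coordinates (x, y) by taking the argmax of each heatmap, the usual
    post-processing step for heatmap-based key point detectors."""
    k, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(k, -1).argmax(dim=1)
    xs = (flat_idx % w).float()
    ys = torch.div(flat_idx, w, rounding_mode="floor").float()
    return torch.stack([xs, ys], dim=1)
```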
The detected skeletal key point information may include the position information of all skeletal key points of the pet to be identified. The position information reflects the relative positions of the pet's joints; together with the connection relationships between the skeletal key points, the posture of the pet can be determined from the position information of all skeletal key points of the same pet. It will be appreciated that for pets of the same category, the connection relationships between skeletal key points are fixed. For cats and dogs, for example, they can be expressed as: left front paw to left front knee to left front elbow; right front paw to right front knee to right front elbow; left rear paw to left rear knee to left rear elbow; right rear paw to right rear knee to right rear elbow; the left and right front elbows are each connected to the neck; the left and right rear elbows are each connected to the hip; the hip is connected to the neck; the neck is connected to the central point of the region formed by the head key points; the left eye and right eye key points are connected to each other; and the left eye and right eye key points are each connected to the nose key point.
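For reference, the connection relationships listed above can be written as an explicit edge list over the 17 key points. The key point names are illustrative, and the neck-to-head connection is simplified here to a neck-to-nose edge, whereas the text connects the neck to the central point of the head key point region.

```python
# Skeleton topology for the 17 pet key points described above (illustrative names).
PET_SKELETON_EDGES = [
    ("left_front_paw", "left_front_knee"), ("left_front_knee", "left_front_elbow"),
    ("right_front_paw", "right_front_knee"), ("right_front_knee", "right_front_elbow"),
    ("left_rear_paw", "left_rear_knee"), ("left_rear_knee", "left_rear_elbow"),
    ("right_rear_paw", "right_rear_knee"), ("right_rear_knee", "right_rear_elbow"),
    ("left_front_elbow", "neck"), ("right_front_elbow", "neck"),
    ("left_rear_elbow", "hip"), ("right_rear_elbow", "hip"),
    ("hip", "neck"),
    ("neck", "nose"),  # simplification of "neck to centre of the head region"
    ("left_eye", "right_eye"),
    ("left_eye", "nose"), ("right_eye", "nose"),
]
```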
Behavior recognition processing is then performed according to the skeletal key point information corresponding to each frame of image in the image frame sequence to obtain a behavior recognition result for the pet to be identified. It can be understood that the behavior recognition result may be a behavior category of the pet, and the terminal device may generate different prompt information according to different behavior categories. For example, when detecting that the behavior of the pet belongs to excretion behavior, the terminal device may generate the prompt "please clean up the excrement" and output the location where the excretion occurred, so that the user can clean up in time; or, when detecting that the behavior belongs to destructive behavior, such as biting a sofa, the terminal device may generate the prompt "please stop the pet from damaging the home", and so on. The prompt message may be a text message and/or a voice message, which is not specifically limited in this embodiment of the present invention. In addition, the behavior recognition result can also be the motion state of the pet, such as movement speed or activity level, and the result can be further analyzed to judge the pet's health condition. For example, when it is detected that the pet has not moved within a period of time, or moves at a speed much higher than its historical speed in the same scene, the pet's body may be considered abnormal, and the terminal device may generate corresponding prompt information to remind the user to pay attention to the pet's health.
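As a small illustration of how a terminal device might generate prompts from the behavior category, the mapping below mirrors the examples above; the category names and message wording are assumptions, not values defined by this embodiment.

```python
# Hypothetical behavior-category-to-prompt mapping mirroring the examples above.
PROMPT_MESSAGES = {
    "excretion": "Please clean up the excrement.",
    "destruction": "Please stop the pet from damaging the home.",
}

def prompt_for(behavior_category: str) -> str:
    # Unknown categories produce no prompt in this sketch.
    return PROMPT_MESSAGES.get(behavior_category, "")
```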
According to the pet behavior identification method provided by the embodiment of the invention, the pet bone key point detection processing is carried out on the image frame sequence containing the pet to be identified, and the behavior identification processing is carried out on the basis of the detected bone key point information of the pet to be identified, so that the influence of factors such as the color, the exposure, the body size of the pet to be identified and the like of the image in the image frame sequence on the pet behavior identification result can be overcome, and the accuracy of the pet behavior identification result is improved.
It should be noted that, in the embodiment of the present invention, the pet behavior identification may be performed in two ways, namely "bottom-up" and "top-down".
As an example, when the pet behavior is identified in a "bottom-up" manner, all pet skeleton key points in the image may be identified first, then each skeleton key point is assigned to the corresponding pet to be identified, and behavior identification processing is performed on the skeleton key points of each pet. In an optional embodiment of the present invention, the skeletal key point information further includes a time sequence identifier of each skeletal key point of the pet to be identified, the time sequence identifier being used to mark the position, within the image frame sequence, of the image frame to which the skeletal key point belongs. Step 103, performing behavior recognition processing according to the skeletal key point information corresponding to each frame of image in the image frame sequence to obtain a behavior recognition result of the pet to be recognized, includes:
step S11, determining the connection relation among all skeleton key points belonging to the same pet to be identified according to the position information of the skeleton key points of the pet to be identified in the image of each frame of image in the image frame sequence;
and step S12, performing behavior recognition processing according to the connection relation among all skeleton key points belonging to the same pet to be recognized in each frame of image and the time sequence identification of all skeleton key points in each frame of image, and obtaining a behavior recognition result of the pet to be recognized.
In the embodiment of the invention, when the pet behavior is identified in a bottom-up manner, the skeletal key points of all pets contained in each frame of image in the image frame sequence are obtained through the detection processing of the skeletal key points of the pets in step 102. Before behavior recognition processing, connection relations among all skeleton key points belonging to the same pet to be recognized need to be determined for each frame of image in an image frame sequence according to position information of the skeleton key points of the pet to be recognized in the image.
Illustratively, when only one pet to be identified is present in the image, the relative positions of the skeletal key points are determined directly from their position information, the limb part identified by each skeletal key point is thereby determined, and the connection relationships between the skeletal key points are determined according to those limb parts. If two or more pets to be identified are present in the image, the skeletal key points belonging to each pet must first be separated according to their position information. In general, the distance between skeletal key points of the same pet is smaller than a preset distance threshold, and this threshold can be set dynamically according to the scene information of the image to be detected. In addition, the same pet has only one skeletal key point for each limb part. After the skeletal key points belonging to each pet have been separated, the connection relationships between the skeletal key points of the same pet are determined.
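A minimal sketch of this grouping step is given below, assuming a flat list of (body part, position) detections and a fixed distance threshold. Practical bottom-up methods use learned association cues rather than raw distances, so this greedy version is only illustrative.

```python
import math
from typing import Dict, List, Tuple

def group_keypoints_by_pet(
    detections: List[Tuple[str, Tuple[float, float]]],
    distance_threshold: float,
) -> List[Dict[str, Tuple[float, float]]]:
    """Greedily assign key points to pets: a key point joins an existing pet if
    it lies within the distance threshold of that pet's key points and the pet
    does not already have a key point for that body part (one key point per
    limb part per pet); otherwise it starts a new pet."""
    pets: List[Dict[str, Tuple[float, float]]] = []
    for part, pos in detections:
        for pet in pets:
            if part not in pet and any(
                math.dist(pos, other) < distance_threshold for other in pet.values()
            ):
                pet[part] = pos
                break
        else:
            pets.append({part: pos})
    return pets
```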
And then, performing behavior recognition processing according to the connection relation among all skeleton key points belonging to the same pet to be recognized in each frame of image and the time sequence identification of all skeleton key points in each frame of image to obtain a behavior recognition result of the pet to be recognized. The connection relation among all skeleton key points belonging to the same pet to be identified is used for determining the posture of the pet to be identified in each frame of image; the time sequence identification of each bone key point in each frame image is used for marking the arrangement sequence of the image frames to which the bone key points belong in the image frame sequence, and can indicate the identification sequence of each bone key point.
It can be understood that, in a continuous image frame sequence, the postures or actions of the pet to be recognized are continuous from frame to frame and occur in a definite order. Performing behavior recognition processing according to the time sequence identifiers of the skeletal key points therefore takes full account of how the pet's posture or action changes over time, which can improve the accuracy of the pet behavior recognition result.
In practical application, the skeleton key point information of the pet to be identified can be processed through the behavior identification model, and the behavior identification result of the pet to be identified is obtained. Optionally, in step S12, performing behavior recognition processing according to the connection relationship between the skeleton key points in each frame of image and the time series identifier of each skeleton key point in each frame of image, to obtain a behavior recognition result of the pet to be recognized, includes:
the substep S121, sequentially inputting the connection relation among all skeleton key points belonging to the same pet to be identified in each frame of image and the time sequence identification of all skeleton key points in each frame of image into a skeleton modeling layer of a first behavior identification model for dynamic skeleton modeling processing according to the arrangement sequence of the images in the image frame sequence to obtain a dynamic skeleton model of the pet to be identified;
and a substep S122, inputting the dynamic skeleton model into a convolution layer of the first behavior recognition model for space-time convolution processing to obtain a behavior recognition result of the pet to be recognized.
The dynamic skeleton model is used to reflect the connection relationships among all skeletal key points of the pet to be identified in the image frame sequence, and the connections of the same skeletal key point across consecutive frames. It should be noted that the dynamic skeleton model may be represented by a time series of the 2D or 3D position coordinates of the pet's skeletal key points; alternatively, it may be represented by a graph structure constructed from the coordinate information of the skeletal key points, the graph structure comprising a spatial graph representing the connection relationships among the skeletal key points within the same frame and a temporal graph representing the connections of the same skeletal key point across consecutive frames. By analyzing the motion pattern of the dynamic skeleton model, the behavior category of the pet to be identified can be obtained.
The first behavior recognition model may be any deep learning model in the art capable of pet behavior recognition based on skeletal key point information; for example, it may be a deep learning model based on the ST-GCN or ST-GCN++ algorithm. Functionally, the first behavior recognition model can be divided into a skeleton modeling layer and convolution layers. The skeleton modeling layer takes the skeletal key points of the pet to be identified as nodes and adds both the natural connections between key points and the connections of the same key point across consecutive frames, thereby obtaining the dynamic skeleton model of the pet. The convolution layers perform spatio-temporal convolution on the dynamic skeleton model, which can be divided into spatial graph convolution and temporal graph convolution. Spatial graph convolution aggregates information from adjacent skeletal key points: for example, if key points A, B and C are adjacent with the connection relationship A-B-C, then after spatial graph convolution the result for key point A aggregates information from A and B, and the result for key point B aggregates information from B and C. Temporal graph convolution aggregates information of the same skeletal key point across the continuous image frame sequence: for example, if the sequence contains images a, b and c in order, then after temporal graph convolution the result for image a aggregates information from a and b, and the result for image b aggregates information from b and c. Performing spatio-temporal convolution on the dynamic skeleton model of the pet to be recognized yields its behavior recognition result.
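The core spatial operation of such ST-GCN-style models can be sketched as a single graph convolution step in which each key point aggregates the features of its skeleton neighbours; the tensor shapes below are assumptions for illustration.

```python
import torch

def spatial_graph_conv(
    x: torch.Tensor,          # (T, V, C): T frames, V key points, C channels
    adjacency: torch.Tensor,  # (V, V): normalised skeleton adjacency with self-loops
    weight: torch.Tensor,     # (C, C_out): learnable projection
) -> torch.Tensor:
    # out[t, w, o] = sum over v, c of x[t, v, c] * adjacency[v, w] * weight[c, o]:
    # each key point w aggregates the features of its skeleton neighbours v.
    return torch.einsum("tvc,vw,co->two", x, adjacency, weight)

# Temporal graph convolution can then be a 1-D convolution along the frame
# axis for each key point, e.g. torch.nn.Conv1d applied over the T dimension.
```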
According to the embodiment of the invention, the bone key point information of the pet to be identified detected in the image frame sequence is processed through the first behavior identification model, natural connection among all bone key points and cross-continuous time connection of the same bone key point can be fully considered, and the accuracy of the pet behavior identification result is improved.
In addition, as another example, the embodiment of the invention also provides a pet behavior recognition mode from top to bottom. In an optional embodiment of the present invention, before performing pet bone keypoint detection processing on the image frame sequence to obtain bone keypoint information corresponding to each frame of image in step 102, the method further includes:
step S21, carrying out target detection processing on each frame of image in the image frame sequence to obtain detection frames corresponding to each frame of image, wherein the detection frames are used for framing out the areas of the pets to be identified in the images, and one detection frame contains one pet to be identified;
step 102, performing pet bone key point detection processing on the image frame sequence to obtain bone key point information corresponding to each frame of image, including:
and step S22, respectively carrying out pet bone key point detection processing on each detection frame contained in the image aiming at each frame of image in the image frame sequence to obtain bone key point information corresponding to each pet to be identified contained in the image.
In the embodiment of the invention, when the pet behavior recognition is performed in a top-down manner, all areas of pets to be recognized can be detected in each frame of image of the image frame sequence, then the pet bone key point detection processing is performed on each area of the pets to be recognized respectively to obtain the bone key point information corresponding to each pet to be recognized, and finally the behavior recognition result of the pets to be recognized is obtained by performing the behavior recognition processing according to the bone key point information corresponding to each pet to be recognized.
Specifically, each frame of image in the image frame sequence may be subjected to target detection processing to obtain the detection frames corresponding to each frame. It should be noted that the detection frames are used to frame the regions of the pets to be identified in the image, and one detection frame contains one pet to be identified. In practical applications, the image frame sequence may be processed with a target detection model, which may be any deep learning model in the art suited to target detection; for example, it may be a deep learning model based on target detection algorithms including, but not limited to, Faster R-CNN and YOLO.
Optionally, in step S21, performing target detection processing on each frame of image in the image frame sequence to obtain a detection frame corresponding to each frame of image, includes: performing joint processing on each frame of image in the image frame sequence based on a target detection model and a tracking model to obtain a detection frame corresponding to each frame of image; the target detection model is used for carrying out target detection processing on the image, and the tracking model is used for correcting deviation occurring in the process of carrying out target detection processing on the target detection model.
In the process of performing target detection on the image frame sequence, the detection frames of some image frames may be lost due to detection jitter and the like. To improve the accuracy of target detection, the detection results of the target detection model can be compensated by the tracking model. The tracking algorithm associates the same target across different frames, and when the detection result of the target detection model deviates, the tracking model can correct the deviation. When a target is occluded in a certain frame or temporarily lost, the tracking model can also predict information such as the target's position and size. The tracking model may include, but is not limited to, a deep learning model using the SORT or DeepSORT algorithm.
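A sketch of this joint processing is given below; detector and tracker_update are hypothetical callables standing in for, e.g., a Faster R-CNN or YOLO detection model and a SORT or DeepSORT tracker, and their exact interfaces are assumptions.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) per pet

def detect_with_tracking(
    frames: List[object],
    detector: Callable[[object], List[Box]],
    tracker_update: Callable[[object, List[Box]], List[Box]],
) -> List[List[Box]]:
    """For each frame, run the detector, then let the tracker associate targets
    across frames and predict boxes for targets the detector missed
    (detection jitter, temporary occlusion)."""
    boxes_per_frame = []
    for frame in frames:
        detections = detector(frame)
        tracked = tracker_update(frame, detections)  # corrected / completed boxes
        boxes_per_frame.append(tracked)
    return boxes_per_frame
```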
After the detection frames corresponding to each frame of the image frame sequence are obtained, pet skeletal key point detection processing is performed on each detection frame separately, yielding the skeletal key point information corresponding to each pet to be identified. Illustratively, the detection frames may be processed with a skeletal key point detection model, which may include, but is not limited to, a deep learning model using algorithms such as Mask R-CNN or G-RMI.
In this way, when pet skeletal key point detection is performed, target detection is first applied to the image frame sequence to predict the position and size of each pet's detection frame, and skeletal key point detection is then performed within each detection frame to predict each pet's skeletal key point information. This improves the prediction precision of the skeletal key point information and thus helps improve the accuracy of the pet behavior recognition result.
Optionally, in step S22, performing pet skeleton keypoint detection processing on each detection frame included in the image, respectively, for each frame of image in the image frame sequence, to obtain skeleton keypoint information corresponding to each pet to be identified included in the image, includes:
the substep S221, aiming at each frame of image in the image frame sequence, respectively carrying out pet skeleton key point detection processing on each detection frame contained in the image, and determining the number of detected skeleton key points of each pet to be identified;
step S222, if the number of the skeleton key points of the pet to be identified in the first image is greater than or equal to a preset point threshold value, determining skeleton key point information corresponding to the pet to be identified according to the position of each skeleton key point of the pet to be identified in the first image; the first image is any one frame image in the image frame sequence;
substep S223, if the number of skeletal key points corresponding to the pet to be identified in the first image is less than a preset point threshold, determining that the detection of the skeletal key points of the pet to be identified in the first image fails, and deleting the first image from the image frame sequence;
and a substep S224, if the number of image frames contained in the image frame sequence is less than a preset frame number threshold, determining that the detection of the pet bone key points aiming at the image frame sequence fails, and stopping the behavior recognition processing of the pet to be recognized.
When pet key point detection is performed on the detection frame of a pet to be identified, the pet may be occluded by objects such as a table or sofa, or by people or other pets, so that the number of detected skeletal key points is smaller than the preset point threshold. For example, suppose the pet to be identified normally has 17 skeletal key points in total, and fewer than 10 of its key points are detected in the first image. In that case, the posture of the pet cannot be accurately predicted from its skeletal key point information in the first image, so it can be determined that skeletal key point detection for that pet in the first image has failed; the first image is deleted from the image frame sequence, and behavior recognition for the pet proceeds using the skeletal key point information contained in the remaining images of the sequence.
If several images in the image frame sequence have a skeletal key point count below the preset point threshold, i.e. if after deleting such images the number of image frames remaining in the sequence is smaller than the preset frame number threshold, this indicates that the pet to be identified has been occluded for a long time. Too little skeletal key point information has actually been detected to meet the data requirements of behavior recognition, so the detection for the sequence is deemed to have failed and the behavior recognition processing for the pet is stopped.
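Sub-steps S222 to S224 can be sketched for a single pet as follows; the point threshold of 10 (out of 17 key points) comes from the example above, while the frame number threshold is an assumed value.

```python
from typing import Dict, List, Tuple

POINT_THRESHOLD = 10   # from the example above: fewer than 10 of 17 key points fails
FRAME_THRESHOLD = 8    # assumed value for the preset frame number threshold

def filter_occluded_frames(
    keypoints_per_frame: List[Dict[str, Tuple[float, float]]],
) -> List[Dict[str, Tuple[float, float]]]:
    """Drop frames in which too few key points of the pet were detected; abort
    if too few frames remain to meet the data requirements of recognition."""
    kept = [kps for kps in keypoints_per_frame if len(kps) >= POINT_THRESHOLD]
    if len(kept) < FRAME_THRESHOLD:
        raise RuntimeError("pet skeletal key point detection failed for this sequence")
    return kept
```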
In an optional embodiment of the present invention, the performing, in step 103, behavior recognition processing according to the bone key point information corresponding to each frame of image in the image frame sequence to obtain a behavior recognition result of the pet to be recognized includes:
step S31, aiming at each frame of image in the image frame sequence, constructing a pet skeleton map of each pet to be identified according to the position information of the skeleton key points of each pet to be identified contained in the image, wherein the pet skeleton map comprises all the skeleton key points of the pet to be identified, and two adjacent skeleton key points are connected through a straight line;
and step S32, sequentially inputting the pet skeleton diagram of the pet to be identified in each frame of image into a second behavior identification model for behavior identification processing according to the arrangement sequence of the images in the image frame sequence, and obtaining the behavior identification result of the pet to be identified.
In the embodiment of the invention, in order to further improve the accuracy of the pet behavior recognition result, the pet skeleton map of the pet to be recognized can be constructed according to the position information of the bone key points of the pet to be recognized after the bone key point information of the pet to be recognized is obtained. The pet skeleton map comprises all skeleton key points of a pet to be identified, and two adjacent skeleton key points are connected through a straight line.
Referring to fig. 3, a flow chart of a pet behavior recognition method according to an embodiment of the present invention is shown. As shown in fig. 3, target detection is first performed on a frame of the image frame sequence to be detected, yielding the detection frame of the pet to be identified. Pet skeletal key point detection is then performed on the detection frame to obtain the pet's skeletal key points. A skeleton is constructed from the position information of the skeletal key points, giving the pet skeleton map of the pet to be identified. When constructing the pet skeleton map, the relative positions of the skeletal key points can be determined from their position information, the limb part identified by each key point is thereby determined, the connection relationships between key points are determined from those limb parts, and key points with a connection relationship are joined in turn by straight lines to obtain the pet skeleton map. Illustratively, a Gaussian circle can be generated for each skeletal key point according to its position information, one Gaussian circle representing one key point; the Gaussian circles are then joined in turn by straight lines of a certain width according to the connection relationships between the key points, yielding the pet skeleton map of the pet to be identified. The width of the straight lines connecting the Gaussian circles can be set according to actual requirements.
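A sketch of this rendering using NumPy and OpenCV is given below; the Gaussian radius, line width, and canvas size are assumed values, and the edge list from the earlier sketch can be passed as edges.

```python
import cv2
import numpy as np

def draw_pet_skeleton_map(keypoints, edges, size=(256, 256), sigma=2.0, line_width=2):
    """Render a pet skeleton map: one Gaussian circle per key point, plus
    straight lines of fixed width between connected key points. `keypoints`
    maps names to (x, y) positions; `edges` lists connected name pairs."""
    height, width = size
    ys, xs = np.mgrid[0:height, 0:width]
    canvas = np.zeros((height, width), dtype=np.float32)
    for x, y in keypoints.values():
        blob = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma**2))
        canvas = np.maximum(canvas, blob)  # one Gaussian circle per key point
    skeleton = (canvas * 255).astype(np.uint8)
    for a, b in edges:
        if a in keypoints and b in keypoints:
            pa = tuple(int(round(v)) for v in keypoints[a])
            pb = tuple(int(round(v)) for v in keypoints[b])
            cv2.line(skeleton, pa, pb, color=255, thickness=line_width)
    return skeleton
```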
It should be noted that fig. 3 only shows the skeletal key points and the pet skeleton map of the pet to be identified corresponding to two frames of images in the image frame sequence, and in the actual processing process, for each pet to be identified in each frame of image in the image frame sequence, corresponding skeletal key point information and pet skeleton map are generated.
Finally, the pet skeleton maps of the pet to be identified in each frame of image are input in turn, according to the arrangement order of the images in the image frame sequence, into a second behavior recognition model for behavior recognition processing, obtaining the behavior recognition result of the pet to be identified. The second behavior recognition model may be any deep learning model in the art suited to behavior recognition, for example a deep learning model using the PoseC3D algorithm.
It should be noted that various deep learning models adopted in the embodiment of the present invention can be obtained by pre-training. Optionally, the method further comprises:
step S41, constructing a training sample set aiming at the pet to be identified, wherein the training sample set comprises an image sample of the pet to be identified and annotation information corresponding to the image sample;
step S42, performing iterative training on the deep learning model according to the training sample set to obtain a processing result corresponding to the image sample;
step S43, calculating a loss value of the deep learning model according to the processing result and the labeling information, and adjusting model parameters of the deep learning model according to the loss value until a preset termination condition is met to obtain a deep learning model after training;
the deep learning model comprises any one of a first behavior recognition model, a second behavior recognition model, a target detection model and a tracking model.
The annotation information is used for reflecting a real result corresponding to the image sample. It will be appreciated that the annotation information is different for different deep learning models. For example, when the deep learning model is a first behavior recognition model or a second behavior recognition model, the labeling information is used for labeling the real behavior category of the pet to be recognized; and when the deep learning model is the target detection model, the marking information is used for marking the real position and size of the pet to be identified in the image sample.
The preset termination condition may be set according to actual requirements, for example, the preset termination condition may be that a loss value of the deep learning model is smaller than a preset threshold, or that an error between loss values obtained in multiple rounds of training is smaller than a preset value. The loss value can be selected according to the model structure of the deep learning model, for example, the loss value can be the cross entropy of the processing result of the deep learning model and the annotation information, and the like.
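Steps S41 to S43 can be sketched as a standard supervised training loop; the cross-entropy loss and loss-threshold termination condition follow the examples above, while the epoch count, learning rate, and threshold values are assumptions.

```python
import torch

def train_deep_learning_model(model, data_loader, epochs=50, lr=1e-3, loss_threshold=1e-3):
    """Iterate over the labelled training samples (step S42), compute a
    cross-entropy loss between the model's output and the annotations, update
    the model parameters, and stop once the average loss falls below a preset
    threshold, one possible termination condition (step S43)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        epoch_loss = 0.0
        for samples, labels in data_loader:
            optimizer.zero_grad()
            loss = criterion(model(samples), labels)
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        if epoch_loss / max(len(data_loader), 1) < loss_threshold:  # termination condition
            break
    return model
```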
According to the embodiment of the invention, a special training sample set can be constructed for different types of pets to be identified, and the training sample set is utilized to carry out iterative training on various deep learning models used in the pet behavior identification process, so that the deep learning models specially used for processing the pets to be identified are obtained, the model processing process is more in line with the characteristics of the pets to be identified, and the accuracy of the pet behavior identification result is improved.
In summary, the pet behavior recognition method provided by the embodiment of the present invention detects the bone key points of the pet in the image frame sequence including the pet to be recognized, and performs the behavior recognition processing based on the detected bone key point information of the pet to be recognized, so as to overcome the influence of the color, exposure, size of the pet to be recognized, and other factors of the image in the image frame sequence on the pet behavior recognition result, and improve the accuracy of the pet behavior recognition result.
It should be noted that, for simplicity of description, the method embodiments are described as a series of action combinations, but those skilled in the art will recognize that the embodiments of the invention are not limited by the order of actions described, since some steps may be performed in other orders or concurrently. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the actions involved are not necessarily all required by the invention.
Referring to fig. 4, a block diagram of a pet behavior recognition device according to an embodiment of the present invention is shown, and the device may include:
the image acquisition module 401 is configured to acquire an image frame sequence to be detected, where the image frame sequence includes a plurality of frame images of a pet to be identified;
a key point detection module 402, configured to perform pet skeleton key point detection processing on the image frame sequence to obtain skeleton key point information corresponding to each frame of image, where the skeleton key point information includes position information of a skeleton key point of a pet to be identified in the image;
and the behavior identification module 403 is configured to perform behavior identification processing according to the bone key point information corresponding to each frame of image in the image frame sequence, so as to obtain a behavior identification result of the pet to be identified.
Optionally, the skeleton key point information further includes a time sequence identifier of the skeleton key points of the pet to be identified, and the time sequence identifier is used for marking the arrangement order of the image frames to which the skeleton key points belong in the image frame sequence; the behavior recognition module comprises:
the connection relation determining submodule is used for determining the connection relation among all skeleton key points belonging to the same pet to be identified according to the position information of the skeleton key points of the pet to be identified in the image of each frame of image in the image frame sequence;
and the first behavior recognition submodule is used for performing behavior recognition processing according to the connection relation among all skeleton key points belonging to the same pet to be recognized in each frame of image and the time sequence identification of all the skeleton key points in each frame of image to obtain a behavior recognition result of the pet to be recognized.
Optionally, the first behavior identification submodule includes:
the skeleton modeling unit is used for inputting the connection relation among all skeleton key points belonging to the same pet to be identified in each frame of image and the time sequence identification of all skeleton key points in each frame of image into a skeleton modeling layer of a first behavior identification model in sequence according to the arrangement sequence of the images in the image frame sequence to perform dynamic skeleton modeling processing, so as to obtain a dynamic skeleton model of the pet to be identified, wherein the dynamic skeleton model is used for reflecting the connection relation among all skeleton key points of the pet to be identified in the image frame sequence and the cross-continuous time connection of the same skeleton key point in different images;
and the behavior identification unit is used for inputting the dynamic skeleton model into the convolution layer of the first behavior identification model to perform space-time convolution processing so as to obtain a behavior identification result of the pet to be identified.
Optionally, the apparatus further comprises:
the target detection module is used for carrying out target detection processing on each frame of image in the image frame sequence to obtain a detection frame corresponding to each frame of image, the detection frame is used for framing out the area of the pet to be identified in the image, and one detection frame contains the pet to be identified;
the key point detection module comprises:
and the key point detection submodule is used for respectively carrying out pet skeleton key point detection processing on each detection frame contained in the image aiming at each frame of image in the image frame sequence to obtain skeleton key point information corresponding to each pet to be identified contained in the image.
Optionally, the behavior recognition module includes:
the pet skeleton map construction sub-module is used for constructing a pet skeleton map of each pet to be identified according to the position information of the skeleton key points of each pet to be identified, wherein the position information of the skeleton key points of each pet to be identified is contained in each frame of image frame sequence, the pet skeleton map comprises all the skeleton key points of the pet to be identified, and two adjacent skeleton key points are connected through a straight line;
and the second behavior recognition submodule is used for sequentially inputting the pet skeleton diagram of the pet to be recognized in each frame of image into a second behavior recognition model according to the arrangement sequence of the images in the image frame sequence to perform behavior recognition processing so as to obtain a behavior recognition result of the pet to be recognized.
Optionally, the target detection module includes:
the target detection submodule is used for carrying out joint processing on each frame of image in the image frame sequence based on a target detection model and a tracking model to obtain a detection frame corresponding to each frame of image; the target detection model is used for carrying out target detection processing on the image, and the tracking model is used for correcting deviation occurring in the process of carrying out target detection processing on the target detection model.
Optionally, the keypoint detection sub-module includes:
the key point detection unit is used for respectively carrying out pet skeleton key point detection processing on each detection frame contained in the image aiming at each frame of image in the image frame sequence and determining the number of detected skeleton key points of each pet to be identified;
the information determining unit is used for determining the skeleton key point information corresponding to the pet to be identified according to the position of each skeleton key point of the pet to be identified in the first image if the number of the skeleton key points of the pet to be identified in the first image is greater than or equal to a preset point threshold value; the first image is any one frame image in the image frame sequence;
and the image deleting unit is used for determining that the detection of the pet bone key points of the pet to be identified in the first image fails if the number of the bone key points corresponding to the pet to be identified in the first image is less than a preset point threshold value, and deleting the first image from the image frame sequence.
Optionally, the keypoint detection sub-module further includes:
and the identification stopping unit is used for determining that the pet skeleton key point detection for the image frame sequence fails, and stopping the behavior identification processing of the pet to be identified, if the number of image frames contained in the image frame sequence is less than a preset frame number threshold value.
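The two thresholds handled by these units reduce to a short filtering pass over the sequence. The concrete values below are placeholders, since the patent leaves both thresholds as "preset":

```python
def filter_sequence(frame_keypoints, point_threshold=10, frame_threshold=16):
    """Apply the per-image point threshold and the sequence frame threshold.

    frame_keypoints: one entry per frame, each a list of the skeleton key
                     points detected for the pet to be identified
    Returns the retained frames, or None when key point detection for the
    whole image frame sequence is deemed to have failed.
    """
    # Delete every frame whose detected key point count falls below the threshold.
    kept = [kps for kps in frame_keypoints if len(kps) >= point_threshold]
    # Too few usable frames left: stop behavior recognition for this pet.
    if len(kept) < frame_threshold:
        return None
    return kept
```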
Optionally, the skeleton key points of the pet to be identified comprise limb key points, trunk key points and head key points; the limb key points comprise 4 paw key points, 4 elbow key points and 4 knee key points; the trunk key points comprise a hip key point and a neck key point; and the head key points comprise a left eye key point, a right eye key point and a nose key point.
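Counting that list gives 17 skeleton key points per pet. An illustrative index assignment and edge list follows; the naming, ordering and exact connectivity are assumptions, not disclosed by the patent:

```python
PET_KEYPOINTS = [
    "left_eye", "right_eye", "nose",          # head key points
    "neck", "hip",                            # trunk key points
    "front_left_elbow", "front_right_elbow",  # 4 elbow key points
    "rear_left_elbow", "rear_right_elbow",
    "front_left_knee", "front_right_knee",    # 4 knee key points
    "rear_left_knee", "rear_right_knee",
    "front_left_paw", "front_right_paw",      # 4 paw key points
    "rear_left_paw", "rear_right_paw",
]
assert len(PET_KEYPOINTS) == 17

# Adjacent key points: usable both for the skeleton map's straight lines and
# for the graph adjacency of the space-time convolution sketched earlier.
SKELETON_EDGES = [("left_eye", "nose"), ("right_eye", "nose"),
                  ("nose", "neck"), ("neck", "hip")]
for leg in ("front_left", "front_right", "rear_left", "rear_right"):
    root = "neck" if leg.startswith("front") else "hip"
    SKELETON_EDGES += [(root, leg + "_elbow"),
                       (leg + "_elbow", leg + "_knee"),
                       (leg + "_knee", leg + "_paw")]
```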
Optionally, the apparatus further comprises:
the sample set construction module is used for constructing a training sample set for the pet to be recognized, wherein the training sample set comprises an image sample of the pet to be recognized and labeling information corresponding to the image sample;
the iterative training module is used for performing iterative training on the deep learning model according to the training sample set to obtain a processing result corresponding to the image sample;
the parameter adjusting module is used for calculating a loss value of the deep learning model according to the processing result and the labeling information, and adjusting the model parameters of the deep learning model according to the loss value until a preset termination condition is met, to obtain a trained deep learning model;
the deep learning model comprises any one of the first behavior recognition model, the second behavior recognition model, the target detection model and the tracking model.
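These three modules map onto a standard supervised training loop: a forward pass on the image samples, a loss against the labeling information, a parameter update, repeated until a termination condition holds. A PyTorch sketch; the optimizer, learning rate and stopping rule are assumptions standing in for the patent's unspecified "preset termination condition":

```python
import torch

def train_model(model, loader, loss_fn, max_epochs=50, lr=1e-3, loss_floor=0.01):
    """Iteratively train any one of the four deep learning models (sketch only)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(max_epochs):
        running = 0.0
        for samples, labels in loader:       # image samples + labeling information
            outputs = model(samples)         # processing result for the samples
            loss = loss_fn(outputs, labels)  # loss value from result vs. labels
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                 # adjust the model parameters
            running += loss.item()
        if running / max(len(loader), 1) < loss_floor:
            break                            # preset termination condition met
    return model
```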
In summary, the pet behavior recognition device provided in the embodiment of the present invention performs pet skeleton key point detection processing on an image frame sequence containing the pet to be recognized and performs behavior recognition processing based on the detected skeleton key point information, so that factors such as the color and exposure of the images in the image frame sequence and the size of the pet to be recognized no longer distort the result, improving the accuracy of pet behavior recognition.
Since the device embodiment is substantially similar to the method embodiment, its description is relatively brief; for relevant details, refer to the corresponding parts of the method embodiment.
The embodiments in the present specification are described in a progressive manner: each embodiment focuses on its differences from the others, and for the parts that are the same or similar, the embodiments may be referred to one another.
As for the device in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the method embodiment and is not elaborated here.
An embodiment of the present invention further provides a non-transitory computer-readable storage medium. When the instructions in the storage medium are executed by a processor of a device (a server or a terminal), the device is enabled to perform the pet behavior identification method described in the embodiment corresponding to fig. 1, so that description is not repeated here, nor are the beneficial effects, which are the same as those of the method. For technical details not disclosed in this storage medium embodiment, reference is made to the description of the method embodiments of the present application.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
The pet behavior recognition method, the pet behavior recognition device and the readable storage medium provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementation of the invention, and the description of these examples is intended only to help in understanding the method and its core ideas. Meanwhile, a person skilled in the art may, following the ideas of the present invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (21)

1. A pet behavior recognition method, characterized in that the method comprises:
acquiring an image frame sequence to be detected, wherein the image frame sequence contains a plurality of frames of images of a pet to be identified;
performing pet skeleton key point detection processing on the image frame sequence to obtain skeleton key point information corresponding to each frame of image, wherein the skeleton key point information comprises position information of the skeleton key points of the pet to be identified in the image;
and performing behavior recognition processing according to the skeleton key point information corresponding to each frame of image in the image frame sequence to obtain a behavior recognition result of the pet to be recognized.
2. The method according to claim 1, wherein the skeleton key point information further comprises a time sequence identifier of the skeleton key points of the pet to be identified, the time sequence identifier being used for marking the arrangement order, in the image frame sequence, of the image frame to which the skeleton key points belong; and the performing behavior recognition processing according to the skeleton key point information corresponding to each frame of image in the image frame sequence to obtain the behavior recognition result of the pet to be recognized comprises:
determining, for each frame of image in the image frame sequence, the connection relationships among the skeleton key points belonging to the same pet to be identified according to the position information of the skeleton key points of the pet to be identified in that image;
and performing behavior recognition processing according to the connection relationships among the skeleton key points belonging to the same pet to be recognized in each frame of image and the time sequence identifiers of those skeleton key points, to obtain the behavior recognition result of the pet to be recognized.
3. The method according to claim 2, wherein the performing behavior recognition processing according to the connection relationships among the skeleton key points belonging to the same pet to be recognized in each frame of image and the time sequence identifiers of those skeleton key points, to obtain the behavior recognition result of the pet to be recognized, comprises:
according to the arrangement sequence of the images in the image frame sequence, sequentially inputting the connection relationships among the skeleton key points belonging to the same pet to be identified in each frame of image, together with the time sequence identifiers of those skeleton key points, into a skeleton modeling layer of a first behavior identification model for dynamic skeleton modeling processing, to obtain a dynamic skeleton model of the pet to be identified, wherein the dynamic skeleton model reflects both the connection relationships among the skeleton key points of the pet to be identified within the image frame sequence and the connection of the same skeleton key point across consecutive frames;
and inputting the dynamic skeleton model into a convolution layer of the first behavior identification model for space-time convolution processing, to obtain the behavior recognition result of the pet to be identified.
4. The method of claim 1, wherein before the performing pet skeleton key point detection processing on the image frame sequence to obtain the skeleton key point information corresponding to each frame of image, the method further comprises:
performing target detection processing on each frame of image in the image frame sequence to obtain the detection frame corresponding to each frame of image, wherein a detection frame is used for framing the region of a pet to be identified in the image, and each detection frame contains one pet to be identified;
the performing pet skeleton key point detection processing on the image frame sequence to obtain the skeleton key point information corresponding to each frame of image comprises:
for each frame of image in the image frame sequence, respectively performing pet skeleton key point detection processing on each detection frame contained in the image, to obtain the skeleton key point information corresponding to each pet to be identified contained in the image.
5. The method according to claim 4, wherein the performing behavior recognition processing according to the skeleton key point information corresponding to each frame of image in the image frame sequence to obtain the behavior recognition result of the pet to be recognized comprises:
constructing, for each frame of image in the image frame sequence, a pet skeleton map of each pet to be identified according to the position information of the skeleton key points of each pet to be identified contained in the image, wherein the pet skeleton map comprises all the skeleton key points of the pet to be identified, and two adjacent skeleton key points are connected by a straight line;
and sequentially inputting, according to the arrangement sequence of the images in the image frame sequence, the pet skeleton map of the pet to be identified in each frame of image into a second behavior identification model for behavior identification processing, to obtain the behavior identification result of the pet to be identified.
6. The method according to claim 4, wherein the performing the target detection processing on each frame of image in the image frame sequence to obtain a detection frame corresponding to each frame of image comprises:
performing joint processing on each frame of image in the image frame sequence based on a target detection model and a tracking model to obtain the detection frame corresponding to each frame of image, wherein the target detection model is used for performing target detection processing on the images, and the tracking model is used for correcting deviations that occur while the target detection model performs the target detection processing.
7. The method according to claim 4, wherein the performing pet skeleton key point detection processing on each detection frame contained in the image respectively for each frame of image in the image frame sequence, to obtain the skeleton key point information corresponding to each pet to be identified contained in the image, comprises:
for each frame of image in the image frame sequence, respectively performing pet skeleton key point detection processing on each detection frame contained in the image, and determining the number of detected skeleton key points of each pet to be identified;
if the number of skeleton key points of the pet to be identified in a first image is greater than or equal to a preset point threshold value, determining the skeleton key point information corresponding to the pet to be identified according to the position of each skeleton key point of the pet to be identified in the first image, wherein the first image is any one frame of image in the image frame sequence;
if the number of skeleton key points corresponding to the pet to be identified in the first image is less than the preset point threshold value, determining that the pet skeleton key point detection for the pet to be identified in the first image fails, and deleting the first image from the image frame sequence.
8. The method of claim 7, further comprising:
and if the number of image frames contained in the image frame sequence is less than a preset frame number threshold, determining that the pet skeleton key point detection for the image frame sequence fails, and stopping the behavior recognition processing of the pet to be recognized.
9. The method according to any one of claims 1 to 8, wherein the skeleton key points of the pet to be identified comprise limb key points, trunk key points and head key points; the limb key points comprise 4 paw key points, 4 elbow key points and 4 knee key points; the trunk key points comprise a hip key point and a neck key point; and the head key points comprise a left eye key point, a right eye key point and a nose key point.
10. The method of claim 1, further comprising:
constructing a training sample set for the pet to be identified, wherein the training sample set comprises an image sample of the pet to be identified and labeling information corresponding to the image sample;
performing iterative training on a deep learning model according to the training sample set to obtain a processing result corresponding to the image sample;
calculating a loss value of the deep learning model according to the processing result and the labeling information, and adjusting model parameters of the deep learning model according to the loss value until a preset termination condition is met to obtain a trained deep learning model;
the deep learning model comprises any one of a first behavior recognition model, a second behavior recognition model, a target detection model and a tracking model.
11. A pet behavior recognition device, the device comprising:
the image acquisition module is used for acquiring an image frame sequence to be detected, wherein the image frame sequence contains a plurality of frames of images of a pet to be recognized;
the key point detection module is used for performing pet skeleton key point detection processing on the image frame sequence to obtain skeleton key point information corresponding to each frame of image, wherein the skeleton key point information comprises position information of the skeleton key points of the pet to be recognized in the image;
and the behavior recognition module is used for performing behavior recognition processing according to the skeleton key point information corresponding to each frame of image in the image frame sequence to obtain a behavior recognition result of the pet to be recognized.
12. The apparatus of claim 11, wherein the skeleton key point information further comprises a time sequence identifier of the skeleton key points of the pet to be identified, the time sequence identifier being used for marking the arrangement order, in the image frame sequence, of the image frame to which the skeleton key points belong; and the behavior recognition module comprises:
the connection relation determining sub-module is used for determining, for each frame of image in the image frame sequence, the connection relationships among the skeleton key points belonging to the same pet to be identified according to the position information of the skeleton key points of the pet to be identified in that image;
and the first behavior recognition submodule is used for performing behavior recognition processing according to the connection relationships among the skeleton key points belonging to the same pet to be recognized in each frame of image and the time sequence identifiers of those skeleton key points, to obtain the behavior recognition result of the pet to be recognized.
13. The apparatus of claim 12, wherein the first behavior identification submodule comprises:
the skeleton modeling unit is used for sequentially inputting, according to the arrangement sequence of the images in the image frame sequence, the connection relationships among the skeleton key points belonging to the same pet to be identified in each frame of image, together with the time sequence identifiers of those skeleton key points, into a skeleton modeling layer of the first behavior identification model for dynamic skeleton modeling processing, to obtain a dynamic skeleton model of the pet to be identified, wherein the dynamic skeleton model reflects both the connection relationships among the skeleton key points of the pet to be identified within the image frame sequence and the connection of the same skeleton key point across consecutive frames;
and the behavior identification unit is used for inputting the dynamic skeleton model into the convolution layer of the first behavior identification model to perform space-time convolution processing so as to obtain a behavior identification result of the pet to be identified.
14. The apparatus of claim 11, further comprising:
the target detection module is used for performing target detection processing on each frame of image in the image frame sequence to obtain the detection frame corresponding to each frame of image, wherein a detection frame is used for framing the region of a pet to be identified in the image, and each detection frame contains one pet to be identified;
the key point detection module comprises:
and the key point detection submodule is used for, for each frame of image in the image frame sequence, respectively performing pet skeleton key point detection processing on each detection frame contained in the image, to obtain the skeleton key point information corresponding to each pet to be identified contained in the image.
15. The apparatus of claim 14, wherein the behavior recognition module comprises:
the pet skeleton map construction sub-module is used for constructing, for each frame of image in the image frame sequence, a pet skeleton map of each pet to be identified according to the position information of the skeleton key points of each pet to be identified contained in the image, wherein the pet skeleton map comprises all the skeleton key points of the pet to be identified, and two adjacent skeleton key points are connected by a straight line;
and the second behavior recognition submodule is used for sequentially inputting, according to the arrangement sequence of the images in the image frame sequence, the pet skeleton map of the pet to be recognized in each frame of image into a second behavior recognition model for behavior recognition processing, to obtain the behavior recognition result of the pet to be recognized.
16. The apparatus of claim 14, wherein the target detection module comprises:
the target detection submodule is used for performing joint processing on each frame of image in the image frame sequence based on a target detection model and a tracking model to obtain the detection frame corresponding to each frame of image, wherein the target detection model is used for performing target detection processing on the images, and the tracking model is used for correcting deviations that occur while the target detection model performs the target detection processing.
17. The apparatus of claim 14, wherein the keypoint detection sub-module comprises:
the key point detection unit is used for, for each frame of image in the image frame sequence, respectively performing pet skeleton key point detection processing on each detection frame contained in the image, and determining the number of detected skeleton key points of each pet to be identified;
the information determining unit is used for determining the skeleton key point information corresponding to the pet to be identified according to the position of each skeleton key point of the pet to be identified in a first image if the number of skeleton key points of the pet to be identified in the first image is greater than or equal to a preset point threshold value, wherein the first image is any one frame of image in the image frame sequence;
and the image deleting unit is used for determining that the pet skeleton key point detection for the pet to be identified in the first image fails, and deleting the first image from the image frame sequence, if the number of skeleton key points corresponding to the pet to be identified in the first image is less than the preset point threshold value.
18. The apparatus of claim 17, wherein the keypoint detection sub-module further comprises:
and the identification stopping unit is used for determining that the pet skeleton key point detection for the image frame sequence fails, and stopping the behavior identification processing of the pet to be identified, if the number of image frames contained in the image frame sequence is less than a preset frame number threshold value.
19. The apparatus according to any one of claims 11 to 18, wherein the skeleton key points of the pet to be identified comprise limb key points, trunk key points and head key points; the limb key points comprise 4 paw key points, 4 elbow key points and 4 knee key points; the trunk key points comprise a hip key point and a neck key point; and the head key points comprise a left eye key point, a right eye key point and a nose key point.
20. The apparatus of claim 11, further comprising:
the sample set construction module is used for constructing a training sample set for the pet to be recognized, wherein the training sample set comprises an image sample of the pet to be recognized and labeling information corresponding to the image sample;
the iterative training module is used for performing iterative training on the deep learning model according to the training sample set to obtain a processing result corresponding to the image sample;
the parameter adjusting module is used for calculating a loss value of the deep learning model according to the processing result and the labeling information, and adjusting the model parameters of the deep learning model according to the loss value until a preset termination condition is met, to obtain a trained deep learning model;
the deep learning model comprises any one of a first behavior recognition model, a second behavior recognition model, a target detection model and a tracking model.
21. A machine-readable storage medium having stored thereon instructions which, when executed by one or more processors of a device, cause the device to perform a pet behavior identification method according to any one of claims 1 to 10.
CN202211006137.9A 2022-08-22 2022-08-22 Pet behavior identification method and device and readable storage medium Pending CN115083022A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211006137.9A CN115083022A (en) 2022-08-22 2022-08-22 Pet behavior identification method and device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211006137.9A CN115083022A (en) 2022-08-22 2022-08-22 Pet behavior identification method and device and readable storage medium

Publications (1)

Publication Number Publication Date
CN115083022A true CN115083022A (en) 2022-09-20

Family

ID=83244326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211006137.9A Pending CN115083022A (en) 2022-08-22 2022-08-22 Pet behavior identification method and device and readable storage medium

Country Status (1)

Country Link
CN (1) CN115083022A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108985259A (en) * 2018-08-03 2018-12-11 百度在线网络技术(北京)有限公司 Human motion recognition method and device
CN112183153A (en) * 2019-07-01 2021-01-05 中国移动通信集团浙江有限公司 Object behavior detection method and device based on video analysis
CN112528891A (en) * 2020-12-16 2021-03-19 重庆邮电大学 Bidirectional LSTM-CNN video behavior identification method based on skeleton information
CN113763429A (en) * 2021-09-08 2021-12-07 广州市健坤网络科技发展有限公司 Pig behavior recognition system and method based on video
CN113902030A (en) * 2021-10-25 2022-01-07 郑州学安网络科技有限公司 Behavior identification method and apparatus, terminal device and storage medium
CN114898467A (en) * 2022-05-26 2022-08-12 华南师范大学 Human motion action recognition method, system and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SIJIE YAN et al.: "Spatial Temporal Graph Convolutional Networks for Skeleton Based Action Recognition", arXiv:1801.07455v2 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116884034A (en) * 2023-07-10 2023-10-13 中电金信软件有限公司 Object identification method and device

Similar Documents

Publication Publication Date Title
CN108922622B (en) Animal health monitoring method, device and computer readable storage medium
WO2023060777A1 (en) Pig body size and weight estimation method based on deep learning
WO2021000423A1 (en) Pig weight measurement method and apparatus
CN111368766B (en) Deep learning-based cow face detection and recognition method
CN109886192A (en) A kind of ecological environment intelligent monitor system
CN109141248A (en) Pig weight measuring method and system based on image
CN112862757A (en) Weight evaluation system based on computer vision technology and implementation method
CN109902681B (en) User group relation determining method, device, equipment and storage medium
CN107918629A (en) The correlating method and device of a kind of alarm failure
CN112070071B (en) Method and device for labeling objects in video, computer equipment and storage medium
CN110334593A (en) Pet recognition algorithms and system
Noe et al. Automatic detection and tracking of mounting behavior in cattle using a deep learning-based instance segmentation model
CN115880558A (en) Farming behavior detection method and device, electronic equipment and storage medium
CN115083022A (en) Pet behavior identification method and device and readable storage medium
CN111178201A (en) Human body sectional type tracking method based on OpenPose posture detection
CN113780145A (en) Sperm morphology detection method, sperm morphology detection device, computer equipment and storage medium
KR102332252B1 (en) Apparatus and method for analyzing oestrus behavior pattern of ruminant animal based on image analysis
CN114663917A (en) Multi-view-angle-based multi-person three-dimensional human body pose estimation method and device
CN116263949A (en) Weight measurement method, device, equipment and storage medium
CN111079617A (en) Poultry identification method and device, readable storage medium and electronic equipment
Riego del Castillo et al. Estimation of lamb weight using transfer learning and regression
CN110378515A (en) A kind of prediction technique of emergency event, device, storage medium and server
CN115457036B (en) Detection model training method, intelligent point counting method and related equipment
CN113255408B (en) Behavior recognition method, behavior recognition device, electronic equipment and storage medium
CN116052088B (en) Point cloud-based activity space measurement method, system and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20220920)