CN116259072A - Animal identification method, device, equipment and storage medium

Animal identification method, device, equipment and storage medium

Info

Publication number
CN116259072A
Authority
CN
China
Prior art keywords
target
animal
behavior
determining
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310037188.6A
Other languages
Chinese (zh)
Other versions
CN116259072B (en)
Inventor
王尔康
周松河
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HUARUI YANNENG TECHNOLOGY (SHENZHEN) CO LTD
Original Assignee
HUARUI YANNENG TECHNOLOGY (SHENZHEN) CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HUARUI YANNENG TECHNOLOGY (SHENZHEN) CO LTD
Priority to CN202310037188.6A
Publication of CN116259072A
Application granted
Publication of CN116259072B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/457Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by analysing connectivity, e.g. edge linking, connected component analysis or slices
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/765Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/70Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in livestock or poultry

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The present application relates to the field of animal detection and identification, and in particular to an animal identification method, apparatus, device, and storage medium. The method includes: acquiring a plurality of target images, where the target images include a target animal; performing feature extraction on each target image based on a feature extraction model to obtain a plurality of gesture features of the target animal; determining an action set of the target animal based on the plurality of gesture features, the action set including at least one action of the target animal, each action consisting of at least one gesture feature; acquiring audio information of the target animal; determining animal behavior based on the audio information and the action set; and acquiring explanation information corresponding to the animal behavior. The method facilitates a more accurate understanding of animal behavior.

Description

Animal identification method, device, equipment and storage medium
Technical Field
The present application relates to the field of animal detection and identification, and in particular, to an animal identification method, apparatus, device, and storage medium.
Background
Visitors can see a variety of animals when touring a zoo, but they often cannot understand the animals' behavior, which degrades the experience. Therefore, how to understand animal behavior more accurately is a problem to be solved.
Disclosure of Invention
In order to facilitate a more accurate understanding of animal behavior, the present application provides an animal identification method, apparatus, device, and storage medium.
In a first aspect, the present application provides an animal identification method, which adopts the following technical scheme:
a method of animal identification comprising:
acquiring a plurality of target images, wherein the target images comprise target animals;
performing feature extraction on each target image based on a feature extraction model to obtain a plurality of gesture features of the target animal;
determining a set of actions of the target animal based on the plurality of gesture features, the set of actions including at least one action of the target animal, each action consisting of at least one gesture feature;
acquiring audio information of a target animal;
determining animal behavior based on the audio information and the set of actions;
and acquiring explanation information corresponding to the animal behaviors.
By adopting this technical scheme, a plurality of gesture features of the target animal are obtained by identifying the animal in the plurality of target images; an action set of the target animal is determined based on the gesture features; and the behavior of the target animal is determined jointly from the action set and the target animal's audio information, so that more accurate animal behavior is conveniently determined. After the animal behavior is determined, explanation information is acquired based on it, which in turn helps the user understand the target animal's behavior.
In one possible implementation, the performing feature extraction on each target image based on the feature extraction model to obtain a plurality of gesture features of the target animal includes:
performing target recognition on each target image, and determining the position of each target animal in each target image;
determining, based on the positions of the target animals in each target image, whether there are target animals in contact with each other in each image;
determining any target image in which target animals are in contact with each other as a marked image, and any target image in which no target animals are in contact as a conventional image;
if the number of marked images is greater than or equal to a preset number, performing image segmentation on each marked image to obtain sub-images corresponding to each marked image, wherein each sub-image comprises either all mutually contacting target animals or one target animal;
image segmentation is carried out on each conventional image to obtain sub-images corresponding to each conventional image, and each sub-image comprises a target animal;
and carrying out feature extraction on all the sub-images corresponding to the conventional images and all the sub-images corresponding to the marked images based on a feature extraction model to obtain a plurality of gesture features corresponding to each target animal.
By adopting this technical scheme, if the number of marked images is greater than or equal to the preset number, the mutual-contact behavior of the animals is considered credible and the marked images have reference value; the animals in contact with each other in each marked image are then segmented together, so that mutually contacting target animals appear within one sub-image, which facilitates more accurate gesture feature extraction.
In one possible implementation manner, before feature extraction is performed on each target image based on the feature extraction model, the method further includes:
performing target identification based on the target image, and determining the category of the target animal;
and determining a feature extraction model corresponding to the target animal based on the category.
In one possible implementation, the determining the set of actions of the target animal based on the plurality of gesture features includes:
acquiring a corresponding historical action library based on the category of the target animal, wherein the historical action library comprises a plurality of historical actions of the target animal and marker gestures corresponding to each historical action, with at least one marker gesture feature among the gesture features of each historical action;
determining, from the plurality of gesture features currently acquired for the target animal, the gesture features that overlap with the marker gesture features in the historical action library, as target gesture features;
determining the historical actions corresponding to the target gesture features as target actions;
and determining an action set based on each target action.
In one possible implementation, the determining animal behavior based on the audio information and the set of actions includes:
screening the actions in the action set, deleting preset normal actions, and obtaining an effective action set;
determining a first behavior of the target animal based on the set of valid actions;
determining a second behavior of the target animal based on the audio information;
scoring the first behavior and the second behavior respectively based on the current moment and a preset scoring model to obtain a first score corresponding to the first behavior and a second score corresponding to the second behavior, wherein the scoring model is trained on the regularity of each behavior of the target animal, the frequency of each behavior, and the historical occurrence times of each behavior;
and determining the animal behavior from the first behavior and the second behavior based on the first score and the second score.
In one possible implementation, the determining animal behavior from the first behavior and the second behavior based on the first score and the second score includes:
determining the behavior with the higher of the first score and the second score as the animal behavior;
if the first score and the second score are both greater than a preset threshold, determining that the animal behavior includes both the first behavior and the second behavior;
and if neither the first score nor the second score is greater than the preset threshold, determining the second behavior as the animal behavior.
In one possible implementation manner, the acquiring the explanation information corresponding to the animal behavior includes:
determining an information format based on the animal behavior and the category;
and acquiring explanation information corresponding to the animal behaviors based on the information format.
In a second aspect, the present application provides an animal identification device, which adopts the following technical scheme:
an animal identification device comprising:
the target image acquisition module is used for acquiring a plurality of target images, wherein the target images comprise target animals;
the feature extraction module is used for carrying out feature extraction on each target image based on the feature extraction model to obtain a plurality of gesture features of the target animal;
an action set determination module for determining an action set of the target animal based on the plurality of gesture features, the action set comprising at least one action of the target animal, each action consisting of at least one gesture feature;
the audio information acquisition module is used for acquiring the audio information of the target animal;
an animal behavior determination module for determining animal behavior based on the audio information and the set of actions;
and the explanation information acquisition module is used for acquiring explanation information corresponding to the animal behaviors.
By adopting this technical scheme, the device obtains a plurality of gesture features of the target animal by identifying the animal in the plurality of target images, determines an action set of the target animal based on the gesture features, and determines the behavior of the target animal jointly from the action set and the target animal's audio information, so that more accurate animal behavior is determined; after the animal behavior is determined, the device acquires explanation information based on it, thereby facilitating the user's understanding of the target animal's behavior.
In one possible implementation manner, when the feature extraction module performs feature extraction on each target image based on the feature extraction model to obtain a plurality of gesture features of the target animal, the feature extraction module is specifically configured to:
performing target recognition on each target image, and determining the position of each target animal in each target image;
determining, based on the positions of the target animals in each target image, whether there are target animals in contact with each other in each image;
determining any target image in which target animals are in contact with each other as a marked image, and any target image in which no target animals are in contact as a conventional image;
if the number of marked images is greater than or equal to a preset number, performing image segmentation on each marked image to obtain sub-images corresponding to each marked image, wherein each sub-image comprises either all mutually contacting target animals or one target animal;
image segmentation is carried out on each conventional image to obtain sub-images corresponding to each conventional image, and each sub-image comprises a target animal;
and carrying out feature extraction on all the sub-images corresponding to the conventional images and all the sub-images corresponding to the marked images based on a feature extraction model to obtain a plurality of gesture features corresponding to each target animal.
In one possible implementation, the apparatus further includes:
the category determining module is used for carrying out target recognition based on the target image and determining the category of the target animal;
and the extraction model determining module is used for determining a characteristic extraction model corresponding to the target animal based on the category.
In one possible implementation, when the action set determination module is determining the action set of the target animal based on the plurality of gesture features, it is specifically configured to:
acquiring a corresponding historical action library based on the category of the target animal, wherein the historical action library comprises a plurality of historical actions of the target animal and marker gestures corresponding to each historical action, with at least one marker gesture feature among the gesture features of each historical action;
determining, from the plurality of gesture features currently acquired for the target animal, the gesture features that overlap with the marker gesture features in the historical action library, as target gesture features;
determining the historical actions corresponding to the target gesture features as target actions;
and determining an action set based on each target action.
In one possible implementation, when the animal behavior determination module is determining animal behavior based on the audio information and the set of actions, it is specifically configured to:
screening the actions in the action set, deleting preset normal actions, and obtaining an effective action set;
determining a first behavior of the target animal based on the set of valid actions;
determining a second behavior of the target animal based on the audio information;
scoring the first behavior and the second behavior respectively based on the current moment and a preset scoring model to obtain a first score corresponding to the first behavior and a second score corresponding to the second behavior, wherein the scoring model is trained on the regularity of each behavior of the target animal, the frequency of each behavior, and the historical occurrence times of each behavior;
and determining the animal behavior from the first behavior and the second behavior based on the first score and the second score.
In one possible implementation, when the animal behavior determination module is determining animal behavior from the first behavior and the second behavior based on the first score and the second score, it is specifically configured to:
determining the behavior with the higher of the first score and the second score as the animal behavior;
if the first score and the second score are both greater than a preset threshold, determining that the animal behavior includes both the first behavior and the second behavior;
and if neither the first score nor the second score is greater than the preset threshold, determining the second behavior as the animal behavior.
In one possible implementation manner, when the explanation information obtaining module obtains the explanation information corresponding to the animal behavior, the explanation information obtaining module is specifically configured to:
determining an information format based on the animal behavior and the category;
and acquiring explanation information corresponding to the animal behaviors based on the information format.
In a third aspect, the present application provides an electronic device, which adopts the following technical scheme:
an electronic device, the electronic device comprising:
at least one processor;
a memory;
at least one application, wherein the at least one application is stored in the memory and configured to be executed by the at least one processor, the at least one application being configured to perform the above animal identification method.
In a fourth aspect, the present application provides a computer readable storage medium, which adopts the following technical scheme:
a computer-readable storage medium on which a computer program is stored, the computer program being loadable by a processor to perform the above animal identification method.
In summary, the present application includes at least one of the following beneficial technical effects:
1. a plurality of gesture features of the target animal are obtained by identifying the animal in a plurality of target images; an action set of the target animal is determined based on the gesture features; and the behavior of the target animal is determined jointly from the action set and the target animal's audio information, so that more accurate animal behavior is determined. After the animal behavior is determined, explanation information is acquired based on it, which in turn helps the user understand the target animal's behavior;
2. if the number of marked images is greater than or equal to the preset number, the mutual-contact behavior of the animals is trusted and the marked images have reference value; the animals in contact with each other in each marked image are then segmented together, so that mutually contacting target animals appear within one sub-image, which facilitates more accurate extraction of the gesture features.
Drawings
FIG. 1 is a schematic flow chart of an animal identification method in an embodiment of the present application;
FIG. 2 is a schematic diagram of an animal identification device according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
The present application is described in further detail below in conjunction with figures 1-3.
After reading this specification, those skilled in the art may make modifications to the embodiments that involve no creative contribution; all such modifications are protected by patent law only within the scope of the claims of the present application.
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
In addition, the term "and/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist together, or B exists alone. In this context, unless otherwise specified, the term "/" generally indicates that the associated objects are in an "or" relationship.
An embodiment of the present application provides an animal identification method, executed by an electronic device. Referring to fig. 1, the method includes steps S01-S06, where:
step S01, a plurality of target images are acquired, wherein the target images comprise target animals.
In the embodiment of the present application, the target images may be acquired directly, that is, a plurality of images of the target animal may be photographed as the target images; alternatively, images may be extracted from a piece of video information as the target images. A target image may contain only one target animal of a given kind, or a plurality of target animals of the same kind.
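As a concrete illustration of this step, the following is a minimal sketch of sampling frames from a video as candidate target images, assuming OpenCV is available; the sampling interval and the function name are assumptions of the sketch, not details given by the patent.

```python
# A minimal frame-sampling sketch using OpenCV (an assumption of this example).
import cv2

def sample_target_images(video_path: str, every_n_frames: int = 30) -> list:
    """Extract every n-th frame of a video as a candidate target image."""
    images = []
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:                     # end of stream or read error
            break
        if idx % every_n_frames == 0:
            images.append(frame)
        idx += 1
    cap.release()
    return images
```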
And step S02, carrying out feature extraction on each target image based on the feature extraction model to obtain a plurality of gesture features of the target animal.
In the embodiment of the application, a gesture feature represents one gesture of the target animal; if every target image contains the same single target animal, one gesture feature of that animal is obtained from each target image after feature extraction.
Step S03, determining an action set of the target animal based on a plurality of gesture features, wherein the action set at least comprises one action of the target animal, and each action at least comprises one gesture feature.
In the embodiment of the application, an animal behavior should be characterized by at least one action; for example, a gorilla's angry behavior may be represented by the two actions of beating the chest and roaring, and a peacock's display behavior may be represented by the action of fanning its tail. Each action should include at least one gesture: a peacock fanning its tail can be determined by a single gesture feature, but a gorilla's chest-beating requires multiple gestures to compose a complete chest-beating action. The action set of a target animal should include at least one action determined from the gesture features, or a plurality of such actions.
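The containment relationship just described (an action consists of at least one gesture feature, and the action set collects at least one action) can be pictured with a small data model. All type and field names below are assumptions made for this sketch.

```python
# Illustrative data model; names are assumptions, not patent terminology.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class GestureFeature:
    animal_id: int      # which target animal the gesture belongs to
    frame_index: int    # which target image it was extracted from
    vector: tuple       # the extracted gesture descriptor

@dataclass
class Action:
    name: str                                     # e.g. "chest_beating"
    gestures: list = field(default_factory=list)  # at least one GestureFeature

# The action set is simply the collection of actions recognized for one animal.
action_set: list[Action] = []
```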
And S04, acquiring the audio information of the target animal.
An animal's behavior is usually expressed both in motion and in sound. In the present embodiment, the audio information may be recorded separately or extracted from a video file.
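As one hedged example of the extraction option, the audio track of a video file can be pulled out with the ffmpeg command-line tool; the codec and sample-rate flags below are common choices assumed for this sketch, not values specified by the patent.

```python
# Extract a video's audio track via the ffmpeg CLI (assumed to be installed).
import subprocess

def extract_audio(video_path: str, wav_path: str) -> None:
    subprocess.run(
        ["ffmpeg", "-i", video_path,
         "-vn",                                   # drop the video stream
         "-acodec", "pcm_s16le", "-ar", "16000",  # 16-bit PCM at 16 kHz
         wav_path],
        check=True,
    )
```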
Step S05, determining animal behaviors based on the audio information and the action set.
In the embodiment of the application, the animal behaviors of the target animal are determined through the audio information and the action set, so that the accuracy is higher.
And step S06, acquiring explanation information corresponding to animal behaviors.
After the animal behavior is determined, explanation information for that behavior is acquired so that the user can conveniently understand and learn from it. The format of the explanation information is not specifically limited in the embodiment of the present application; it may be, for example, an audio format, or one or more of a text format and an image format.
Furthermore, since a gesture feature extraction model extracts features based on body structure and joint-point positions, using a single feature extraction model for animals of different categories yields gesture features of poor final accuracy. Therefore, before feature extraction is performed on each target image based on the feature extraction model, the feature extraction model corresponding to the target animal must be determined.
By identifying the animal in the plurality of target images, a plurality of gesture features of the target animal are obtained; an action set of the target animal is determined based on the gesture features; and the behavior of the target animal is determined jointly from the action set and the target animal's audio information, so that more accurate animal behavior is determined. After the animal behavior is determined, explanation information is acquired based on it, which in turn helps the user understand the target animal's behavior.
The animal identification method further includes step SA1 (not shown in the figure) and step SA2 (not shown in the figure), both performed before feature extraction is carried out on each target image based on the feature extraction model, wherein:
and step SA1, carrying out target recognition based on the target image, and determining the category of the target animal.
Specifically, the category of the target animal can be obtained by a trained category identification model, which may be, for example, any one of a Faster R-CNN model, an SSD model, and a YOLO model.
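As a hedged sketch of such a category identification model, the snippet below runs a pretrained Faster R-CNN from torchvision, one of the model families named above; the COCO weights and the 0.5 confidence threshold are assumptions of this sketch.

```python
# Category/target recognition with a pretrained detector (illustrative only).
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms.functional import to_tensor

model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

def detect_animals(image) -> list:
    """Return (label_id, score, box) triples above a confidence threshold."""
    with torch.no_grad():
        out = model([to_tensor(image)])[0]
    return [
        (int(label), float(score), box.tolist())
        for label, score, box in zip(out["labels"], out["scores"], out["boxes"])
        if score > 0.5
    ]
```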
Further, the category of the target animal may be input by the user; alternatively, an association between each region and an animal category may be preset, the user's current location information obtained through a positioning device or other means, the region where the user is located determined from that location information, and the category of the target animal then determined based on that region.
And step SA2, determining a feature extraction model corresponding to the target animal based on the category.
Specifically, a gesture feature extraction model is trained in advance for each category of animal; after the category of the target animal in the target image is determined, the feature extraction model is selected based on that category. Extracting the target animal's gesture features with the feature extraction model corresponding to its category improves accuracy.
Further, feature extraction is performed on each target image based on the feature extraction model to obtain a plurality of gesture features of the target animal, including steps S021-S026 (not shown in the figure), wherein:
And step S021, carrying out target identification on each target image, and determining the position of each target animal in each target image.
Specifically, each target image may contain only one target animal or a plurality of target animals. Target recognition is performed on each target image to determine the position of each target animal within it. A two-dimensional coordinate system is established with the lower-left corner of each target image as the origin to describe the position of any pixel in the image; the outline of each target animal in the image is first identified, and the position of any target animal can then be represented by the coordinates of at least 5 pixel points on its outline.
Step S022, determining, based on the positions of the target animals in each target image, whether there are target animals in contact with each other in each image.
Further, whether any two target animals in the same target image are in contact can be determined by comparing the at least 5 coordinate points characterizing the position of each target animal. That is, the 5 pixel points representing a target animal's position are connected end to end to obtain a closed figure; the figures corresponding to any two target animals are compared, and if the two figures share any coincident point, the target animals corresponding to those figures are in contact with each other.
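A minimal sketch of this contact test, assuming the shapely library for the polygon intersection; the five contour points per animal follow the text above.

```python
# Contact test via polygon intersection (shapely is an assumed dependency).
from shapely.geometry import Polygon

def animals_in_contact(points_a: list, points_b: list) -> bool:
    """points_a / points_b: at least 5 (x, y) contour points per animal.

    Connecting each point sequence end to end yields a closed figure; any
    shared point (a non-empty intersection) means the animals are in contact.
    """
    poly_a = Polygon(points_a)  # the point sequence is closed automatically
    poly_b = Polygon(points_b)
    return poly_a.intersects(poly_b)
```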
Step S023, determining any target image in which target animals are in contact with each other as a marked image, and any target image in which no target animals are in contact as a conventional image.
Specifically, if it is determined that target animals in contact with each other exist in any target image, a preset label is associated with that image, making it a marked image; a target image in which no target animals are in contact is recorded as a conventional image.
The behavior of solitary animals and social animals may differ; moreover, even for animals of the same category, the meaning characterized by a behavior may differ between a single target animal performing action A alone and multiple animals performing the same action A with one another. When only one target animal exists in the target image, only that animal's gesture features are extracted; but if there are target animals in contact with each other, the gesture the two target animals form together must be determined. For example, determining the gesture of only one target animal cannot establish that two target animals are interacting with each other; in such a case, the gesture the two animals form in common must be determined.
In step S024, if the number of marked images is greater than or equal to the preset number, image segmentation is performed on each marked image to obtain the sub-images corresponding to each marked image, where each sub-image includes either all mutually contacting target animals or one target animal.
Specifically, the preset number is determined based on the total number of acquired target images; that is, it is a preset percentage of that total. For example, with a preset percentage of 40% and 20 acquired target images, the preset number is 8. If the number of marked images is smaller than the preset number, the mutual-contact behavior has no reference value; when extracting the gestures of the target animals in the marked images, segmentation then places only one target animal in each sub-image, i.e., only the independent gesture features of each target animal are extracted, and the joint gesture features of two mutually contacting animals need not be extracted.
When the number of marked images is greater than or equal to the preset number, the mutual-contact behavior has reference value; each marked image is then segmented so that animals in contact with each other are placed in the same sub-image. That is, among the sub-images obtained by segmenting a marked image, each sub-image includes either all mutually contacting target animals or one target animal.
Step S025, image segmentation is carried out on each conventional image, so as to obtain sub-images corresponding to each conventional image, wherein each sub-image comprises a target animal.
Specifically, when the conventional image is subjected to image segmentation, only one target animal is included in each sub-image.
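The segmentation of marked and conventional images in steps S024 and S025 can be sketched as a single cropping routine; the (x1, y1, x2, y2) box format and NumPy-style image indexing are assumptions of this sketch.

```python
# Crop sub-images: contacting animals share one crop, loners get their own.
def crop_sub_images(image, boxes: list, contact_groups: list) -> list:
    """boxes: one (x1, y1, x2, y2) box per animal; contact_groups: lists of
    indices of animals found to be in mutual contact (see the test above)."""
    grouped = {i for group in contact_groups for i in group}
    sub_images = []
    for group in contact_groups:                  # one crop per contact group
        x1 = min(boxes[i][0] for i in group)
        y1 = min(boxes[i][1] for i in group)
        x2 = max(boxes[i][2] for i in group)
        y2 = max(boxes[i][3] for i in group)
        sub_images.append(image[y1:y2, x1:x2])
    for i, (x1, y1, x2, y2) in enumerate(boxes):  # one crop per solitary animal
        if i not in grouped:
            sub_images.append(image[y1:y2, x1:x2])
    return sub_images
```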
And step S026, carrying out feature extraction on the sub-images corresponding to all the conventional images and the sub-images corresponding to all the marked images based on the feature extraction model to obtain a plurality of gesture features corresponding to each target animal.
Specifically, feature extraction is performed on each sub-image. A sub-image containing only one target animal yields one gesture feature; if a sub-image contains at least two mutually contacting target animals, feature extraction yields the gesture features of each of those animals, and the several gesture features obtained from the same sub-image are directly associated with one another. Finally, the gesture features corresponding to each animal, together with the associations among the gesture features of mutually contacting target animals, are obtained.
Further, determining the action set of the target animal based on all the gesture features may include steps S031-S034 (not shown in the figure), wherein:
Step S031, acquiring a corresponding historical action library based on the category of the target animal, wherein the historical action library comprises a plurality of historical actions of the target animal and the marker gestures corresponding to each historical action, with at least one marker gesture feature among the gesture features of each historical action.
Specifically, corresponding historical action libraries are established in advance for each of a plurality of categories of target animal; each historical action library includes a plurality of historical actions of the corresponding category of target animal and the marker gestures corresponding to each historical action, with at least one marker gesture feature among the gesture features of each historical action. The at least one marker gesture corresponding to each historical action is determined by overlap comparison over all the gestures of several instances of the same action. For example, 10 recorded instances of one action of a category-A animal are obtained, each instance comprising a plurality of gestures; the 10 groups of gestures are compared for overlap, and the gestures present in all 10 groups are determined to be the marker gestures corresponding to that action for category-A animals.
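The overlap comparison in this example reduces to a set intersection across the recorded instances; the sketch below assumes the gesture descriptors are hashable.

```python
# Marker gestures = gestures present in every recorded instance of an action.
def marker_gestures(instances: list) -> set:
    """instances: one collection of gesture descriptors per recorded instance
    of the same action (e.g. the 10 groups in the example above)."""
    marks = set(instances[0])
    for gestures in instances[1:]:
        marks &= set(gestures)   # keep only gestures present in every group
    return marks
```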
Step S032, determining, from the plurality of gesture features currently acquired for the target animal, the gesture features that overlap with the marker gesture features in the historical action library, as target gesture features;
Step S033, determining the historical actions corresponding to the target gesture features as target actions.
Specifically, the present application is illustrated with only one target animal in all the target images. After the plurality of gesture features corresponding to the target animal are obtained, it is difficult to determine which of the adjacent gesture features form one group, i.e. one action; performing many random combinations would not only make the computation complex but also yield low accuracy.
In the embodiment of the application, all the obtained gesture features of the target animal are compared with the marker gestures of each action in the historical action library corresponding to the target animal; that is, if the marker gestures corresponding to some action appear among the currently acquired gesture features of the target animal, it is determined that the target animal performed that action in the target images.
Step S034, determining an action set based on each target action.
Specifically, after the target actions of the target animal are determined, all the target actions together are taken as one set, namely the action set of the target animal.
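A sketch of steps S032-S034 under one reading of the matching rule, namely that a historical action is recognized when all of its marker gestures appear among the currently acquired gestures; matching on any single marker gesture is an equally plausible reading of the text above.

```python
# Match current gestures against a historical action library (one reading).
def determine_action_set(current_gestures: set, history: dict) -> set:
    """history: historical action name -> set of its marker gesture descriptors."""
    return {
        action
        for action, marks in history.items()
        if marks and marks <= current_gestures  # every marker gesture observed
    }
```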
Further, determining animal behavior based on the audio information and the action set may include steps S051-S055 (not shown in the figure), wherein:
step S051, screening the actions in the action set, deleting the preset normal actions, and obtaining an effective action set;
step S052, determining the first behavior of the target animal based on the effective action set.
Specifically, each animal has corresponding habitual actions and meaningless actions; these invalid actions are determined in advance for each category of animal. After the action set of the target animal is determined, the set is screened and the invalid actions in it (the preset normal actions) are deleted, yielding the effective action set. The first behavior of the animal is determined based on each action in the effective action set; the first behavior describes only behavior characterized by the animal's actions.
Step S053, determining a second behavior of the target animal based on the audio information.
Specifically, a second behavior of the target animal is determined from the audio information corresponding to the target animal, the second behavior describing only the behavior characterized by the audio of the animal.
And step S054, scoring the first behavior and the second behavior respectively based on the current moment and a preset scoring model to obtain a first score corresponding to the first behavior and a second score corresponding to the second behavior, wherein the scoring model is trained on the regularity of each behavior of the target animal, the frequency of each behavior, and the historical occurrence times of each behavior.
Specifically, a scoring model corresponding to each class of animal is preset, wherein the scoring model is used for evaluating the scores of the first behavior and the second behavior of the animal, and the scores represent the confidence degrees of the corresponding behaviors. And scoring the first behavior and the second behavior respectively to obtain a first score corresponding to the first behavior and a second score corresponding to the second behavior.
Further, one scoring model may be shared across the animal categories in use, or a corresponding scoring model may be set individually for each category of animal. The embodiment of the application takes the latter as an example. The correspondence between each behavior of the corresponding category of animal and a timeline over a historical test period is obtained, yielding the occurrence regularity and frequency of each behavior of that animal; from this, a statistical curve of each behavior over one day (for example, a fitted normal distribution) is determined, and the scoring model is constructed from these per-behavior daily distribution curves. The scoring model then evaluates the probability that the first behavior and the second behavior of the target animal occur at the current moment, i.e., the scores corresponding to the first behavior and the second behavior respectively.
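A minimal sketch of this scoring idea, assuming each behavior's daily occurrence times follow a normal distribution as described above; scipy supplies the density, and the frequency weighting is an assumption of this sketch.

```python
# Time-of-day behavior scoring under a per-behavior normal distribution.
from scipy.stats import norm

def behavior_score(current_hour: float, mean_hour: float,
                   std_hours: float, daily_frequency: float) -> float:
    """The behavior's time-of-day density at the current moment, weighted
    by how often the animal shows the behavior per day (both from history)."""
    return daily_frequency * norm.pdf(current_hour, loc=mean_hour, scale=std_hours)

# e.g. a behavior centered at 09:00 with a 1.5-hour spread, seen 3 times a day:
score = behavior_score(current_hour=9.5, mean_hour=9.0,
                       std_hours=1.5, daily_frequency=3.0)
```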
Step S055, determining animal behavior from the first behavior and the second behavior based on the first score and the second score.
In fact, an animal's actions and the behavior characterized by its audio may not coincide; that is, the first behavior and the second behavior may not match. For example, an animal may make the same cry when excited and when angry while its specific actions differ; conversely, the actions may be the same while the sounds differ across behaviors.
Further, when determining the animal behavior from the first behavior and the second behavior based on the first score and the second score: if the first score and the second score are both greater than the preset threshold, the animal behavior is determined to include both the first behavior and the second behavior; if neither the first score nor the second score is greater than the preset threshold, the second behavior is determined to be the animal behavior.
Further, if one of the first score and the second score is greater than the preset threshold and the other is less than or equal to it, the behavior with the higher score is determined to be the animal behavior.
In the embodiment of the present application, the preset threshold is not specifically limited, as long as it allows the animal behavior to be determined accurately.
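The three decision rules above transcribe directly into code; the threshold is left as a parameter, since the text deliberately does not fix its value.

```python
# Combine the action-based and audio-based behaviors by their scores.
def determine_animal_behavior(first_behavior, second_behavior,
                              first_score: float, second_score: float,
                              threshold: float) -> list:
    if first_score > threshold and second_score > threshold:
        return [first_behavior, second_behavior]  # both behaviors are kept
    if first_score <= threshold and second_score <= threshold:
        return [second_behavior]                  # fall back to the audio cue
    # exactly one score clears the threshold: keep the higher-scoring behavior
    return [first_behavior] if first_score > second_score else [second_behavior]
```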
Further, when the explanation information corresponding to the animal behaviors is acquired, the information format is determined based on the animal behaviors and the categories, and the explanation information corresponding to the animal behaviors is acquired based on the information format.
The format of the explanation information is determined based on the current animal behavior so as to reduce the influence on the animal and give the user a better experience. For example, if the animal is in heat, its emotions may be volatile; acquiring and playing explanation information in audio format might then agitate the animal, so image and text information may be acquired instead.
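A format-selection step like the one just described could be as simple as a lookup table keyed by behavior and category; the table contents below are illustrative assumptions echoing the in-heat example.

```python
# Illustrative (behavior, category) -> explanation-format lookup.
EXPLANATION_FORMATS = {
    # avoid playing audio near an easily agitated animal
    ("in_heat", "gorilla"): ("text", "image"),
}
DEFAULT_FORMATS = ("audio", "text", "image")

def explanation_format(behavior: str, category: str) -> tuple:
    return EXPLANATION_FORMATS.get((behavior, category), DEFAULT_FORMATS)
```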
The above embodiments describe the animal identification method from the perspective of the method flow; the following embodiments describe an animal identification device from the perspective of virtual modules or virtual units, as detailed below.
The embodiment of the application provides an animal identification device. As shown in fig. 2, the device 200 may specifically include a target image acquisition module 21, a feature extraction module 22, an action set determination module 23, an audio information acquisition module 24, an animal behavior determination module 25, and an explanation information acquisition module 26, where:
a target image acquisition module 21 for acquiring a plurality of target images including a target animal;
The feature extraction module 22 is configured to perform feature extraction on each target image based on the feature extraction model, so as to obtain a plurality of gesture features of the target animal;
an action set determining module 23 for determining an action set of the target animal based on the plurality of gesture features, the action set including at least one action of the target animal, each action being composed of at least one gesture feature;
an audio information acquisition module 24 for acquiring audio information of the target animal;
an animal behavior determination module 25 for determining animal behaviors based on the audio information and the action set;
the explanation information obtaining module 26 is configured to obtain explanation information corresponding to animal behaviors.
In one possible implementation, when the feature extraction module 22 performs feature extraction on each target image based on the feature extraction model to obtain a plurality of gesture features of the target animal, it is specifically configured to:
performing target recognition on each target image, and determining the position of each target animal in each target image;
determining, based on the positions of the target animals in each target image, whether there are target animals in contact with each other in each image;
determining any target image in which target animals are in contact with each other as a marked image, and any target image in which no target animals are in contact as a conventional image;
if the number of marked images is greater than or equal to the preset number, performing image segmentation on each marked image to obtain sub-images corresponding to each marked image, wherein each sub-image comprises either all mutually contacting target animals or one target animal;
image segmentation is carried out on each conventional image to obtain sub-images corresponding to each conventional image, and each sub-image comprises a target animal;
and carrying out feature extraction on the sub-images corresponding to all the conventional images and the sub-images corresponding to all the marked images based on the feature extraction model to obtain a plurality of gesture features corresponding to each target animal.
In one possible implementation, the apparatus 200 further includes:
the category determining module is used for carrying out target recognition based on the target image and determining the category of the target animal;
and the extraction model determining module is used for determining a characteristic extraction model corresponding to the target animal based on the category.
In one possible implementation, when the action set determination module 23 is determining the action set of the target animal based on the plurality of gesture features, it is specifically configured to:
acquiring a corresponding historical action library based on the category of the target animal, wherein the historical action library comprises a plurality of historical actions of the target animal and marker gestures corresponding to each historical action, with at least one marker gesture feature among the gesture features of each historical action;
determining, from the plurality of gesture features currently acquired for the target animal, the gesture features that overlap with the marker gesture features in the historical action library, as target gesture features;
determining the historical actions corresponding to the target gesture features as target actions;
and determining an action set based on each target action.
In one possible implementation, when the animal behavior determination module 25 is determining animal behavior based on the audio information and the set of actions, it is specifically configured to:
screening the actions in the action set, deleting preset normal actions, and obtaining an effective action set;
determining a first behavior of the target animal based on the set of valid actions;
determining a second behavior of the target animal based on the audio information;
scoring the first behavior and the second behavior respectively based on the current moment and a preset scoring model to obtain a first score corresponding to the first behavior and a second score corresponding to the second behavior, wherein the scoring model is trained on the regularity of each behavior of the target animal, the frequency of each behavior, and the historical occurrence times of each behavior;
and determining the animal behavior from the first behavior and the second behavior based on the first score and the second score.
In one possible implementation, when the animal behavior determination module 25 is determining animal behavior from the first behavior and the second behavior based on the first score and the second score, it is specifically configured to:
determining the behavior with the higher of the first score and the second score as the animal behavior;
if the first score and the second score are both greater than the preset threshold, determining that the animal behavior includes both the first behavior and the second behavior;
and if neither the first score nor the second score is greater than the preset threshold, determining the second behavior as the animal behavior.
In one possible implementation, when the explanation information obtaining module 26 obtains the explanation information corresponding to the animal behavior, it is specifically configured to:
determining an information format based on animal behavior and category;
and acquiring explanation information corresponding to the animal behaviors based on the information format.
In an embodiment of the present application, as shown in fig. 3, an electronic device 300 shown in fig. 3 includes: a processor 301 and a memory 303. Wherein the processor 301 is coupled to the memory 303, such as via a bus 302. Optionally, the electronic device 300 may also include a transceiver 304. It should be noted that, in practical applications, the transceiver 304 is not limited to one, and the structure of the electronic device 300 is not limited to the embodiment of the present application.
The processor 301 may be a CPU (central processing unit), a general-purpose processor, a DSP (digital signal processor), an ASIC (application-specific integrated circuit), an FPGA (field-programmable gate array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various exemplary logic blocks, modules, and circuits described in connection with this disclosure. The processor 301 may also be a combination that implements computing functionality, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 302 may include a path to transfer information between the components. Bus 302 may be a PCI (peripheral component interconnect) bus, an EISA (extended industry standard architecture) bus, or the like. Bus 302 may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in fig. 3, but this does not mean there is only one bus or one type of bus.
The memory 303 may be, but is not limited to, a ROM (read-only memory) or other type of static storage device that can store static information and instructions, a RAM (random access memory) or other type of dynamic storage device that can store information and instructions, an EEPROM (electrically erasable programmable read-only memory), a CD-ROM (compact disc read-only memory) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
The memory 303 is used for storing application program codes for executing the present application and is controlled to be executed by the processor 301. The processor 301 is configured to execute the application code stored in the memory 303 to implement what is shown in the foregoing method embodiments.
Electronic devices include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and in-vehicle terminals (e.g., in-vehicle navigation terminals), as well as stationary terminals such as digital TVs and desktop computers. The electronic device may also be a server or the like. The electronic device shown in fig. 3 is merely an example and should not be construed as limiting the functionality and scope of use of the disclosed embodiments.
The present application provides a computer readable storage medium having a computer program stored thereon, which when run on a computer, causes the computer to perform the corresponding method embodiments described above.
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include a plurality of sub-steps or stages that are not necessarily performed at the same time but may be performed at different times; their order of execution is not necessarily sequential, and they may be performed in turn or alternately with other steps, or with at least a portion of the sub-steps or stages of other steps.
The foregoing is only a partial embodiment of the present application, and it should be noted that, for a person skilled in the art, several improvements and modifications can be made without departing from the principle of the present application, and these improvements and modifications should also be considered as the protection scope of the present application.

Claims (10)

1. A method of identifying an animal comprising:
acquiring a plurality of target images, wherein the target images comprise target animals;
performing feature extraction on each target image based on a feature extraction model to obtain a plurality of gesture features of the target animal;
determining a set of actions of the target animal based on the plurality of gesture features, the set of actions including at least one action of the target animal, each action consisting of at least one gesture feature;
acquiring audio information of a target animal;
determining animal behavior based on the audio information and the set of actions;
and acquiring explanation information corresponding to the animal behaviors.
2. The method for identifying an animal according to claim 1, wherein the feature extraction is performed on each of the target images based on a feature extraction model to obtain a plurality of gesture features of the target animal, comprising:
performing target recognition on each target image, and determining the position of each target animal in each target image;
determining, based on the positions of the target animals in each target image, whether there are target animals in contact with each other in each image;
determining any target image in which target animals are in contact with each other as a marked image, and any target image in which no target animals are in contact as a conventional image;
if the number of marked images is greater than or equal to a preset number, performing image segmentation on each marked image to obtain sub-images corresponding to each marked image, wherein each sub-image comprises either all mutually contacting target animals or one target animal;
image segmentation is carried out on each conventional image to obtain sub-images corresponding to each conventional image, and each sub-image comprises a target animal;
and carrying out feature extraction on all the sub-images corresponding to the conventional images and all the sub-images corresponding to the marked images based on a feature extraction model to obtain a plurality of gesture features corresponding to each target animal.
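The claim leaves open how "contact" between target animals is detected; one plausible reading, assumed in the sketch below, is that two animals count as in contact when their detected bounding boxes intersect. All names here are illustrative, not from the patent:

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def boxes_touch(a: Box, b: Box) -> bool:
    # Two axis-aligned boxes intersect iff they overlap on both axes.
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def classify_images(detections: List[List[Box]], preset_number: int):
    """Split image indices into 'marked' (some animals in contact) and
    'conventional' (no contact); segmentation of marked images is only
    triggered when enough of them exist, as the claim requires."""
    marked, conventional = [], []
    for idx, boxes in enumerate(detections):
        contact = any(
            boxes_touch(boxes[i], boxes[j])
            for i in range(len(boxes))
            for j in range(i + 1, len(boxes))
        )
        (marked if contact else conventional).append(idx)
    segment_marked = len(marked) >= preset_number
    return marked, conventional, segment_marked
```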
3. The animal identification method according to claim 1, further comprising, before performing feature extraction on each target image based on a feature extraction model:
performing target identification on the target image, and determining the category of the target animal;
and determining a feature extraction model corresponding to the target animal based on the category.
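A minimal sketch of this category-dependent model selection, assuming a simple category-to-model registry (the categories and model names are invented for illustration):

```python
from typing import Dict

# Invented registry: animal category -> feature extraction model name.
FEATURE_MODELS: Dict[str, str] = {
    "dog": "dog_pose_extractor",
    "cat": "cat_pose_extractor",
    "bird": "bird_pose_extractor",
}

def select_feature_model(category: str) -> str:
    # Fall back to a generic extractor for categories not in the registry.
    return FEATURE_MODELS.get(category, "generic_pose_extractor")
```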
4. The animal identification method according to claim 3, wherein said determining an action set of the target animal based on the plurality of gesture features comprises:
acquiring a corresponding historical action library based on the category of the target animal, wherein the historical action library comprises a plurality of historical actions of the target animal and marked gesture features corresponding to each historical action, at least one marked gesture feature existing among the gesture features of each historical action;
determining, from the plurality of gesture features currently acquired for the target animal, each gesture feature that coincides with a marked gesture feature in the historical action library as a target gesture feature;
determining the historical action corresponding to each target gesture feature as a target action;
and determining the action set based on each target action.
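As a hedged sketch of this matching step: if the historical action library is modeled as a mapping from each historical action to its marked gesture features, the action set falls out of a simple overlap test (all feature and action names below are invented):

```python
from typing import Dict, Set

def build_action_set(current_features: Set[str],
                     history: Dict[str, Set[str]]) -> Set[str]:
    # 'history' maps each historical action to its marked gesture
    # features; an action is selected when any marked feature coincides
    # with a currently observed gesture feature.
    return {action for action, marked in history.items()
            if marked & current_features}

# Invented example: "tail_wag" is marked by the "tail_up" feature.
actions = build_action_set(
    {"tail_up", "ears_back"},
    {"tail_wag": {"tail_up"}, "crouch": {"body_low"}},
)
assert actions == {"tail_wag"}
```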
5. The animal identification method according to claim 1 or 2, wherein said determining an animal behavior based on the audio information and the action set comprises:
screening the actions in the action set and deleting preset normal actions to obtain a valid action set;
determining a first behavior of the target animal based on the valid action set;
determining a second behavior of the target animal based on the audio information;
scoring the first behavior and the second behavior respectively based on the current moment and a preset scoring model to obtain a first score corresponding to the first behavior and a second score corresponding to the second behavior, wherein the scoring model is trained on the frequency and the historical occurrence times of each behavior, according to the behavior patterns of the target animal;
and determining the animal behavior from the first behavior and the second behavior based on the first score and the second score.
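The patent trains its scoring model; as an illustration only, the sketch below substitutes a closed-form proxy in which a behavior scores higher when the current hour is near its typical occurrence time and the behavior is frequent. Nothing here should be read as the claimed model:

```python
from typing import Dict, List

def score_behavior(behavior: str, hour: int,
                   typical_hours: Dict[str, List[int]],
                   frequency: Dict[str, float]) -> float:
    # Toy proxy: frequent behaviors observed near their usual time of
    # day score highest; unknown behaviors score 0.
    distance = min(abs(hour - h) for h in typical_hours.get(behavior, [hour]))
    return frequency.get(behavior, 0.0) / (1.0 + distance)

# Example: at 08:00, "feeding" (usually 8h/18h, frequent) outscores
# "night_pacing" (usually 23h, rare).
hours = {"feeding": [8, 18], "night_pacing": [23]}
freq = {"feeding": 0.9, "night_pacing": 0.1}
first_score = score_behavior("feeding", 8, hours, freq)        # 0.9
second_score = score_behavior("night_pacing", 8, hours, freq)  # 0.1 / 16
```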
6. The animal identification method according to claim 5, wherein said determining the animal behavior from the first behavior and the second behavior based on the first score and the second score comprises:
determining the behavior with the higher of the first score and the second score as the animal behavior;
if the first score and the second score are both greater than a preset threshold, determining that the animal behavior comprises the first behavior and the second behavior;
and if neither the first score nor the second score is greater than the preset threshold, determining the second behavior as the animal behavior.
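These three decision rules transcribe directly into code; the sketch below is a literal rendering of the claim (note that when exactly one score exceeds the threshold, that behavior is necessarily also the higher-scoring one, so the rules are mutually consistent):

```python
from typing import List

def pick_behavior(first: str, second: str,
                  first_score: float, second_score: float,
                  threshold: float) -> List[str]:
    """Literal rendering of the three rules in claim 6."""
    if first_score > threshold and second_score > threshold:
        # Both scores exceed the preset threshold: keep both behaviors.
        return [first, second]
    if first_score <= threshold and second_score <= threshold:
        # Neither score exceeds the threshold: fall back to the
        # audio-derived second behavior.
        return [second]
    # Otherwise the behavior with the higher score wins.
    return [first] if first_score > second_score else [second]
```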
7. The animal identification method according to claim 3, wherein said acquiring explanation information corresponding to the animal behavior comprises:
determining an information format based on the animal behavior and the category;
and acquiring the explanation information corresponding to the animal behavior based on the information format.
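A minimal sketch of this format selection, assuming a lookup table keyed by (behavior, category); all entries are invented for illustration:

```python
from typing import Dict, Tuple

# Invented table: (animal behavior, animal category) -> output format.
EXPLANATION_FORMATS: Dict[Tuple[str, str], str] = {
    ("barking", "dog"): "audio+text",
    ("tail_wag", "dog"): "text",
    ("purring", "cat"): "audio",
}

def explanation_format(behavior: str, category: str) -> str:
    # Default to plain text when no specific format is registered.
    return EXPLANATION_FORMATS.get((behavior, category), "text")
```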
8. An animal identification device, comprising:
a target image acquisition module for acquiring a plurality of target images, wherein the target images comprise target animals;
a feature extraction module for performing feature extraction on each target image based on a feature extraction model to obtain a plurality of gesture features of the target animal;
an action set determination module for determining an action set of the target animal based on the plurality of gesture features, the action set comprising at least one action of the target animal, each action consisting of at least one gesture feature;
an audio information acquisition module for acquiring audio information of the target animal;
an animal behavior determination module for determining an animal behavior based on the audio information and the action set;
and an explanation information acquisition module for acquiring explanation information corresponding to the animal behavior.
9. An electronic device, comprising:
at least one processor;
a memory;
at least one application, wherein the at least one application is stored in the memory and configured to be executed by the at least one processor, the at least one application being configured to perform the animal identification method of any one of claims 1-7.
10. A computer-readable storage medium storing a computer program which, when loaded and executed by a processor, performs the animal identification method of any one of claims 1-7.
CN202310037188.6A 2023-01-10 2023-01-10 Animal identification method, device, equipment and storage medium Active CN116259072B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310037188.6A CN116259072B (en) 2023-01-10 2023-01-10 Animal identification method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN116259072A (en) 2023-06-13
CN116259072B CN116259072B (en) 2024-05-10

Family

ID=86681957

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310037188.6A Active CN116259072B (en) 2023-01-10 2023-01-10 Animal identification method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116259072B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107368567A (en) * 2017-07-11 2017-11-21 深圳传音通讯有限公司 Animal language recognition methods and user terminal
CN110826358A (en) * 2018-08-08 2020-02-21 杭州海康威视数字技术股份有限公司 Animal emotion recognition method and device and storage medium
CN111461337A (en) * 2020-03-05 2020-07-28 深圳追一科技有限公司 Data processing method and device, terminal equipment and storage medium
CN111898581A (en) * 2020-08-12 2020-11-06 成都佳华物链云科技有限公司 Animal detection method, device, electronic equipment and readable storage medium
CN111914657A (en) * 2020-07-06 2020-11-10 浙江大华技术股份有限公司 Pet behavior detection method and device, electronic equipment and storage medium
CN113936231A (en) * 2021-10-08 2022-01-14 Oppo广东移动通信有限公司 Target identification method and device and electronic equipment
CN114387622A (en) * 2022-01-13 2022-04-22 深圳市商汤科技有限公司 Animal weight recognition method and device, electronic equipment and storage medium
JP2022101296A (en) * 2020-12-24 2022-07-06 Assest株式会社 Animal intention determination program

Also Published As

Publication number Publication date
CN116259072B (en) 2024-05-10

Similar Documents

Publication Publication Date Title
CN111476284B (en) Image recognition model training and image recognition method and device and electronic equipment
CN109117848B (en) Text line character recognition method, device, medium and electronic equipment
CN110874594B (en) Human body appearance damage detection method and related equipment based on semantic segmentation network
US10936911B2 (en) Logo detection
US11967089B2 (en) Object tracking method, tracking processing method, corresponding apparatus, and electronic device
CN112528831B (en) Multi-target attitude estimation method, multi-target attitude estimation device and terminal equipment
CN109345553B (en) Palm and key point detection method and device thereof, and terminal equipment
CN111191067A (en) Picture book identification method, terminal device and computer readable storage medium
CN108388822A (en) A kind of method and apparatus of detection image in 2 D code
CN112508835B (en) GAN-based contrast agent-free medical image enhancement modeling method
CN111695539A (en) Evaluation method and device for handwritten Chinese characters and electronic equipment
CN114022748B (en) Target identification method, device, equipment and storage medium
CN112995757B (en) Video clipping method and device
CN114168768A (en) Image retrieval method and related equipment
CN116259072B (en) Animal identification method, device, equipment and storage medium
WO2023273227A1 (en) Fingernail recognition method and apparatus, device, and storage medium
CN116310994A (en) Video clip extraction method and device, electronic equipment and medium
US20220122341A1 (en) Target detection method and apparatus, electronic device, and computer storage medium
CN109933679A (en) Object type recognition methods, device and equipment in image
US11599743B2 (en) Method and apparatus for obtaining product training images, and non-transitory computer-readable storage medium
CN116503918A (en) Palm vein image classification method, device, equipment and medium based on ViT network
CN110059180B (en) Article author identity recognition and evaluation model training method and device and storage medium
CN113297985A (en) Blackboard writing behavior detection method and device, medium and electronic equipment
CN111310702B (en) Video offset analysis method and device and electronic equipment
US20230073940A1 (en) Body Pose Tracking of Players from Sports Broadcast Video Feed

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant