CN117475480A - Multi-pet feeding method and device based on image recognition - Google Patents
- Publication number
- CN117475480A (application number CN202311666354.5A)
- Authority
- CN
- China
- Prior art keywords
- pet
- feeding
- loss
- identity
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K5/00—Feeding devices for stock or game ; Feeding wagons; Feeding stacks
- A01K5/02—Automatic devices
- A01K5/0275—Automatic devices with mechanisms for delivery of measured doses
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
Abstract
The application provides a multi-pet feeding method and device based on image recognition, belonging to the technical field of data processing. Step S1, continuously acquiring images of a pet feeding area; step S2, detecting a plurality of continuously input images based on a pre-trained pet detection model to obtain pet image areas containing pets; step S3, performing pet identity recognition on the pet image areas based on a pre-trained pet identity recognition model to acquire the individual identity of each pet; step S4, determining feeding parameters matched with the individual identity of the pet, and delivering food according to the feeding parameters. Timed and quantitative feeding of multiple pets in a household is thereby realized, reducing the feeding burden on pet owners.
Description
Technical Field
The application belongs to the technical field of data processing, and particularly relates to a method and a device for feeding multiple pets based on image recognition.
Background
To meet the needs of pet owners and offer greater convenience, pet feeders have emerged as an innovative device. They are intended to provide the owner with an automatic feeding solution and help ease the workload of daily feeding. However, conventional pet feeders often lack timed and quantitative feeding functions. On the one hand, the work and lifestyle of pet owners may lead to irregular feeding schedules and food amounts, which are detrimental to the health of the pet; on the other hand, when a home keeps several pets, it is difficult to feed different pets quantitatively, which may cause problems such as unbalanced feeding, overfeeding, or pets competing for food. These drawbacks limit the popularity and use of pet feeders in the home.
Disclosure of Invention
In order to solve the technical problems, the application provides a multi-pet feeding method and device based on image recognition, which solve the feeding problem of multi-pet families.
In a first aspect of the present application, a multi-pet feeding method based on image recognition mainly includes:
step S1, continuously acquiring images of a pet feeding area;
step S2, detecting a plurality of continuously input images based on a pre-trained pet detection model to obtain a pet image area containing pets;
step S3, carrying out pet identity recognition on the pet image area based on a pre-trained pet identity recognition model to acquire the individual identity of the pet;
and step S4, determining feeding parameters matched with the individual identity of the pet, and delivering food according to the feeding parameters.
Preferably, step S1 further comprises:
step S11, determining the next feeding time of each pet to be fed;
and step S12, a set time before the next feeding time, starting an image acquisition module and continuously acquiring images of the pet feeding area.
Preferably, step S2 further comprises:
step S21, outputting, based on the pet detection model, detection frame coordinates, confidence and predicted pet categories for the pets contained in the images;
step S22, filtering out detection frames whose confidence is below a preset value, and retaining the pet image areas containing pets.
Preferably, before step S21, further comprising training the pet detection model by:
step S211, images with different pet types, different backgrounds, different postures and different illumination conditions are obtained to serve as training data, and labeling of pet target frames and categories is carried out on the training data;
step S212, processing the training data based on a pet detection model to be trained to obtain output containing the detection frame coordinates, confidence level and pet prediction type of the pet;
step S213, determining the total loss L_det of the pet detection model based on the following formula:
L_det = λ_coord * L_coord + λ_conf * L_conf + λ_cls * L_cls;
wherein L_coord, L_conf and L_cls are the coordinate regression loss, the confidence loss and the classification loss respectively, and λ_coord, λ_conf and λ_cls are the weights of the coordinate regression loss, the confidence loss and the classification loss respectively;
step S214, updating the weight parameters to be optimized of the pet detection model based on the total loss L_det until the weight parameters converge, to obtain the final pet detection model.
Preferably, step S3 further includes:
step S31, obtaining a feature vector of a pet based on a pet identity recognition model;
and S32, calculating the distance between the feature vector and each sample feature vector in the database, and determining the sample feature vector with the minimum distance, thereby determining the individual identity of the pet.
Preferably, in step S31, model training is performed using a pet image dataset labeled with individual identity IDs, and the pet identity recognition model is jointly optimized using cross entropy loss and triplet loss, wherein the total loss L_id of the pet identity recognition model is:
L_id = λ_CE * L_CE + λ_Triplet * L_Triplet;
wherein L_CE and L_Triplet are the cross entropy loss and the triplet loss respectively, and λ_CE and λ_Triplet are the weight coefficients of the cross entropy loss and the triplet loss respectively.
Preferably, step S4 further comprises:
delivering the food matched with the individual identity of the pet at a set rate, and stopping delivering the food when the pet with that individual identity is no longer detected.
In a second aspect of the present application, a multi-pet feeding device based on image recognition mainly includes:
the pet feeding area image acquisition module is used for continuously acquiring images of the pet feeding area;
the pet image region acquisition module is used for detecting a plurality of continuously input images based on a pre-trained pet detection model to obtain a pet image region containing a pet;
the pet identification module is used for identifying the identity of the pet in the pet image area based on a pre-trained pet identity identification model, and obtaining the individual identity of the pet;
and the feeding module is used for determining feeding parameters matched with the individual identities of the pets and delivering foods according to the feeding parameters.
Preferably, the pet feeding region image acquisition module comprises:
the feeding time timing unit is used for determining the next feeding time of each pet to be fed;
the image acquisition starting unit is used for starting the image acquisition module a set time before the next feeding time and continuously acquiring images of the pet feeding area.
Preferably, the pet image region acquisition module further includes:
the model output unit is used for outputting detection frame coordinates, confidence and pet prediction categories containing pets based on the pet detection model;
the filtering unit is used for filtering the detection frame with the confidence coefficient lower than a preset value and reserving the pet image area containing the pet.
Preferably, further comprising training the pet detection model by:
the training data labeling unit is used for acquiring images with different pet types, different backgrounds, different postures and different illumination conditions as training data, and labeling target frames and categories of the pets for the training data;
the model actual output unit is used for processing the training data based on a pet detection model to be trained to obtain output containing the detection frame coordinates, the confidence coefficient and the pet prediction category of the pet;
a loss determination unit for determining the total loss L_det of the pet detection model based on the following formula:
L_det = λ_coord * L_coord + λ_conf * L_conf + λ_cls * L_cls;
wherein L_coord, L_conf and L_cls are the coordinate regression loss, the confidence loss and the classification loss respectively, and λ_coord, λ_conf and λ_cls are the weights of the coordinate regression loss, the confidence loss and the classification loss respectively;
a loop optimization unit for updating the weight parameters to be optimized of the pet detection model based on the total loss L_det until the weight parameters converge, to obtain the final pet detection model.
Preferably, the pet identification module includes:
the feature vector output unit is used for acquiring the feature vector of the pet based on the pet identity recognition model;
and the distance calculation unit is used for calculating the distance between the feature vector and each sample feature vector in the database, and determining the sample feature vector with the minimum distance, thereby determining the individual identity of the pet.
Preferably, model training is performed using a pet image dataset labeled with individual identity IDs, and the pet identity recognition model is jointly optimized using cross entropy loss and triplet loss, wherein the total loss L_id of the pet identity recognition model is:
L_id = λ_CE * L_CE + λ_Triplet * L_Triplet;
wherein L_CE and L_Triplet are the cross entropy loss and the triplet loss respectively, and λ_CE and λ_Triplet are the weight coefficients of the cross entropy loss and the triplet loss respectively.
Preferably, the feeding module comprises:
and the feeding control unit is used for delivering food matched with the individual identity of the pet at a set rate, and stopping delivering the food when the pet with that individual identity is no longer detected.
In a third aspect of the present application, a computer device comprises a processor, a memory, and a computer program stored on the memory and executable on the processor, the processor executing the computer program for implementing the image recognition-based multi-pet feeding method as set forth in any one of the above.
In a fourth aspect of the present application, a readable storage medium stores a computer program for implementing the image recognition-based multi-pet feeding method as described above when executed by a processor.
Timed and quantitative feeding of multiple pets in a household is thereby realized, reducing the feeding burden on pet owners.
Drawings
Fig. 1 is a flow chart of a preferred embodiment of the image recognition-based multi-pet feeding method of the present application.
Fig. 2 is a schematic structural diagram of a computer device suitable for use in implementing the terminal or server of the embodiments of the present application.
Detailed Description
To make the purposes, technical solutions and advantages of the present application clearer, the technical solutions in the embodiments of the present application are described in more detail below with reference to the drawings. In the drawings, the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The described embodiments are some, but not all, of the embodiments of the present application; they are exemplary, intended to explain the present application, and are not to be construed as limiting it. All other embodiments obtained by one of ordinary skill in the art based on the embodiments herein without inventive effort fall within the scope of the present application. Embodiments of the present application are described in detail below with reference to the accompanying drawings.
According to a first aspect of the present application, as shown in fig. 1, a multi-pet feeding method based on image recognition mainly includes:
step S1, continuously acquiring images of the feeding area of the pet.
In this step, images of the pet feeding area are continuously acquired by an image acquisition device such as a camera, which captures images or video of the pet feeding area for subsequent analysis. The pet feeding area may be covered by images acquired from one or more cameras.
In some alternative embodiments, step S1 further comprises:
step S11, determining the next feeding time of each pet to be fed;
and step S12, a set time before the next feeding time, starting an image acquisition module and continuously acquiring images of the pet feeding area.
It can be appreciated that the main purpose of this embodiment is to save resources. Continuously collecting images of the pet feeding area with the camera and processing all of the collected data in the subsequent steps would waste considerable computing resources. This embodiment therefore performs image acquisition only when needed: when a pet approaches its preset feeding time, the camera is controlled to start acquiring images. The preset feeding time can be scheduled for each pet, or the next feeding time can be determined from each pet's last feeding time and feeding interval. Because the present application involves multiple pets, step S11 obtains the last feeding time of every pet; superimposing each pet's feeding interval then yields its next feeding time (a future time point). In step S12, the image acquisition device is started a set time before that next feeding time to continuously acquire images.
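The scheduling logic above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function names, dictionary layout and the five-minute lead time are all assumptions — the patent only specifies adding each pet's feeding interval to its last feeding time and starting acquisition a set time beforehand.

```python
from datetime import datetime, timedelta

def next_feeding_times(last_fed, intervals):
    """Next feeding time per pet = last feeding time + that pet's interval."""
    return {pet: last_fed[pet] + intervals[pet] for pet in last_fed}

def acquisition_start(feeding_times, lead=timedelta(minutes=5)):
    """Start the camera a set lead time before the earliest upcoming feeding."""
    return min(feeding_times.values()) - lead
```

For example, a pet last fed at 08:00 with a six-hour interval is next fed at 14:00, and acquisition would start at 13:55 under the assumed lead time.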
And S2, detecting a plurality of continuously input images based on a pre-trained pet detection model to obtain a pet image area containing the pet.
In step S2, the main functions of the pet detection model are: and extracting the pet image from the continuously obtained image, and after the pet image is extracted, carrying out subsequent individual pet identification, so that food can be put in according to the need.
In some alternative embodiments, step S2 further comprises:
step S21, outputting, based on the pet detection model, detection frame coordinates, confidence and predicted pet categories for the pets contained in the images;
step S22, filtering out detection frames whose confidence is below a preset value, and retaining the pet image areas containing pets.
In this embodiment, the confidence is used to ensure that only image areas actually containing a pet are passed on, so that the subsequent steps can identify the pets in them. Since step S2 processes a continuous stream of images, a processing interval may further be set, i.e., the images of the pet feeding area acquired in step S1 are processed once every set time period.
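The confidence filtering of step S22 amounts to a simple threshold over the model outputs. The sketch below assumes a (box, confidence, category) tuple layout and a 0.5 threshold; the patent only requires comparison against a preset value.

```python
def filter_detections(detections, threshold=0.5):
    """Keep only detection frames whose confidence meets the preset value.

    Each detection is (box_xyxy, confidence, predicted_category).
    """
    return [det for det in detections if det[1] >= threshold]
```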
In some alternative embodiments, prior to step S21, further comprising training the pet detection model by:
step S211, images with different pet types, different backgrounds, different postures and different illumination conditions are obtained to serve as training data, and labeling of pet target frames and categories is carried out on the training data;
step S212, processing the training data based on a pet detection model to be trained to obtain output containing the detection frame coordinates, confidence level and pet prediction type of the pet;
step S213, determining the total loss L_det of the pet detection model based on the following formula:
L_det = λ_coord * L_coord + λ_conf * L_conf + λ_cls * L_cls;
wherein L_coord, L_conf and L_cls are the coordinate regression loss, the confidence loss and the classification loss respectively, and λ_coord, λ_conf and λ_cls are the weights of the coordinate regression loss, the confidence loss and the classification loss respectively;
step S214, updating the weight parameters to be optimized of the pet detection model based on the total loss L_det until the weight parameters converge, to obtain the final pet detection model.
The pet detection model may adopt the YOLOv5 object detection algorithm, or another detection model such as Faster R-CNN, YOLOv8 or DETR. Unlike existing models, in step S213 the present application introduces three loss terms to calculate the total loss: a coordinate regression loss between the pet target frame labeled in the training data and the detection frame output by the model; a classification loss between the pet category labeled in the training data and the pet category output by the model; and a confidence loss indicating the presence or absence of a detection target and the accuracy of the detection frame. Training iterates multiple times and updates the model weights to reduce the loss function and ensure convergence of the model.
And S3, carrying out pet identity recognition on the pet image area based on a pre-trained pet identity recognition model, and obtaining the individual identity of the pet.
After the pet image area is obtained, this step performs pet identity recognition on the target area to distinguish different pets. Model training is performed using a pet image dataset labeled with individual identity IDs; after training, the individual identity that best matches the pet is determined from the output feature vector.
It should be noted that the pet classification introduced in step S2, such as pet dogs, pet cats and pet birds, is aimed at improving the accuracy with which the pet detection model segments pet image areas, whereas the pet identity recognition introduced in step S3 targets specific individuals: for two pet cats, for example, step S3 is used to further distinguish between them.
In some alternative embodiments, step S3 further comprises:
step S31, obtaining a feature vector of a pet based on a pet identity recognition model;
and S32, calculating the distance between the feature vector and each sample feature vector in the database, and determining the sample feature vector with the minimum distance, thereby determining the individual identity of the pet.
In this embodiment, the distance is, for example, a cosine distance.
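The nearest-neighbour lookup of step S32 can be sketched as below, using cosine distance as the example the text gives. The gallery layout and pet names are illustrative assumptions.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def identify(feature, gallery):
    """Return the identity whose stored sample feature vector has the
    minimum distance to the query feature vector."""
    return min(gallery, key=lambda pid: cosine_distance(feature, gallery[pid]))
```

A query feature close in direction to a stored sample resolves to that sample's identity, which is how the individual identity of the pet is determined.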
In some alternative embodiments, in step S31, model training is performed using the pet image dataset labeled with individual identity IDs, and the pet identity recognition model is jointly optimized using cross entropy loss and triplet loss, wherein the total loss L_id of the pet identity recognition model is:
L_id = λ_CE * L_CE + λ_Triplet * L_Triplet;
wherein L_CE and L_Triplet are the cross entropy loss and the triplet loss respectively, and λ_CE and λ_Triplet are the weight coefficients of the cross entropy loss and the triplet loss respectively.
In this embodiment, the pet identity recognition model may be a ResNet50 model, or a model with another backbone network such as VGG, Inception, EfficientNet, MobileNet or ShuffleNet. Cross-entropy loss and triplet loss are used to jointly optimize the model parameters so that feature vectors of the same identity are as close as possible and feature vectors of different identities are as dispersed as possible. Training iterates multiple times and updates the model weights to reduce the loss function and ensure convergence of the model.
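A minimal numeric sketch of this joint objective, on single samples with plain lists rather than a deep-learning framework; the 0.3 triplet margin and the unit weights are assumptions not fixed by the patent.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for a single sample (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]

def triplet_loss(d_ap, d_an, margin=0.3):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0):
    pulls same-identity features together, pushes different ones apart."""
    return max(d_ap - d_an + margin, 0.0)

def identity_total_loss(l_ce, l_triplet, lam_ce=1.0, lam_triplet=1.0):
    """L_id = λ_CE * L_CE + λ_Triplet * L_Triplet."""
    return lam_ce * l_ce + lam_triplet * l_triplet
```

Note that the triplet term vanishes once a negative sample is already farther away than the positive by more than the margin, which is exactly the dispersion behaviour described above.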
And S4, determining feeding parameters matched with the individual identities of the pets, and delivering food according to the feeding parameters.
In this step, the feeding parameters include the food type, portion size and the like. According to the recognized pet identity ID, the system queries a database to obtain the quantitative feeding parameters of the pet, and food is dispensed quantitatively according to these parameters.
In some alternative embodiments, step S4 further comprises:
delivering the food matched with the individual identity of the pet at a set rate, and stopping delivering the food when the pet with that individual identity is no longer detected.
It will be appreciated that, since there are many pet species and a home may even keep multiple pets of the same species, the individual pets in the pet feeding area can be continuously detected based on steps S1 to S3 in order to prevent feeding crosstalk between pets: while an individual pet is being fed, the continued detection allows food delivery to stop in time so that other pets cannot steal the food.
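The feeding control just described can be sketched as a dispense loop that stops as soon as the identified pet is no longer detected. The callback interface and per-tick dispensing model are assumptions made for illustration.

```python
def dispense(pet_id, portion_g, rate_g_per_tick, still_present):
    """Deliver food at a set rate, stopping early when the detection
    pipeline (steps S1-S3) no longer reports the pet with this identity.

    still_present is a callback returning True while the pet is detected.
    """
    delivered = 0.0
    while delivered < portion_g and still_present(pet_id):
        delivered += rate_g_per_tick  # one dispensing tick
    return min(delivered, portion_g)
```

If the pet stays present the full quantitative portion is delivered; if it leaves (or another pet displaces it), delivery halts at the amount dispensed so far.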
The multi-pet feeding system of the present application realizes timed and quantitative feeding of multiple pets in a household and reduces the feeding burden on pet owners. This improvement helps promote the popularity and use of pet feeders in the home. The method can also be applied to livestock breeding, such as timed, targeted feeding of flocks and herds, improving feeding efficiency.
In a second aspect of the present application, there is provided an image recognition-based multi-pet feeding device corresponding to the above method, mainly comprising:
the pet feeding area image acquisition module is used for continuously acquiring images of the pet feeding area;
the pet image region acquisition module is used for detecting a plurality of continuously input images based on a pre-trained pet detection model to obtain a pet image region containing a pet;
the pet identification module is used for identifying the identity of the pet in the pet image area based on a pre-trained pet identity identification model, and obtaining the individual identity of the pet;
and the feeding module is used for determining feeding parameters matched with the individual identities of the pets and delivering foods according to the feeding parameters.
In some alternative embodiments, the pet feeding region image acquisition module comprises:
the feeding time timing unit is used for determining the next feeding time of each pet to be fed;
the image acquisition starting unit is used for starting the image acquisition module a set time before the next feeding time and continuously acquiring images of the pet feeding area.
In some alternative embodiments, the pet image region acquisition module further comprises:
the model output unit is used for outputting detection frame coordinates, confidence and pet prediction categories containing pets based on the pet detection model;
the filtering unit is used for filtering the detection frame with the confidence coefficient lower than a preset value and reserving the pet image area containing the pet.
In some alternative embodiments, further comprising training the pet detection model by:
the training data labeling unit is used for acquiring images with different pet types, different backgrounds, different postures and different illumination conditions as training data, and labeling target frames and categories of the pets for the training data;
the model actual output unit is used for processing the training data based on a pet detection model to be trained to obtain output containing the detection frame coordinates, the confidence coefficient and the pet prediction category of the pet;
a loss determination unit for determining the total loss L_det of the pet detection model based on the following formula:
L_det = λ_coord * L_coord + λ_conf * L_conf + λ_cls * L_cls;
wherein L_coord, L_conf and L_cls are the coordinate regression loss, the confidence loss and the classification loss respectively, and λ_coord, λ_conf and λ_cls are the weights of the coordinate regression loss, the confidence loss and the classification loss respectively;
a loop optimization unit for updating the weight parameters to be optimized of the pet detection model based on the total loss L_det until the weight parameters converge, to obtain the final pet detection model.
In some alternative embodiments, the pet identification module comprises:
the feature vector output unit is used for acquiring the feature vector of the pet based on the pet identity recognition model;
and the distance calculation unit is used for calculating the distance between the feature vector and each sample feature vector in the database, and determining the sample feature vector with the minimum distance, thereby determining the individual identity of the pet.
In some alternative embodiments, model training is performed using a pet image dataset labeled with individual identity IDs, and the pet identity recognition model is jointly optimized using cross entropy loss and triplet loss, wherein the total loss L_id of the pet identity recognition model is:
L_id = λ_CE * L_CE + λ_Triplet * L_Triplet;
wherein L_CE and L_Triplet are the cross entropy loss and the triplet loss respectively, and λ_CE and λ_Triplet are the weight coefficients of the cross entropy loss and the triplet loss respectively.
In some alternative embodiments, the feeding module comprises:
and the feeding control unit is used for delivering food matched with the individual identity of the pet at a set rate, and stopping delivering the food when the pet with that individual identity is no longer detected.
In a third aspect of the present application, a computer device includes a processor, a memory, and a computer program stored on the memory and executable on the processor, the processor executing the computer program for implementing a multiple pet feeding method based on image recognition.
In a fourth aspect of the present application, a readable storage medium stores a computer program which, when executed by a processor, implements the image recognition-based multi-pet feeding method described above. The computer-readable storage medium may be contained in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable storage medium carries one or more programs which, when executed by the apparatus, process data as described above.
Referring now to FIG. 2, a schematic diagram of a computer device 400 suitable for use in implementing embodiments of the present application is shown. The computer device shown in fig. 2 is only an example, and should not impose any limitation on the functionality and scope of use of embodiments of the present application.
As shown in fig. 2, the computer device 400 includes a Central Processing Unit (CPU) 401, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 402 or a program loaded from a storage section 408 into a Random Access Memory (RAM) 403. In the RAM403, various programs and data required for the operation of the device 400 are also stored. The CPU401, ROM402, and RAM403 are connected to each other by a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
The following components are connected to the I/O interface 405: an input section 406 including a keyboard, a mouse, and the like; an output section 407 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 408 including a hard disk or the like; and a communication section 409 including a network interface card such as a LAN card or a modem. The communication section 409 performs communication processing via a network such as the Internet. A drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 410 as needed, so that a computer program read therefrom is installed into the storage section 408 as needed.
In particular, according to embodiments of the present application, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 409 and/or installed from the removable medium 411. The above-described functions defined in the method of the present application are performed when the computer program is executed by a Central Processing Unit (CPU) 401. It should be noted that, the computer storage medium of the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In the present application, however, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules or units described in the embodiments of the present application may be implemented by software, or may be implemented by hardware. The modules or units described may also be provided in a processor, the names of which do not in some cases constitute a limitation of the module or unit itself.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions easily conceivable by those skilled in the art within the technical scope of the present application should be covered in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (10)
1. A multiple pet feeding method based on image recognition, comprising:
step S1, continuously acquiring images of a pet feeding area;
step S2, detecting a plurality of continuously input images based on a pre-trained pet detection model to obtain a pet image area containing pets;
step S3, carrying out pet identity recognition on the pet image area based on a pre-trained pet identity recognition model to acquire the individual identity of the pet;
and S4, determining feeding parameters matched with the individual identities of the pets, and delivering food according to the feeding parameters.
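The four claimed steps can be sketched as a simple processing loop. Here `detect`, `identify`, and the `feeding_params` mapping are hypothetical stand-ins for the trained detection model, the identity recognition model, and the per-pet configuration; none of these names appear in the claim:

```python
def feeding_cycle(frames, detect, identify, feeding_params):
    """Sketch of steps S1-S4 for a stream of feeding-area images.

    frames:         iterable of images of the pet feeding area (S1)
    detect:         callable, image -> list of cropped pet image regions (S2)
    identify:       callable, pet region -> individual identity ID (S3)
    feeding_params: mapping, identity ID -> feeding parameters (S4)
    """
    actions = []
    for frame in frames:
        for region in detect(frame):        # S2: pet image regions
            pet_id = identify(region)       # S3: individual identity
            params = feeding_params.get(pet_id)
            if params is not None:          # S4: deliver matched food
                actions.append((pet_id, params))
    return actions
```

In a real device the returned actions would drive the dispenser hardware rather than being collected in a list.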
2. The image recognition-based multi-pet feeding method of claim 1, wherein step S1 further comprises:
step S11, determining the latest feeding time of each pet to be fed;
and step S12, starting the image acquisition module a set time before the latest feeding time, and continuously acquiring images of the pet feeding area.
3. The image recognition-based multi-pet feeding method of claim 1, wherein step S2 further comprises:
step S21, outputting, based on the pet detection model, detection frame coordinates, a confidence, and a predicted pet category for the pets contained in the images;
step S22, filtering out detection frames whose confidence is lower than a preset value, and retaining the pet image areas containing pets.
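A minimal sketch of the confidence filtering in step S22, assuming each detection is represented as a dict (the field names `box`, `confidence`, and `category` are illustrative, not specified by the claim):

```python
def filter_detections(detections, conf_threshold=0.5):
    """Step S22: drop detection frames whose confidence is below a preset
    threshold, keeping only pet image regions that likely contain a pet.

    detections: list of dicts with keys 'box' (x1, y1, x2, y2),
                'confidence' (float in [0, 1]), and 'category'
    """
    return [d for d in detections if d["confidence"] >= conf_threshold]
```

The retained boxes would then be cropped from the frame and passed to the identity recognition model of step S3.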
4. The image recognition-based multi-pet feeding method of claim 3, further comprising training the pet detection model prior to step S21 by:
step S211, images with different pet types, different backgrounds, different postures and different illumination conditions are obtained to serve as training data, and labeling of pet target frames and categories is carried out on the training data;
step S212, processing the training data based on a pet detection model to be trained to obtain output containing the detection frame coordinates, confidence level and pet prediction type of the pet;
step S213, determining a total loss L_det of the pet detection model based on the following formula:
L_det = λ_coord * L_coord + λ_conf * L_conf + λ_cls * L_cls;
wherein L_coord, L_conf, and L_cls are the coordinate regression loss, the confidence loss, and the classification loss, respectively, and λ_coord, λ_conf, and λ_cls are the weights of the coordinate regression loss, the confidence loss, and the classification loss, respectively;
step S214, updating the weight parameters to be optimized of the pet detection model based on the total loss L_det until the weight parameters converge, thereby obtaining the final pet detection model.
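The weighted sum of step S213 is straightforward to compute once the three component losses are available. The default weight values below are assumptions (a larger coordinate weight follows the YOLO convention of emphasizing box regression); the patent leaves the coefficients unspecified:

```python
def detection_loss(l_coord, l_conf, l_cls,
                   lam_coord=5.0, lam_conf=1.0, lam_cls=1.0):
    """L_det = lambda_coord * L_coord + lambda_conf * L_conf + lambda_cls * L_cls.

    l_coord: coordinate regression loss of the predicted boxes
    l_conf:  confidence (objectness) loss
    l_cls:   pet-category classification loss
    """
    return lam_coord * l_coord + lam_conf * l_conf + lam_cls * l_cls
```

In step S214 this scalar would be backpropagated through the detection network each iteration until the weight parameters converge.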
5. The image recognition-based multi-pet feeding method of claim 1, wherein step S3 further comprises:
step S31, obtaining a feature vector of a pet based on a pet identity recognition model;
and S32, calculating the distance between the feature vector and each sample feature vector in the database, and determining the sample feature vector with the minimum distance, thereby determining the individual identity of the pet.
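Steps S31-S32 amount to nearest-neighbor matching against an enrollment database. A NumPy sketch, assuming Euclidean distance; the optional rejection threshold is an addition for illustration (the claim simply takes the minimum-distance sample):

```python
import numpy as np

def match_identity(query, gallery, max_dist=None):
    """Steps S31-S32: return the enrolled identity whose sample feature
    vector is closest to the query feature vector.

    query:    (D,) feature vector produced by the identity recognition model
    gallery:  dict mapping identity ID -> (D,) enrolled sample feature vector
    max_dist: optional rejection threshold (an assumption, not in the claim)
    """
    best_id, best_dist = None, np.inf
    for pet_id, feat in gallery.items():
        d = np.linalg.norm(query - feat)     # Euclidean distance
        if d < best_dist:
            best_id, best_dist = pet_id, d
    if max_dist is not None and best_dist > max_dist:
        return None                          # no enrolled pet is close enough
    return best_id
```

In practice the gallery vectors would come from the same model that produced the query, so the distances are comparable.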
6. The image recognition-based multiple pet feeding method of claim 5, wherein in step S31, model training is performed using a pet image dataset labeled with individual identity IDs, and the pet identity recognition model is jointly optimized using cross entropy loss and triplet loss, the total loss L_id of the pet identity recognition model being:
L_id = λ_CE * L_CE + λ_Triplet * L_Triplet;
wherein L_CE and L_Triplet are the cross entropy loss and the triplet loss, respectively, and λ_CE and λ_Triplet are their respective weight coefficients.
7. The image recognition-based multi-pet feeding method of claim 1, wherein step S4 further comprises:
delivering the food matched with the individual identity of the pet at a set rate, and stopping delivery when the pet with that individual identity is no longer detected.
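A sketch of this feeding control behavior as a bounded loop; the `detect_ids` and `dispense` callables, the rate value, and the safety bound are illustrative assumptions standing in for the camera pipeline and feeder hardware:

```python
def feed_while_present(pet_id, detect_ids, dispense,
                       rate_g_per_s=2.0, max_cycles=100):
    """Dispense food matched to one pet at a set rate, stopping as soon
    as that individual identity is no longer detected.

    detect_ids: callable returning the set of identity IDs currently seen
    dispense:   callable(grams) driving the feeder for one cycle
    max_cycles: safety bound on total dispensing (an assumption)
    """
    dispensed = 0.0
    for _ in range(max_cycles):
        if pet_id not in detect_ids():
            break                        # pet left the feeding area: stop
        dispense(rate_g_per_s)           # one cycle at the set rate
        dispensed += rate_g_per_s
    return dispensed
```

A real controller would also pace the loop against wall-clock time so the rate is truly grams per second.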
8. A multiple pet feeding device based on image recognition, comprising:
the pet feeding area image acquisition module is used for continuously acquiring images of the pet feeding area;
the pet image region acquisition module is used for detecting a plurality of continuously input images based on a pre-trained pet detection model to obtain a pet image region containing a pet;
the pet identification module is used for identifying the identity of the pet in the pet image area based on a pre-trained pet identity identification model, and obtaining the individual identity of the pet;
and the feeding module is used for determining feeding parameters matched with the individual identities of the pets and delivering foods according to the feeding parameters.
9. The image recognition-based multi-pet feeding device of claim 8, wherein the pet feeding area image acquisition module comprises:
the feeding time timing unit is used for determining the latest feeding time of each pet to be fed;
the image acquisition starting unit is used for starting the image acquisition module a set time before the latest feeding time, and continuously acquiring images of the pet feeding area.
10. The image recognition-based multi-pet feeding device of claim 8, wherein the feeding module comprises:
and the feeding control unit is used for delivering food matched with the individual identity of the pet at a set rate, and stopping delivery when the pet with that individual identity is no longer detected.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311666354.5A CN117475480A (en) | 2023-12-07 | 2023-12-07 | Multi-pet feeding method and device based on image recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117475480A true CN117475480A (en) | 2024-01-30 |
Family
ID=89629516
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311666354.5A Pending CN117475480A (en) | 2023-12-07 | 2023-12-07 | Multi-pet feeding method and device based on image recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117475480A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN211832336U (en) * | 2019-12-17 | 2020-11-03 | 广东顺德雷舜信息科技有限公司 | Pet feeding device and pet feeding system |
CN113420708A (en) * | 2021-07-06 | 2021-09-21 | 深圳市商汤科技有限公司 | Pet nursing method and device, electronic equipment and storage medium |
CN114793929A (en) * | 2022-04-27 | 2022-07-29 | 东南大学 | Multi-species pet feeding system based on visual identification and feeding method thereof |
CN116012414A (en) * | 2022-12-31 | 2023-04-25 | 湖南大学 | Video processing method, device, computer equipment and storage medium |
CN116224876A (en) * | 2023-03-13 | 2023-06-06 | 广东工业大学 | Multifunctional pet feeding machine and control method and device thereof |
CN116798066A (en) * | 2023-02-15 | 2023-09-22 | 西北农林科技大学 | Sheep individual identity recognition method and system based on deep measurement learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Fernandes et al. | A novel automated system to acquire biometric and morphological measurements and predict body weight of pigs via 3D computer vision | |
Nir et al. | 3D Computer-vision system for automatically estimating heifer height and body mass | |
Zhang et al. | Real-time sow behavior detection based on deep learning | |
Wang et al. | ASAS-NANP SYMPOSIUM: Applications of machine learning for livestock body weight prediction from digital images | |
Lee et al. | Practical monitoring of undergrown pigs for IoT-based large-scale smart farm | |
CN107680080B (en) | Sample library establishing method and checking method for livestock, storage medium and electronic equipment | |
CN111467074B (en) | Method and device for detecting livestock status | |
CN110826371A (en) | Animal identification method, device, medium and electronic equipment | |
Noe et al. | Automatic detection and tracking of mounting behavior in cattle using a deep learning-based instance segmentation model | |
Guo et al. | Detecting broiler chickens on litter floor with the YOLOv5-CBAM deep learning model | |
CN107392119A (en) | A kind of cattle farm monitor video method for real time filtering and system based on Spark Streaming | |
Yang et al. | A defencing algorithm based on deep learning improves the detection accuracy of caged chickens | |
Dao et al. | Automatic cattle location tracking using image processing | |
CN117475480A (en) | Multi-pet feeding method and device based on image recognition | |
CN109886721A (en) | A kind of pork price forecasting system algorithm | |
KR102528286B1 (en) | Automatic feeding system and method for fish farms based on AI | |
CN113971227B (en) | Big data based livestock monitoring method, system and readable storage medium | |
Nasiri et al. | An automated video action recognition-based system for drinking time estimation of individual broilers | |
Warhade et al. | Attention module incorporated transfer learning empowered deep learning-based models for classification of phenotypically similar tropical cattle breeds (Bos indicus) | |
CN115619823A (en) | Object putting method, device, terminal and computer readable storage medium | |
CN115482419A (en) | Data acquisition and analysis method and system for marine fishery products | |
Li et al. | Recognition of fine-grained sow nursing behavior based on the SlowFast and hidden Markov models | |
Bastiaansen et al. | Continuous real-time cow identification by reading ear tags from live-stream video | |
CN115760904A (en) | Livestock and poultry statistical method, device, electronic equipment and medium | |
CN113532616A (en) | Weight estimation method, device and system based on computer vision |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | | |
SE01 | Entry into force of request for substantive examination | | |