CN115240230A - Canine face detection model training method and device, and detection method and device - Google Patents

Canine face detection model training method and device, and detection method and device

Info

Publication number
CN115240230A
CN115240230A
Authority
CN
China
Prior art keywords
detection model
detection
canine
data set
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211134199.8A
Other languages
Chinese (zh)
Inventor
宋程
刘保国
胡金有
吴浩
梁开岩
郭玮鹏
李海
巩京京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xingchong Kingdom Beijing Technology Co ltd
Original Assignee
Xingchong Kingdom Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xingchong Kingdom Beijing Technology Co ltd filed Critical Xingchong Kingdom Beijing Technology Co ltd
Priority to CN202211134199.8A
Publication of CN115240230A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to the technical field of dog management, in particular to a method for identifying the facial features of a dog, and specifically to a training method and device, and a detection method and device, for a canine face detection model. The training method reduces the amount of sample data needed for training while yielding detection models of high accuracy; the detection method and device built on it can produce detection results for different purposes in different scenes; and, being based on deep learning, they detect faster and more efficiently than traditional algorithms.

Description

Canine face detection model training method and device, and detection method and device
Technical Field
The application relates to the technical field of dog management, in particular to a method for identifying the facial features of dogs, and specifically to a training method and device, and a detection method and device, for a canine face detection model.
Background
Target detection, also called target extraction, is image segmentation based on the geometric and statistical characteristics of targets; it combines the segmentation and identification of targets into one step. In complex scenes, multiple targets often must be processed in real time, so automatic target extraction and identification are particularly important. With the rapid development of computer vision technology, it is widely applied in the field of target detection, and various methods exist in the prior art for detecting specific targets, such as face detection, pedestrian detection and vehicle detection. Viewed from the development of object detection, research on face, pedestrian and vehicle detection is the most extensive, while research on dog face detection remains rare.
Dog face detection is an extremely important link in dog face identification and likewise belongs to the detection of a specific target. In the prior art, dog face detection mainly detects the dog face against the image background, or separates dog-face sub-windows from non-dog-face sub-windows. However, with growing demand and the corresponding technical development, dog face detection is no longer limited to distinguishing dog faces from non-dog faces in images; identification for use in other scenes, such as identification of dog types and dog breeds, is not realized in the current prior art.
Disclosure of Invention
In order to solve the above technical problems, the application provides a training method and device, and a detection method and device, for a canine face detection model, which can identify a dog face in a complex environment and, according to specific detection requirements, can also identify the breed of the detected dog.
In order to achieve the above purpose, the embodiments of the present application employ the following technical solutions:
in a first aspect, a training method for a canine face detection model comprises the following steps: acquiring a sample data set, wherein the sample data set comprises a first sample data set and a second sample data set, the first sample data set is configured with a plurality of sample images and a plurality of pieces of first labeling information on the sample images, the second sample data set is provided with a plurality of sample images and a plurality of pieces of second labeling information on the sample images, the sample images are canine face images to be trained on, the first labeling information is labeling information of feature points in the sample images, the second labeling information is labeling information of key feature points in the sample images, the feature points are used for representing canine features, and the key feature points are used for representing canine breed features; training an initial detection model based on the first sample data set until the initial detection model meets the requirement of network convergence, and adjusting the weight values in the initial detection model to obtain a first detection model; training the first detection model based on the second sample data set until the first detection model meets the requirement of network convergence, and adjusting the weight values in the first detection model to obtain a second detection model; and storing the first detection model and the second detection model respectively.
In a first implementation manner of the first aspect, the initial detection model is a convolutional neural network structure comprising a feature extraction network and a classification network; the feature extraction network comprises a data input layer, a convolution layer and a pooling layer, and the classification network (feature mapping layer) comprises a fully connected layer and an output layer.
With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner, the number of convolution layers is four.
In a third implementation manner of the first aspect, meeting the requirement of network convergence and adjusting the weight values in the initial detection model to obtain a first detection model includes: setting an initial learning rate, iterating the initial detection model based on the initial learning rate until the loss function is in a convergence state, and updating the weight values in the initial detection model based on stochastic gradient descent to obtain the first detection model; meeting the requirement of network convergence and adjusting the weight values in the first detection model to obtain a second detection model includes: setting an initial learning rate, iterating the first detection model based on the initial learning rate until the loss function is in a convergence state, and updating the weight values in the first detection model based on stochastic gradient descent to obtain the second detection model.
With reference to the third possible implementation manner of the first aspect, in a fourth possible implementation manner, the initial learning rate is 0.0001.
In a second aspect, a training device for a canine face detection model comprises: a sample data set acquisition module, used for acquiring a sample data set; a first detection model training module, used for training an initial detection model based on the sample data set to obtain a first detection model; a second detection model training module, used for training the first detection model based on the sample data set to obtain a second detection model; and a storage module, used for storing the first detection model and the second detection model respectively.
In a first implementation manner of the second aspect, the sample data set comprises a first sample data set and a second sample data set, the first sample data set is configured with a plurality of sample images and a plurality of pieces of first labeling information on the sample images, the second sample data set is provided with a plurality of sample images and a plurality of pieces of second labeling information on the sample images, the sample images are dog face images to be trained on, the first labeling information is labeling information of feature points in the sample images, the second labeling information is labeling information of key feature points in the sample images, the feature points are used for characterizing dog features, and the key feature points are used for characterizing dog breed features.
In a third aspect, a canine face detection method, using canine face detection models trained by the training method for a canine face detection model described above, comprises: acquiring a detection command, determining a detection strategy, and determining a model to be detected based on the detection strategy; acquiring image data to be detected; and identifying the image data to be detected based on the model to be detected to obtain a detection result; the model to be detected comprises a first detection model and a second detection model.
In a first implementation manner of the third aspect, the first detection model, the second detection model, and the detection strategies are configured with command tags.
In a fourth aspect, a canine face detection apparatus comprises: the detection model determining module is used for determining a detection strategy based on the detection command and determining a model to be detected based on the detection strategy; the image data acquisition module is used for acquiring image data to be detected; and the detection result acquisition module is used for identifying the image data to be detected based on the model to be detected to obtain a detection result.
The technical scheme provided by the embodiments of the application comprises a detection model training method and device and a detection method and device configured on their basis. With the training method and device, different detection models can be trained on the same sample data set, yielding detection for different purposes in different scenes; the amount of sample data needed for training is reduced while the trained models remain highly accurate; and the training method can be configured on a GPU to shorten training time. In the canine face detection method provided by the embodiments, targeted detection and identification are performed with the configured first and second detection models: the first detection model determines the dog information present in the image data, and the second detection model identifies the breed of that dog. The detection method and device also reduce the amount of computation required for detection, and because the first and second detection models are obtained through deep learning, the detection results are accurate and processing is efficient.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and other drawings can be obtained from them by those skilled in the art without creative effort.
The methods, systems and/or programs in the figures are further described through these exemplary embodiments, which are described in detail with reference to the drawings. These are non-limiting exemplary embodiments, in which like reference numbers represent similar structures throughout the several views of the drawings.
Fig. 1 is a schematic structural diagram of a terminal device provided in an embodiment of the present application.
Fig. 2 is a flow diagram of a canine face detection model training method shown in some embodiments of the present application.
Fig. 3 is a block diagram of a canine face detection model training apparatus according to some embodiments of the present application.
Fig. 4 is a flowchart of a canine face detection method provided in an embodiment of the present application.
Fig. 5 is a block schematic diagram of a canine face detection apparatus according to some embodiments of the present application.
Detailed Description
In order to better understand the technical solutions, the technical solutions of the present application are described in detail below with reference to the drawings and specific embodiments, and it should be understood that the specific features in the embodiments and examples of the present application are detailed descriptions of the technical solutions of the present application, and are not limitations of the technical solutions of the present application, and the technical features in the embodiments and examples of the present application may be combined with each other without conflict.
In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant guidance. It will be apparent, however, to one skilled in the art that the present application may be practiced without these specific details. In other instances, well-known methods, procedures, systems, compositions, and/or circuits have been described at a relatively high-level, without detail, in order to avoid unnecessarily obscuring aspects of the present application.
Flowcharts are used herein to illustrate the operations performed by systems according to embodiments of the present application. It should be expressly understood that the operations of a flowchart need not be performed exactly in the order shown; they may instead be performed in reverse order or simultaneously. In addition, at least one other operation may be added to a flowchart, and one or more operations may be removed from it.
Before the embodiments of the present invention are described in further detail, the terms and expressions used in the embodiments are explained; the following explanations apply to them.
(1) In response to: indicates the condition or state on which a performed operation depends; when the condition or state is satisfied, the operation or operations performed may occur in real time or with a set delay. Unless otherwise specified, there is no restriction on the order in which the operations are performed.
(2) Based on: indicates the condition or state on which a performed operation depends; when the condition or state is satisfied, the operation or operations performed may occur in real time or with a set delay. Unless otherwise specified, there is no restriction on the order in which the operations are performed.
(3) Convolutional neural network: a mathematical or computational model that mimics the structure and function of biological neural networks (the central nervous system of animals, particularly the brain) and is used to estimate or approximate functions.
(4) Deep learning: obtaining the intrinsic rules and representation levels of sample data based on a convolutional neural network and the sample data; it can recognize data such as text, images and sounds.
The embodiment of the invention provides a detection model training method and a detection method relating to the field of Artificial Intelligence (AI). AI is a comprehensive discipline spanning a broad range of technologies at both the hardware and software levels. Basic AI hardware technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like; AI software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
The main application scene of the technical scheme is the identification of dogs. Identification could be performed on a dog's whole-body features, but a dog's overall body shape differs little from that of other animals and its body features are not prominent, so detection based on whole-body features is neither ideal nor accurate. Dog management therefore relies on detection and recognition based mainly on the dog's face: the main purpose is to extract the image corresponding to the dog's facial features from complex image information, so as to determine the dog information in that image. Following this logic, images corresponding to the dog's face are obtained. In some complex use and management scenes, for example the fine-grained management of dogs, more information about the dog is needed and its category must be identified in depth; that is, both the dog information in the image and the corresponding breed and type must be recognized, mainly for subsequent data use and analysis. Current dog detection methods, and the dog detection models that implement them, cannot judge and identify dog breeds.
Based on the above technical background, the present embodiment provides a terminal device 100 comprising a memory 110, a processor 120 and a computer program stored in the memory and executable on the processor. The processor executes a dog face detection method and a dog face detection model training method, realizing both the detection of dog faces and the training of the detection models. In this embodiment the terminal device communicates with a user terminal and a platform terminal: the user terminal receives the information of the detection results, and the platform terminal obtains the trained dog face detection models. Information is sent over a network; before the terminal device is used, an association must be established between it and the user terminal, which can be done through registration. The terminal device may serve one client or several, each client communicating with the terminal device through passwords or other encryption. In this embodiment the terminal device is divided into two independent regions, one running the dog face detection method and the other running the dog face detection model training method.
In this embodiment the terminal may be a server whose physical structure includes a memory, a processor and a communication unit. These components are electrically connected to one another, directly or indirectly, to enable data transfer or interaction; for example, they may be connected via one or more communication buses or signal lines. The memory stores specific information and programs, and the communication unit sends the processed information to the corresponding user side.
In this embodiment the storage module is divided into two storage areas: a program storage unit and a data storage unit. The program storage unit is equivalent to a firmware area whose read-write permission is set to read-only, so the data stored there cannot be erased or changed. The data in the data storage unit can be erased or read and written; when the data storage area is full, newly written data overwrites the earliest historical data.
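As an illustration of this overwrite-oldest behavior of the data storage unit, a minimal Python sketch follows; collections.deque with maxlen reproduces the behavior and is of course not the device's actual firmware, and the capacity of 4 records is invented for the example.

```python
from collections import deque

data_store = deque(maxlen=4)              # capacity of the data storage unit
for record in ["r1", "r2", "r3", "r4", "r5"]:
    data_store.append(record)             # "r5" displaces the oldest record "r1"
print(list(data_store))                   # ['r2', 'r3', 'r4', 'r5']
```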
The Memory may be, but is not limited to, Random Access Memory (RAM), Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor may be an integrated circuit chip having signal processing capabilities. The Processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. The various methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed by it. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
Referring to fig. 4, in the present embodiment, a method for detecting a canine face is provided, which includes the following steps:
Step S410, acquiring a detection command, determining a detection strategy, and determining a model to be detected based on the detection strategy.
In this embodiment, a detection command can be issued in two modes: the first is active issuing, in which the issuer decides to issue a command and then initiates it; the other issues commands according to a preset rule, for example on a set schedule. There are likewise two types of detection command, and the corresponding detection strategies are determined from them: one type commands detection of whether a dog is present, the other commands detection of the dog's breed. For these two detection commands, two detection strategies are obtained: one strategy detects whether a dog exists, the other detects the corresponding breed of the dog. In this embodiment, each detection strategy is carried out by a configured detection model, with a different model configured for each strategy so that the different results can be obtained.
In this embodiment, the models to be detected are a first detection model and a second detection model, where the first detection model is used to detect whether a dog exists, and the second detection model is used to detect the type of the dog.
In this embodiment, the first detection model, the second detection model and the detection strategies are configured with command tags. A command tag is extracted from the issued command, the corresponding detection strategy is determined through the tag, the first or second detection model corresponding to the tag is determined from that strategy, and detection is carried out with the model so determined.
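For illustration, a minimal Python sketch of this tag-based dispatch; the tag values ("presence", "breed") and the function name are assumptions, since the patent does not specify a concrete tag format.

```python
def select_model(command_tag, first_model, second_model):
    """Map a command tag to the model of its detection strategy (sketch)."""
    strategies = {
        "presence": first_model,   # strategy 1: is a dog present?
        "breed": second_model,     # strategy 2: which breed is it?
    }
    if command_tag not in strategies:
        raise ValueError(f"unknown command tag: {command_tag!r}")
    return strategies[command_tag]
```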
Step S420, acquiring image data to be detected.
In this embodiment the detection logic obtains the information corresponding to the detection purpose from images, so this step mainly concerns the acquisition of the image data to be detected. The image data to be detected may be acquired from an externally configured camera.
Step S430, identifying the image data to be detected based on the model to be detected to obtain a detection result.
In this embodiment, the acquired image data to be detected is processed by the first detection model or the second detection model obtained in step S410.
The first detection model and the second detection model mainly extract the images meeting the feature requirements from the image data to be detected, for identification and detection.
The detection processes of the two models are described in detail below. Detection of the image data to be detected by the first detection model comprises the following steps:
inputting the image data into the first detection model and processing it to obtain the probability that the image contains a dog; given a preset probability threshold, when the computed probability is smaller than the threshold, the image is judged not to contain a dog; when the probability is equal to or greater than the threshold, the image is judged to contain a dog.
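As an illustration of this thresholding rule, a minimal PyTorch sketch follows; it assumes the first detection model outputs a single dog-presence logit, and the function name and default threshold of 0.5 are hypothetical, not values taken from the patent.

```python
import torch

def detect_dog(first_model: torch.nn.Module,
               image: torch.Tensor,
               prob_threshold: float = 0.5) -> bool:
    """Judge whether the image contains a dog face (sketch, assumed API)."""
    first_model.eval()
    with torch.no_grad():
        logit = first_model(image.unsqueeze(0))   # add a batch dimension
        prob = torch.sigmoid(logit).item()        # probability of "dog"
    return prob >= prob_threshold                 # below threshold: not a dog
```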
Detection of the image data to be detected by the second detection model comprises the following steps:
inputting the image data into the second detection model and processing it to obtain a set of probability values, one for each specific dog breed; given a preset probability threshold, when the computed probability is greater than or equal to the threshold, the image is judged to match preset dog breed information, and the maximum of the probability values is taken, the dog being judged to belong to the breed corresponding to that maximum.
In the other case, when the computed probability is smaller than the threshold, the image is judged not to match any preset dog breed information; this indicates a breed that is not covered by the second detection model. The corresponding breed is then determined by manual judgment, and the second detection model is trained again with the image information of the manually determined breed, yielding an updated second detection model.
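Likewise, a hedged sketch of the second model's decision rule: softmax over breed classes, accept the arg-max breed only when its probability reaches the threshold, and otherwise return no breed so the sample can be routed to manual labeling and retraining. The breed list and the threshold are invented for illustration.

```python
import torch

BREEDS = ["labrador", "poodle", "husky"]         # hypothetical breed label set

def classify_breed(second_model: torch.nn.Module,
                   image: torch.Tensor,
                   prob_threshold: float = 0.5):
    """Return the arg-max breed, or None when no breed clears the threshold."""
    second_model.eval()
    with torch.no_grad():
        probs = torch.softmax(second_model(image.unsqueeze(0)), dim=1)[0]
    best = int(probs.argmax())
    if probs[best].item() < prob_threshold:
        return None                              # unseen breed: manual review, then retrain
    return BREEDS[best]
```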
In this embodiment, the identified result is sent to the corresponding command issuer and may include the identified information. The method can also be configured in an intelligent dog-keeping system to identify dogs in different scenes.
This method can also be configured as a virtual device whose modules implement the steps of the process. Referring to fig. 5, this embodiment provides a canine face detection device 500 in this configuration, comprising: a detection model determining module 510, configured to determine a detection strategy according to the detection command and determine the model to be detected based on the detection strategy; an image data obtaining module 520, configured to obtain the image data to be detected; and a detection result obtaining module 530, configured to identify the image data to be detected based on the model to be detected, so as to obtain a detection result.
In the canine face detection method and device provided by this embodiment, targeted detection and identification are performed with the configured first and second detection models: the first detection model determines the dog information present in the image data, and the second detection model identifies the breed of that dog. The detection method and device also reduce the amount of computation required for detection, and because the first and second detection models are obtained through deep learning, the detection results are accurate and processing is efficient.
The first detection model and the second detection model used in steps S410 to S430 are trained by a corresponding training method, described with reference to fig. 2, which comprises the following process:
the embodiment provides a training method of a dog face detection model, which comprises the following steps:
and step S210, acquiring a sample data set.
In this embodiment, the model training method mainly trains a preset initial model on a sample data set until the training result meets a preset requirement, then adjusts the weight values of the initial model accordingly to obtain the target detection model.
Step S210 mainly acquires the basic data, i.e. the sample data set. In this embodiment the sample data set may be built from manually collected data, or, according to the requirements of the training scene, collected across the whole network by large-scale data crawling. Sample data sets differ with the use scene, the detection requirement and the detection scene; each sample data set is labeled according to its training purpose, the labels expressing and explaining the properties of the corresponding data.
The sample data set in this embodiment comprises a first sample data set and a second sample data set. The first sample data set is configured with a plurality of sample images and a plurality of pieces of first labeling information on those images; the second sample data set is provided with a plurality of sample images and a plurality of pieces of second labeling information on those images. The sample images are the dog face images to be trained on; the first labeling information labels the feature points in the sample images, and the second labeling information labels the key feature points; the feature points characterize dog features, and the key feature points characterize dog breed features.
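A minimal sketch of how the two annotation sets might be organized; the field names, file name and coordinate values are assumptions for illustration, not a schema given in the patent. Both records point at the same base image and differ only in their labels.

```python
# First sample data set: feature points that characterize "dog-ness".
first_sample = {
    "image": "dog_0001.jpg",
    "feature_points": [(102, 88), (156, 90), (128, 140)],    # illustrative coords
}

# Second sample data set: key feature points that characterize the breed.
second_sample = {
    "image": "dog_0001.jpg",                                 # same base image
    "key_feature_points": [(104, 86), (150, 92)],
    "breed": "labrador",                                     # hypothetical label
}
```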
In this embodiment, because the first detection model and the second detection model serve different detection purposes, they use the same basic data, namely the plurality of acquired sample images, but the labeling information attached to that data differs between the two models.
Step S220, training an initial detection model based on the first sample data set until the initial detection model meets the requirement of network convergence, and adjusting the weight values in the initial detection model to obtain a first detection model.
In this embodiment, the initial detection model is trained on the acquired sample data set to obtain a target detection model. The initial detection model is a convolutional neural network; for the detection scene of this embodiment, its structure comprises a feature extraction network and a classification network. The feature extraction network comprises a data input layer, convolution layers and pooling layers, and the classification network (feature mapping layer) comprises fully connected layers and an output layer. The number of convolution layers is four.
The convolutional neural network structure in this embodiment comprises a feature extraction network and a classification network. The feature extraction network comprises the data input layer, the convolution layers and the pooling layers; the four convolution layers extract canine facial features from low level to high level. Two kinds of pooling layer are provided: ordinary pooling layers, placed after each of the first two convolution layers, and a spatial pyramid pooling (SPP) layer with 1x1, 2x2 and 3x3 pooling windows, placed before the last convolution layer; a splicing layer behind the SPP layer splices the outputs of the three pooling windows. The classification network is composed of three fully connected layers and mainly maps the feature space to a set of discrete labels. It has two output layers: a processing layer that performs normalization and loss-function computation, used to calculate the network loss for back propagation, and an Accuracy layer used to compute the accuracy on the validation set. In this embodiment all activation functions of the convolutional neural network are PReLU.
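The following PyTorch sketch illustrates an architecture of this shape: four convolution layers with PReLU activations, pooling after the first two, an SPP stage with 1x1, 2x2 and 3x3 windows whose outputs are concatenated, and a three-layer fully connected classifier whose softmax is applied in the loss. The channel widths and num_classes are assumptions, and the sketch places SPP after the final convolution (the standard SPP-net layout) rather than before it as the text above has it, purely to keep the example simple.

```python
import torch
import torch.nn as nn

class SPP(nn.Module):
    """Pool the feature map at 1x1, 2x2 and 3x3 and concatenate the results."""
    def __init__(self, levels=(1, 2, 3)):
        super().__init__()
        self.pools = nn.ModuleList(nn.AdaptiveMaxPool2d(k) for k in levels)

    def forward(self, x):
        return torch.cat([p(x).flatten(1) for p in self.pools], dim=1)

class DogFaceNet(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.PReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.PReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.PReLU(),
            nn.Conv2d(128, 128, 3, padding=1), nn.PReLU(),
        )
        self.spp = SPP()                       # 1 + 4 + 9 = 14 cells per channel
        self.classifier = nn.Sequential(
            nn.Linear(128 * 14, 256), nn.PReLU(),
            nn.Linear(256, 128), nn.PReLU(),
            nn.Linear(128, num_classes),       # softmax is applied in the loss
        )

    def forward(self, x):
        return self.classifier(self.spp(self.features(x)))
```

With num_classes set to 2 the same structure could serve as the first (dog / not dog) model, and with the number of breeds as the second model, matching the shared architecture described above.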
Meeting the requirement of network convergence and adjusting the weight values in the initial detection model to obtain the first detection model comprises the following steps:
setting an initial learning rate, iterating the initial detection model based on the initial learning rate until the loss function is in a convergence state, and updating the weight values in the initial detection model based on stochastic gradient descent to obtain the first detection model.
Step S230, training the first detection model based on the second sample data set until the first detection model meets the requirement of network convergence, and adjusting the weight values in the first detection model to obtain a second detection model, comprises the following steps:
setting an initial learning rate, iterating the first detection model based on the initial learning rate until the loss function is in a convergence state, and updating the weight values in the first detection model based on stochastic gradient descent to obtain the second detection model.
Step S240, storing the first detection model and the second detection model respectively.
In this embodiment the initial learning rate is set to 0.0001 and is multiplied by 0.1 every ten thousand iterations; the weight values in the network are updated with stochastic gradient descent; and the batch size of the convolutional neural network is set to 128 according to the size of the video memory, i.e. 128 samples are taken from the training set for each training step.
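A sketch of that schedule in PyTorch: SGD with an initial learning rate of 1e-4 decayed by 0.1 every 10,000 iterations, and a batch size of 128. The model and dataset arguments are placeholders, the total iteration count is illustrative, and the loss function is not specified by the patent, so cross-entropy is assumed here.

```python
import torch
from torch.utils.data import DataLoader

def train(model, dataset, iterations=30_000):
    loader = DataLoader(dataset, batch_size=128, shuffle=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
    scheduler = torch.optim.lr_scheduler.StepLR(
        optimizer, step_size=10_000, gamma=0.1)   # lr x 0.1 every 10,000 steps
    loss_fn = torch.nn.CrossEntropyLoss()
    step = 0
    while step < iterations:
        for images, labels in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
            scheduler.step()                      # advance the decay schedule
            step += 1
            if step >= iterations:
                break
    return model
```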
In this embodiment, a first detection model and a second detection model satisfying the required results are obtained by training each model separately: the sample data sets used for the two models share the same basic data, which is labeled in two different ways, and training proceeds separately on each labeling. The second detection model is obtained by training the first detection model with the second labeled data and is used to detect the dog breed information in image data.
The detection model training method provided in this embodiment may also be configured as a virtual device; the configured virtual device is the training device 300 for a dog face detection model, comprising:
a sample data set obtaining module 310, configured to obtain a sample data set. The first detection model training module 320 is configured to train the initial detection model based on the sample data set to obtain a first detection model. The second detection model training module 330 is configured to train the first detection model based on the sample number data to obtain a second detection model. A saving module 340, configured to save the first detection model and the second detection model respectively.
With the detection model training method and training device provided by this embodiment, different detection models can be trained on the same sample data set, yielding detection for different purposes in different scenes. The training method and device reduce the amount of sample data needed for training while the trained detection models remain highly accurate, and the training method can be configured on a GPU to shorten training time.
The training method and the detection method provided in this embodiment are described above. The apparatus provided in this embodiment may also be implemented as a computer product embodied in at least one computer-readable medium, the product including computer-readable program code.
A computer readable signal medium may comprise a propagated data signal with computer program code embodied therein, for example, on a baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, and the like, or any suitable combination. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code on a computer readable signal medium may be propagated over any suitable medium, including radio, electrical cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the execution of aspects of the present application may be written in any combination of one or more programming languages, including object-oriented languages such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET and Python, conventional procedural languages such as the "C" programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP and ABAP, dynamic languages such as Python, Ruby and Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet), in a cloud computing environment, or as a service such as software as a service (SaaS).
It should be understood that, for technical terms not explicitly defined above, the skilled person can easily deduce their meaning from the above disclosure; the description is not limited in this respect.
Based on the above disclosure, the skilled person can determine without doubt the preset, reference, predetermined, set and preferred labels used here, such as threshold, threshold interval and threshold range. For technical feature terms that are not explained, the skilled person can implement the technical solution clearly and completely by reasonable and unambiguous derivation from the logical relations of the surrounding paragraphs. Prefixes of unexplained technical feature terms, such as "first", "second", "example" and "target", can be unambiguously derived from the context, as can suffixes such as "set" and "list".
The above disclosure of the embodiments of the present application will be apparent to those skilled in the art from the description. It should be understood that the process by which the skilled person derives and analyses unexplained technical terms rests on the contents described in the present application, so the above is not an inventive judgment on the overall scheme.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be considered as illustrative and not restrictive of the application. Various modifications, adaptations, and alternatives may occur to one skilled in the art, though not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present application and thus fall within the spirit and scope of the exemplary embodiments of the present application.
Also, this application uses specific terminology to describe embodiments of the application. Reference to "one embodiment," "an embodiment," and/or "some embodiments" means a feature, structure, or characteristic described in connection with at least one embodiment of the application. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various portions of this specification are not necessarily all referring to the same embodiment. Furthermore, certain features, structures, or characteristics may be combined as suitable in at least one embodiment of the application.
In addition, those skilled in the art will recognize that the various aspects of the application may be illustrated and described in terms of several patentable species or contexts, including any new and useful combination of procedures, machines, articles, or materials, or any new and useful modifications thereof. Accordingly, various aspects of the present application may be embodied entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.) or in a combination of hardware and software. The above hardware or software may be referred to as a "unit", "component", or "system".
Additionally, the order of the process elements and sequences described herein, the use of numerical letters, or other designations are not intended to limit the order of the processes and methods unless otherwise indicated in the claims. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it should be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware means, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
It should also be appreciated that in the foregoing description of embodiments of the present application, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding the understanding of at least one embodiment. This method of disclosure, however, is not intended to imply that more features are required than are expressly recited in the claims. Indeed, an embodiment may have fewer than all of the features of a single embodiment disclosed above.

Claims (10)

1. A training method for a dog face detection model is characterized by comprising the following steps:
acquiring a sample data set, wherein the sample data set comprises a first sample data set and a second sample data set, the first sample data set is configured with a plurality of sample images and a plurality of pieces of first labeling information on the sample images, the second sample data set is provided with a plurality of sample images and a plurality of pieces of second labeling information on the sample images, the sample images are canine face images to be trained on, the first labeling information is labeling information of feature points in the sample images, the second labeling information is labeling information of key feature points in the sample images, the feature points are used for representing canine features, and the key feature points are used for representing canine breed features;
training an initial detection model based on the first sample data set until the initial detection model meets the requirement of network convergence, and adjusting a weight value in the initial detection model to obtain a first detection model;
training the first detection model based on the second sample data set until the first detection model meets the requirement of network convergence, and adjusting a weight value in the first detection model to obtain a second detection model;
and respectively storing the first detection model and the second detection model.
2. The canine face detection model training method of claim 1, wherein the initial detection model is a convolutional neural network structure comprising a feature extraction network and a classification network; the feature extraction network comprises a data input layer, a convolution layer and a pooling layer, and the classification network (feature mapping layer) comprises a fully connected layer and an output layer.
3. The canine face detection model training method of claim 2, wherein the number of convolutional layers is four.
4. The training method of canine face detection model according to claim 1, wherein satisfying the requirement of network convergence and adjusting the weight values in the initial detection model to obtain the first detection model comprises:
setting an initial learning rate, iterating the initial detection model based on the initial learning rate until a loss function is in a convergence state, and updating a weight value in the initial detection model based on random gradient descent to obtain a first detection model;
satisfying the requirement of network convergence, and adjusting the weight value in the first detection model to obtain a second detection model, comprises:
setting an initial learning rate, iterating the first detection model based on the initial learning rate until a loss function is in a convergence state, and updating a weight value in the first detection model based on random gradient descent to obtain a second detection model.
5. The canine face detection model training method of claim 4, wherein the initial learning rate is 0.0001.
6. A training device for a canine face detection model, comprising:
the sample data set acquisition module is used for acquiring a sample data set;
the first detection model training module is used for training the initial detection model based on the sample data set to obtain a first detection model;
the second detection model training module is used for training the first detection model based on the sample data set to obtain a second detection model;
and the storage module is used for respectively storing the first detection model and the second detection model.
7. The training device for a canine face detection model according to claim 6, wherein the sample data set comprises a first sample data set and a second sample data set, the first sample data set has a plurality of sample images and a plurality of pieces of first labeling information on the sample images, the second sample data set has a plurality of sample images and a plurality of pieces of second labeling information on the sample images, the sample images are canine face images to be trained on, the first labeling information is labeling information of feature points in the sample images, the second labeling information is labeling information of key feature points in the sample images, the feature points are used for characterizing canine features, and the key feature points are used for characterizing canine breed features.
8. A canine face detection method using canine face detection models trained by the canine face detection model training method of any one of claims 1 to 5, comprising the following steps:
acquiring a detection command, determining a detection strategy, and determining a model to be detected based on the detection strategy;
acquiring image data to be detected;
identifying the image data to be detected based on the model to be detected to obtain a detection result;
the model to be detected comprises a first detection model and a second detection model.
9. The canine face detection method of claim 8, wherein the first detection model, the second detection model, and the detection strategy are configured with command tags.
10. A canine face detection device, comprising:
the detection model determining module is used for determining a detection strategy based on the detection command and determining a model to be detected based on the detection strategy;
the image data acquisition module is used for acquiring image data to be detected;
and the detection result acquisition module is used for identifying the image data to be detected based on the model to be detected to obtain a detection result.
CN202211134199.8A 2022-09-19 2022-09-19 Canine face detection model training method and device, and detection method and device Pending CN115240230A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211134199.8A CN115240230A (en) 2022-09-19 2022-09-19 Canine face detection model training method and device, and detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211134199.8A CN115240230A (en) 2022-09-19 2022-09-19 Canine face detection model training method and device, and detection method and device

Publications (1)

Publication Number Publication Date
CN115240230A 2022-10-25

Family

ID=83682051

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211134199.8A Pending CN115240230A (en) 2022-09-19 2022-09-19 Canine face detection model training method and device, and detection method and device

Country Status (1)

Country Link
CN (1) CN115240230A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080300055A1 (en) * 2007-05-29 2008-12-04 Lutnick Howard W Game with hand motion control
CN102436578A (en) * 2012-01-16 2012-05-02 宁波江丰生物信息技术有限公司 Formation method for dog face characteristic detector as well as dog face detection method and device
CN109034069A (en) * 2018-07-27 2018-12-18 北京字节跳动网络技术有限公司 Method and apparatus for generating information
CN109829380A (en) * 2018-12-28 2019-05-31 北京旷视科技有限公司 A kind of detection method, device, system and the storage medium of dog face characteristic point
CN111382727A (en) * 2020-04-02 2020-07-07 安徽睿极智能科技有限公司 Deep learning-based dog face identification method
CN113657318A (en) * 2021-08-23 2021-11-16 平安科技(深圳)有限公司 Pet classification method, device, equipment and storage medium based on artificial intelligence
CN113673439A (en) * 2021-08-23 2021-11-19 平安科技(深圳)有限公司 Pet dog identification method, device, equipment and storage medium based on artificial intelligence
CN114299546A (en) * 2021-12-30 2022-04-08 新瑞鹏宠物医疗集团有限公司 Method and device for identifying pet identity, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN110020592B (en) Object detection model training method, device, computer equipment and storage medium
CN112163465B (en) Fine-grained image classification method, fine-grained image classification system, computer equipment and storage medium
JP7414901B2 (en) Living body detection model training method and device, living body detection method and device, electronic equipment, storage medium, and computer program
CN110807491A (en) License plate image definition model training method, definition detection method and device
CN109816200B (en) Task pushing method, device, computer equipment and storage medium
CN112232293A (en) Image processing model training method, image processing method and related equipment
CN110188766B (en) Image main target detection method and device based on convolutional neural network
CN109271957B (en) Face gender identification method and device
CN115830399B (en) Classification model training method, device, equipment, storage medium and program product
CN109978058B (en) Method, device, terminal and storage medium for determining image classification
CN113177554B (en) Thyroid nodule identification and segmentation method, system, storage medium and equipment
CN114329022A (en) Method for training erotic classification model, method for detecting image and related device
CN114120090A (en) Image processing method, device, equipment and storage medium
CN114299546A (en) Method and device for identifying pet identity, storage medium and electronic equipment
CN114037886A (en) Image recognition method and device, electronic equipment and readable storage medium
CN116964588A (en) Target detection method, target detection model training method and device
CN115240230A (en) Canine face detection model training method and device, and detection method and device
CN116206334A (en) Wild animal identification method and device
CN115601728A (en) Vehicle identification method, device, equipment and storage medium
CN113689291A (en) Anti-fraud identification method and system based on abnormal movement
CN114255321A (en) Method and device for collecting pet nose print, storage medium and electronic equipment
CN114005017A (en) Target detection method and device, electronic equipment and storage medium
CN113515771A (en) Data sensitivity determination method, electronic device, and computer-readable storage medium
CN113592902A (en) Target tracking method and device, computer equipment and storage medium
CN111242951A (en) Vehicle detection method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20221025