CN111428769A - Artificial intelligence translation system for designing pet behavior language by software - Google Patents
- Publication number
- CN111428769A (Application CN202010190580.0A)
- Authority
- CN
- China
- Prior art keywords
- model database
- neural network
- information
- artificial intelligence
- pet
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F18/2135—Feature extraction by transforming the feature space based on approximation criteria, e.g. principal component analysis
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio
- G06N3/045—Neural network architectures; combinations of networks
- G06N3/047—Probabilistic or stochastic networks
- G06N3/08—Neural network learning methods
- G10L17/26—Recognition of special voice characteristics, e.g. recognition of animal voices
- G10L25/30—Speech or voice analysis characterised by the use of neural networks
- G10L25/51—Speech or voice analysis specially adapted for comparison or discrimination
- G10L25/63—Speech or voice analysis specially adapted for estimating an emotional state
Abstract
The invention relates to an artificial intelligence translation system for pet behavior language, designed in software, and belongs to the fields of software design and artificial intelligence. The system comprises five parts: a user end, a data processing end, a classification processing end, an information summarizing end and a result expression end. By intelligently analyzing pet behavior language with convolutional neural network technology, combined with a classification pipeline of principal component analysis (PCA), a deep belief network (DBN) and a SoftMax function, the system achieves more efficient and more accurate recognition and translation of pet behavior language.
Description
Technical Field
The invention relates to an artificial intelligence translation system for pet behavior language, designed in software, and belongs to the fields of software design and artificial intelligence.
Background
Domestic research on pet behavior science started later than research abroad and has taken off only in recent years; at present there are no artificial intelligence research patents or software products for pet behavior language either at home or abroad, so the invention is original. Research on pet behavior language is of great significance for deepening communication between owners and pets, for pet training, and for the diagnosis and treatment of pets' psychological disorders. However, pet behavior science developed late in China, its theoretical foundation is weak and it is difficult to learn, which limits its application in daily life and in clinical diagnosis and treatment. In addition, applying computer technology to pet behavior language is difficult: traditional artificial intelligence techniques require large video or picture databases and struggle to achieve real intelligence, which is why no artificial intelligence software for pet behavior language exists at home or abroad.
The neural network recognition method currently common at home and abroad is the 2D CNN, which cuts a video into individual frames and then recognizes each frame separately; because it cannot model the motion information shared between frames along the time dimension, its recognition accuracy is low. In the convolutional neural network technology used here, the convolutional layer instead convolves images from three consecutive frames, replacing the single-frame recognition of the 2D CNN. This guarantees that each feature map remains connected to the preceding frame, so motion information is captured, the accuracy is higher and the technique is more advanced.
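Why convolving across frames matters can be illustrated with a minimal toy sketch (the function and the pixel values here are purely illustrative, not from the patent): a kernel that spans three consecutive frames responds to change over time, which per-frame 2D filtering cannot see.

```python
def temporal_conv(pixel_over_time, kernel=(-1, 0, 1)):
    """One pixel traced over time, filtered by a 3-frame temporal kernel.

    The difference-like kernel responds to motion (change between frames),
    not to static appearance within a single frame.
    """
    return [sum(k * pixel_over_time[t + i] for i, k in enumerate(kernel))
            for t in range(len(pixel_over_time) - 2)]

static = [5, 5, 5, 5, 5]      # pixel constant over time -> no response
moving = [0, 0, 5, 5, 5]      # brightness change -> nonzero response
print(temporal_conv(static))  # [0, 0, 0]
print(temporal_conv(moving))  # [5, 5, 0]
```

A per-frame 2D filter sees identical inputs for both sequences at any single time step; only the temporal kernel distinguishes them.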
Because audio signals are high-dimensional and highly redundant, a traditional audio emotion recognition model must first preprocess the audio, then extract suitable audio features according to actual needs, and finally build and train a model on the extracted features to achieve emotion recognition. The resulting recognition accuracy and efficiency are low, making practical application difficult.
Disclosure of Invention
To address the problems described in the background, the present invention provides an artificial intelligence translation system for pet behavior language.
The system of the invention is realized by the following steps:
In the first step, following the software prompts, the user shoots a time-limited video of a dog's or cat's behavior with an intelligent terminal such as a mobile phone, or selects an existing video of the behavior from the photo album, and submits it for upload. In the second step, the data processing end performs three functions, in no particular order: it extracts organ change representations of the eyes, nose, ears, tongue, teeth, lips and tail of the uploaded video according to the set organ feature points; it extracts representations of sitting, standing, lying, crawling and jumping changes according to the set movement feature points; and it extracts representations of pitch, timbre, duration and time-frequency change according to the set sound feature points. In the third step, the classification processing end, a solidified model database comprising an organ model database, a motion model database and a sound model database, matches the organ, behavior and sound representations collected by the data processing end against the solidified model database according to the set feature-point analysis method, yielding an operation result in the form of fragment-type information. In the fourth step, the information summarizing end aggregates this large amount of fragment-type information, deletes abrupt and contradictory vocabulary, and builds vocabulary logic into sentence expressions to obtain the processing result. In the fifth step, the result expression end matches a male, female or child voice mode according to the pet age and sex information provided at user registration, and expresses the processing result of the information summarizing end as speech.
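The five steps above can be wired together as a minimal pipeline skeleton (a sketch only; all cue names, database entries and the majority rule are illustrative placeholders, not the patent's actual data or method):

```python
from collections import Counter

# Illustrative stand-in for the solidified model database (step 3):
# each behavioral cue maps to fragment-type meanings.
SOLIDIFIED_DB = {
    "squint": ["conflict", "evasion", "error"],
    "half-lying": ["comfort", "obedience", "conflict"],
    "flattened-ears": ["fear", "obedience"],
}

def extract_features(video):
    # Step 2 (stubbed): organ / motion / sound characterizations
    # would come from the neural network models in a real system.
    return ["squint", "half-lying", "flattened-ears"]

def match_models(cues):
    # Step 3: look each cue up in the solidified model database.
    return [SOLIDIFIED_DB.get(cue, []) for cue in cues]

def summarize(fragments):
    # Step 4: keep meanings supported by more than one cue,
    # dropping isolated ("abrupt") words.
    counts = Counter(w for frag in fragments for w in set(frag))
    return sorted(w for w, c in counts.items() if c >= 2)

def translate(video):
    # Step 5 (speech synthesis) is omitted; the sentence content is returned.
    return summarize(match_models(extract_features(video)))

print(translate("dog.mp4"))  # ['conflict', 'obedience']
```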
The technical scheme of the invention is as follows:
An artificial intelligence translation system for pet behavior language, designed in software, comprises the following parts: a user end, a data processing end, a classification processing end, an information summarizing end and a result expression end.
The user end: installed on intelligent terminals such as mobile phones, tablets and computers in the form of a mobile phone app, a WeChat official account or computer software. Following the registration prompts, the user provides information on the pet's species, age, sex, living environment and living habits; following the operation prompts, the user shoots a time-limited video of the dog's or cat's behavior with the intelligent terminal, or selects an existing video from the photo album, and submits it for upload as the video information.
The data processing end: performs classification processing through the artificial intelligence techniques of a convolutional neural network, principal component analysis, a deep belief network and a SoftMax function, extracts the change representations of organs, sounds and behaviors, and generates the analysis objects. Specifically, it extracts organ change representations of the head, eyes, nose, ears, tongue, teeth, lips and tail of the uploaded video according to the set organ feature points; extracts representations of sitting, standing, lying, crawling and jumping changes according to the set movement feature points; and extracts representations of pitch, timbre, duration and time-frequency change according to the set sound feature points.
The classification processing end: a solidified model database comprising three databases, namely an organ model database, a motion model database and a sound model database. Taking the organ, behavior and sound representations collected by the data processing end as processing objects, it matches them against the solidified model database according to the set feature-point analysis method to obtain the operation result of the solidified model database.
The information summarizing end: a language processing module that reprocesses the operation result of the solidified model database at the classification processing end, establishes logical relations and expresses them through sentence making. The operation result of the solidified model database is fragment-type information; the information summarizing end aggregates this large amount of fragment-type information, deletes abrupt and contradictory words, and builds vocabulary logic into sentence expressions to obtain the processing result.
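One simple way the "delete abrupt and contradictory words" step could work is sketched below (the contradiction pairs and the majority-vote rule are assumptions for illustration, not the patent's actual method):

```python
from collections import Counter

# Hypothetical pairs of mutually exclusive meanings.
CONTRADICTORY = [("fear", "dominance"), ("comfort", "threat")]

def prune(words):
    """Drop the rarer member of each contradictory pair, then drop
    words that occur only once (treated as 'abrupt' fragments)."""
    counts = Counter(words)
    for a, b in CONTRADICTORY:
        if a in counts and b in counts:
            del counts[a if counts[a] < counts[b] else b]
    return [w for w, c in counts.items() if c >= 2]

print(prune(["fear", "fear", "dominance", "comfort", "obedience", "obedience"]))
# ['fear', 'obedience']
```

The surviving words would then be ordered by a sentence-making template before being handed to the result expression end.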
The result expression end: a language matching module that matches a male, female or child voice mode according to the pet species, age and sex information provided at user end registration, and expresses the processing result of the information summarizing end as speech.
The organ model database and the motion model database are built with the artificial intelligence technology of a convolutional neural network; the sound model database is built with a classification pipeline of principal component analysis (PCA), a deep belief network (DBN) and a SoftMax function.
The artificial intelligence technology of the convolutional neural network specifically comprises the following steps:
(1) first, the convolutional layer of the convolutional neural network selects images from three consecutive frames for convolutional recognition, guaranteeing that each feature map remains connected to the preceding frame so that motion information is captured;
(2) video frames of size 60 × 40 are taken as the input layer, and a 3D convolution kernel of size 7 × 7 × 3 is applied, where 7 × 7 is the spatial extent and 3 is the temporal extent, i.e. the kernel spans three consecutive frames;
(3) a 2 × 2 max pooling operation then down-samples the data;
(4) finally, a convolution is performed with a 7 × 6 × 3 kernel; because the resulting feature maps are very small, a down-sampling layer reduces them to size 7 × 4, after which the set organ behaviors and action behaviors of the animal are recognized.
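The shape arithmetic of these convolution and pooling steps can be traced with a small sketch (a "valid" convolution shrinks each axis by kernel size minus one; the 7-frame input length is an assumption for illustration, since the text only gives the 60 × 40 frame size):

```python
def conv_valid(shape, kernel):
    """Output shape of a 'valid' convolution: each axis shrinks by k - 1."""
    return tuple(i - k + 1 for i, k in zip(shape, kernel))

def pool(shape, window):
    """Non-overlapping pooling: each axis is divided by the window size."""
    return tuple(i // w for i, w in zip(shape, window))

x = (7, 60, 40)                      # (frames, height, width); 7 frames assumed
x = conv_valid(x, (3, 7, 7))         # 7x7x3 3D convolution  -> (5, 54, 34)
x = (x[0],) + pool(x[1:], (2, 2))    # 2x2 spatial max pooling -> (5, 27, 17)
x = conv_valid(x, (3, 7, 6))         # 7x6x3 convolution      -> (3, 21, 12)
print(x)
```

Each 3D convolution consumes two frames of temporal extent, which is why the frame axis shrinks from 7 to 5 to 3 across the two convolution stages.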
The principal component analysis method, deep belief network and SoftMax function pipeline specifically comprises the following steps:
(1) audio segments are identified with a short-time average energy algorithm;
(2) principal component analysis (PCA) then reduces the dimensionality of the redundant, noisy audio data;
(3) the processed audio data are input into a deep belief network (DBN) for training;
(4) finally, the audio data are classified with a SoftMax function, completing the audio emotion recognition.
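Step (1), the short-time average energy computation, can be sketched as follows (the frame length, hop size and threshold are illustrative defaults, not values from the patent):

```python
def short_time_energy(signal, frame_len=256, hop=128):
    """Average energy of each overlapping frame of the signal."""
    return [sum(s * s for s in signal[i:i + frame_len]) / frame_len
            for i in range(0, len(signal) - frame_len + 1, hop)]

def voiced_frames(signal, threshold=0.01, frame_len=256, hop=128):
    """Indices of frames whose energy exceeds the threshold: a simple
    way to keep only the segments that actually contain sound."""
    energies = short_time_energy(signal, frame_len, hop)
    return [i for i, e in enumerate(energies) if e > threshold]

silence = [0.0] * 512
tone = [1.0] * 512
print(voiced_frames(silence + tone))  # only the frames overlapping the tone
```

The frames selected this way would then be passed to the PCA and DBN stages of steps (2) and (3).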
The pet species that can be analyzed include dogs and cats.
The invention has the beneficial effects that:
(1) The invention is the first at home or abroad to realize a system that intelligently analyzes pet behavior language with convolutional neural network technology. The convolutional layer selects images from three consecutive frames for convolutional recognition, replacing the single-frame recognition of the 2D CNN, so that each feature map remains connected to the preceding frame and motion information is captured; the accuracy is higher and the technique more advanced.
(2) The invention is the first at home or abroad to realize a system that intelligently analyzes pet behavior language with a classification pipeline of principal component analysis (PCA), a deep belief network (DBN) and a SoftMax function. The pipeline gradually and autonomously learns the signal features and continuously optimizes its parameters, enabling more efficient and more accurate voice recognition.
(3) Realizing pet behavior language with artificial intelligence has five main benefits. First, pet behavior science is hard to learn and most people struggle to study and apply it; making it intelligent lowers the application threshold. Second, the user only needs to record a video of the pet with a multimedia terminal such as a mobile phone, tablet or computer to quickly identify the language expressed by the pet's behavior, realizing true artificial intelligence. Third, it facilitates communication between keepers and pets, helping the keeper understand the pet's expression immediately and improving the bond between the two. Fourth, it facilitates pet training, helping ordinary keepers train their pets and lowering the threshold of pet training. Fifth, because psychological diagnosis and treatment for pets is scarce in China and its accuracy is low, the system helps veterinarians diagnose and treat pets' psychological disorders and correct problem behaviors, and helps improve the accuracy of psychological disease diagnosis.
Drawings
FIG. 1 is a schematic diagram of the operation of the system of the present invention.
Fig. 2 is a schematic operation diagram of the data processing terminal according to the present invention.
Fig. 3 is a schematic diagram of the classification processing end of the present invention.
FIG. 4 is a technical route diagram of the convolutional neural network technology.
Fig. 5 is a technical route diagram of the sound extraction process.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
A user shoots a 15 s video of a dog's behavior with an intelligent terminal such as a mobile phone and submits it for upload. The data processing end extracts the changed organ representations according to the set organ feature points, for example: squinting, licking the mouth and nose, flattened ears, a half-lying posture and body scratching. The solidified model database at the classification processing end returns the fragment-type information for each change: squinting corresponds to conflict, evasion and admission of error; licking the mouth and nose to affection, error and pressure; flattened ears to fear and obedience; half lying to comfort, invitation, obedience and conflict; body scratching to conflict, obedience, weakness, and so on. The information summarizing end filters and matches the fragment information, deletes the abrupt and contradictory words, retains the similar or identical ones, and builds emotional logic into sentences: the dog feels conflicted, fears punishment and is admitting its error. The result expression end matches this meaning to the corresponding voice mode and speaks it, for example: "Master, I know I was wrong, please forgive me, don't hit me!"
Example 2
A user shoots a 15 s video of a cat's behavior with an intelligent terminal such as a mobile phone and submits it for upload. The data processing end extracts the changed organ representations, for example: flattened ears, direct staring, an erect tail, a standing posture, an arched back and a sharp cry. The solidified model database at the classification processing end matches the changed organ representations: flattened ears correspond to fear and obedience; direct staring to alertness, dominance, and so on; an erect tail to dominance, attack and threat; a sharp cry to fear, threat, intimidation, expectation, pain, and so on, yielding the fragment-type information for each change. The information summarizing end filters and matches the fragment information and builds emotional logic into sentences: the cat is inwardly afraid, its behavior is a vocal threat, and it is prepared to attack. The result expression end matches this meaning to the corresponding voice mode and speaks it, for example: "I'm scared, don't come any closer, or I'll bite you!"
The user uploads a video of the dog's or cat's behavior with an intelligent terminal such as a mobile phone and, through the translation processing of the artificial intelligence translation system for pet behavior language, obtains the corresponding emotional voice expression and text information, so that the user can understand what the pet is expressing without obstacles, eliminating the communication barrier between user and pet.
Claims (10)
1. An artificial intelligence translation system for designing pet behavior language by software is characterized by comprising the following parts: the system comprises a user side, a data processing side, a classification processing side, an information summarizing side and a result expressing side.
2. The user end according to claim 1, characterized in that: it is installed on intelligent terminals such as mobile phones, tablets and computers in the form of a mobile phone app, a WeChat official account or computer software; following the registration prompts, the user provides information on the pet's species, age, sex, living environment and living habits; following the operation prompts, the user shoots a time-limited video of the dog's or cat's behavior with the intelligent terminal, or selects an existing video from the photo album, and submits it for upload as the video information.
3. The data processing end according to claim 1, characterized in that: it performs classification processing through the artificial intelligence techniques of a convolutional neural network, principal component analysis, a deep belief network and a SoftMax function, extracts the change representations of organs, sounds and behaviors, and generates the analysis objects; specifically, it extracts organ change representations of the head, eyes, nose, ears, tongue, teeth, lips and tail of the uploaded video according to the set organ feature points; extracts representations of sitting, standing, lying, crawling and jumping changes according to the set movement feature points; and extracts representations of pitch, timbre, duration and time-frequency change according to the set sound feature points.
4. The classification processing end according to claim 1, characterized in that: it is a solidified model database comprising three databases, namely an organ model database, a motion model database and a sound model database; taking the organ, behavior and sound representations collected by the data processing end as processing objects, it matches them against the solidified model database according to the set feature-point analysis method to obtain the operation result of the solidified model database.
5. The information summarizing end according to claim 1, characterized in that: it is a language processing module that reprocesses the operation result of the solidified model database at the classification processing end, establishes logical relations and expresses them through sentence making; the operation result of the solidified model database is fragment-type information, and the information summarizing end aggregates this large amount of fragment-type information, deletes abrupt and contradictory words, and builds vocabulary logic into sentence expressions to obtain the processing result.
6. The result expression end according to claim 1, characterized in that: it is a language matching module that matches a male, female or child voice mode according to the pet species, age and sex information provided at user end registration, and expresses the processing result of the information summarizing end as speech.
7. The establishment of the organ model database and the motion model database according to claim 4 is realized by the artificial intelligence technology of a convolutional neural network; the sound model database is realized by a classification pipeline of principal component analysis (PCA), a deep belief network (DBN) and a SoftMax function.
8. The artificial intelligence technology of a convolutional neural network according to claim 3 or claim 7, characterized by the following steps:
(1) first, the convolutional layer of the convolutional neural network selects images of three consecutive frames for convolutional image recognition, and each feature map is connected to the image of the previous frame so as to capture motion information;
(2) consecutive video frames of size 60 × 40 are taken as the input layer, and a convolution operation is performed with a 3D convolution kernel of size 7 × 7 × 3, where 7 × 7 is the spatial dimension and 3 is the temporal dimension (three consecutive frames) of the kernel;
(3) the data are then down-sampled by a 2 × 2 max pooling operation;
(4) finally, a convolution operation is performed with a 7 × 6 × 3 convolution kernel; because the resulting feature map is very small, the data are processed by a down-sampling layer of size 7 × 4, and the set organ behaviors and action behaviors of the animal are finally recognized.
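The first convolution and pooling stage above can be sketched with the sizes the claim states (60 × 40 frames, a 7 × 7 × 3 kernel, 2 × 2 max pooling). This is a naive teaching sketch, not the patent's network: a real implementation would use a deep-learning framework, multiple feature maps, and trained weights.

```python
import numpy as np

def conv3d_valid(video, kernel):
    """Naive valid-mode 3D convolution over a (T, H, W) volume (sketch only)."""
    kt, kh, kw = kernel.shape
    T, H, W = video.shape
    out = np.empty((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[t, i, j] = np.sum(video[t:t+kt, i:i+kh, j:j+kw] * kernel)
    return out

def max_pool2d(fmap, size=2):
    """2x2 max pooling of a single 2D feature map."""
    H, W = fmap.shape
    H2, W2 = H // size, W // size
    return fmap[:H2*size, :W2*size].reshape(H2, size, W2, size).max(axis=(1, 3))

rng = np.random.default_rng(0)
video = rng.standard_normal((3, 60, 40))   # three consecutive 60x40 frames
kernel = rng.standard_normal((3, 7, 7))    # 7x7 spatial, 3 temporal (7x7x3)
fmap = conv3d_valid(video, kernel)         # valid conv -> (1, 54, 34)
pooled = max_pool2d(fmap[0])               # 2x2 max pooling -> (27, 17)
```

With valid-mode convolution, the three frames collapse to one temporal slice and the 60 × 40 spatial extent shrinks to 54 × 34, which the 2 × 2 pooling halves to 27 × 17.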
9. The principal component analysis method, deep belief network, and SoftMax function according to claim 3 or claim 7, characterized by the following steps:
(1) first, audio recognition is performed by a short-time average energy algorithm;
(2) next, redundant noise in the audio data is reduced in dimensionality by principal component analysis (PCA);
(3) the processed audio data are then input into a deep belief network (DBN) for training;
(4) finally, the audio data are classified by the SoftMax function, thereby completing audio emotion recognition.
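Steps (1), (2), and (4) of the audio pipeline can be sketched as follows. The DBN training of step (3) is omitted; the frame length, hop size, and feature dimensions are illustrative assumptions, not values from the patent.

```python
import numpy as np

def short_time_energy(signal, frame_len=256, hop=128):
    """Short-time average energy per frame (step 1); sizes are assumed."""
    frames = [signal[i:i+frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return np.array([np.mean(f ** 2) for f in frames])

def pca_reduce(X, k):
    """Project feature rows onto the top-k principal components (step 2)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def softmax(z):
    """SoftMax over class logits (step 4)."""
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(1)
energy = short_time_energy(rng.standard_normal(4096))  # per-frame energies
X = rng.standard_normal((20, 40))   # 20 audio frames, 40 raw features each
X_red = pca_reduce(X, k=5)          # PCA-reduced features -> (20, 5)
probs = softmax(np.array([1.0, 2.0, 0.5]))  # class probabilities, sum to 1
```

In the full pipeline of the claim, the PCA-reduced features would be fed to a trained DBN whose output layer applies the SoftMax shown here.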
10. Use of the artificial intelligence translation system for pet behavior language designed by software according to claim 1, wherein: the pet species that can be analyzed include dogs and cats.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010190580.0A CN111428769A (en) | 2020-03-18 | 2020-03-18 | Artificial intelligence translation system for designing pet behavior language by software |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111428769A true CN111428769A (en) | 2020-07-17 |
Family
ID=71548085
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010190580.0A Pending CN111428769A (en) | 2020-03-18 | 2020-03-18 | Artificial intelligence translation system for designing pet behavior language by software |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111428769A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112507157A (en) * | 2020-11-28 | 2021-03-16 | 爱荔枝科技(北京)有限公司 | Translation method between animals |
CN113208595A (en) * | 2021-07-10 | 2021-08-06 | 南京蓝洋智能科技有限公司 | Biological expression analysis method, device, storage medium and electronic equipment |
CN114023337A (en) * | 2021-10-12 | 2022-02-08 | 湖北文理学院 | End-to-end pet language translation method in deep learning |
2020-03-18: Application CN202010190580.0A filed in China (CN); published as CN111428769A; legal status: active, pending.
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20190016365A (en) * | 2017-08-08 | 2019-02-18 | 주식회사 창의산업 | Communication Apparatus between Pets and People |
US20190213401A1 (en) * | 2018-01-06 | 2019-07-11 | Shaobo Kuang | System and method of social networks for dogs or other pets |
CN108600701A (en) * | 2018-05-02 | 2018-09-28 | 广州飞宇智能科技有限公司 | A kind of monitoring system and method judging video behavior based on deep learning |
CN108766433A (en) * | 2018-05-11 | 2018-11-06 | 深圳双猴科技有限公司 | A kind of body language translation system and method |
CN110110707A (en) * | 2019-05-24 | 2019-08-09 | 苏州闪驰数控系统集成有限公司 | Artificial intelligence CNN, LSTM neural network dynamic identifying system |
Non-Patent Citations (1)
Title |
---|
萌宠君 (Mengchongjun): "Humans want to understand animal language — can AI help?", pages: 1 - 2 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111428769A (en) | Artificial intelligence translation system for designing pet behavior language by software | |
Zhang et al. | Intelligent facial emotion recognition and semantic-based topic detection for a humanoid robot | |
US10878818B2 (en) | Methods and apparatus for silent speech interface | |
WO2019237708A1 (en) | Interpersonal interaction body language automatic generation method and system based on deep learning | |
CN111496802A (en) | Control method, device, equipment and medium for artificial intelligence equipment | |
CN112686048B (en) | Emotion recognition method and device based on fusion of voice, semantics and facial expressions | |
CN113724882B (en) | Method, device, equipment and medium for constructing user portrait based on inquiry session | |
Kabani et al. | Emotion based music player | |
CN110210416B (en) | Sign language recognition system optimization method and device based on dynamic pseudo tag decoding | |
CN104299225A (en) | Method and system for applying facial expression recognition in big data analysis | |
WO2023226239A1 (en) | Object emotion analysis method and apparatus and electronic device | |
Verkholyak et al. | Modeling short-term and long-term dependencies of the speech signal for paralinguistic emotion classification | |
CN114550057A (en) | Video emotion recognition method based on multi-modal representation learning | |
CN111092798B (en) | Wearable system based on spoken language understanding | |
da Silva et al. | Recognition of affective and grammatical facial expressions: a study for Brazilian sign language | |
Gladys et al. | Survey on multimodal approaches to emotion recognition | |
Zhang et al. | Multimodal Sensing for Depression Risk Detection: Integrating Audio, Video, and Text Data | |
CN116383360A (en) | Method and system for detecting answer body fitness of psychological consultation chat robot | |
US11658928B2 (en) | Virtual content creation method | |
CN118214735A (en) | Dialog generation method and device, computer readable medium and electronic equipment | |
Liu et al. | A robust multi-modal emotion recognition framework for intelligent tutoring systems | |
Mishra et al. | Environment descriptor for the visually impaired | |
CN111145851A (en) | Mental state monitoring and evaluating system based on intelligent bracelet | |
CN114533063B (en) | Multi-source monitoring combined emotion computing system and method | |
CN113580166B (en) | Interaction method, device, equipment and storage medium of anthropomorphic robot |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||