CN111800455A

CN111800455A - Method for sharing convolutional neural network based on different host data sources in local area network

Info

Publication number: CN111800455A
Application number: CN202010404089.3A
Authority: CN
Inventors: 程知群; 田刚; 王飞; 尉倞浩
Original assignee: Hangzhou University Of Electronic Science And Technology Fuyang Institute Of Electronic Information Co ltd
Current assignee: Hangzhou University Of Electronic Science And Technology Fuyang Institute Of Electronic Information Co ltd; Hangzhou Dianzi University
Priority date: 2020-05-13
Filing date: 2020-05-13
Publication date: 2020-10-20
Anticipated expiration: 2040-05-13
Also published as: CN111800455B

Abstract

The invention discloses a method for sharing a convolutional neural network based on different host data sources in a local area network. The method comprises the following steps: step S1: pushing and pulling the upper computer interface by using a Qt layout data source; step S2: building a plurality of clients of local area network plug flow data sources and a receiving client of a unique pull flow data source, and using Nginx as a data source local area network forwarding server; step S3: training a deep learning model by a convolutional neural network; step S4: and carrying out data source face classification and identification by using the model, and realizing face identification and name labeling on pictures or videos. The deep learning is combined with the transmission of the video server local area network, so that the recognition efficiency is greatly improved, the defect of poor model single-machine mode portability is overcome, the deep learning model can be shared by different hosts, and the different hosts can carry out face labeling and recognition under the condition of not copying the training model.

Description

Method for sharing convolutional neural network based on different host data sources in local area network

Technical Field

The invention mainly relates to the field of deep learning image processing and the field of local area network communication, in particular to a method for sharing a convolutional neural network based on different host data sources in a local area network

Background

With the development of scientific technology in recent years, artificial intelligence is closer to the clothes and eating habits of people. The close combination of deep learning and image processing also changes our working life style, such as face-swiping payment, face-swiping card-punching, and the like. Convolutional neural networks play a great role in the image recognition and classification process, and tend to mature in practical application, but the use of training models is often in a single-machine environment, and certain challenges exist for data from a plurality of different host data sources.

The main problem is how to deal with different host data sources, and if it is certainly not a good way to distribute the model to multiple hosts, portability cannot be guaranteed.

Therefore, in order to solve the above problems, it is necessary to provide a technical solution to solve the technical problems in the prior art.

Disclosure of Invention

Aiming at the technical problems in the prior art, the invention aims to provide a method for sharing a convolutional neural network based on different host data sources in a local area network, which comprises a convolutional neural network module, an upper computer interface module and a data source comprising a picture and video local area network transmission server module, wherein data of the data source is submitted to a sharing server to form a data source queue, and then a training model is used together, so that model sharing is realized.

In order to overcome the defects of the prior art, the technical scheme of the invention is as follows:

a method for sharing a convolutional neural network based on different host data sources in a local area network comprises the following steps:

step S1: pushing and pulling the upper computer interface by using a Qt layout data source;

step S2: building a plurality of clients of local area network plug flow data sources and a receiving client of a unique pull flow data source, and using Nginx as a data source local area network forwarding server;

step S3: training a convolutional neural network deep learning model;

step S4: carrying out data source face classification and identification by using the model to realize face identification and name labeling on pictures or videos;

the step S1 further includes:

step S11: the data source plug-flow upper computer interface realizes the functions of selecting data sources and plug-flow to the server by different hosts;

step S12: the data source is pulled to flow an upper computer interface, so that the pulling function and the data source display after the identification are finished are realized;

the step S2 further includes:

step S21: and the plug flow data source client is realized, and plug flow of different hosts to the same forwarding server is realized. Because the data source can be a picture or a video, the judgment needs to be carried out during the stream pushing, if the data source is the video direct stream pushing, if the data source is the picture, the picture is converted into a video frame, and the stream pushing is carried out. When the video is pushed, because the data storage format is different from the transmission format, the data source needs to be decoded locally, the data source is decoded into an original data frame, and then the original data frame is encoded into a format capable of being transmitted in a network. The FFmpeg is called to realize the de-coding of the video source and the sending of the frame data to the data forwarding server of the same local area network, and the forwarding server plays a role of a data source message queue;

step S22: building a video forwarding server, wherein a rtmp (real-time message transfer protocol) protocol is selected, the forwarding server needs to be ensured to support rtmp, a host computer is selected to build the forwarding server in a local area network, an Nginx proxy server is used for determining the IP address and the port number of the server, and finally, the network bandwidth is ensured to avoid the phenomenon of blocking due to network congestion;

step S23: the data stream is pulled from the IP address and the port number of the forwarding server by the unique pull stream data source client, the FFmpeg is called to pull the data stream from the IP address and the port number of the forwarding server, the current data is in a transmission format, a format which needs to be decoded and displayed in a local area is used by a convolutional network, and the data enters a convolutional neural network model to display an identified image on an upper computer or can be selected to be stored to the local area.

The step S3 further includes:

step S31: obtaining an image source;

step S32: constructing a convolutional neural network;

step S33: after the multi-classification calculation softmax is normalized, a cross entropy loss function is used for carrying out multiple iterations on the loss function, and parameters are continuously updated through random batch gradient reduction until the loss function is converged;

step S34: and improving the accuracy of the model by avoiding dropout and cross validation through overfitting. For model evaluation, classification errors may occur due to the fact that human faces may be affected by external factors, and errors are allowed to occur. Calculating identification accuracy and recall rate, and if the identification accuracy and the recall rate meet an accuracy threshold and a recall rate threshold, determining that the classification is correct;

the step S31 further includes:

step S311: for the convolutional neural network, the data source used here is that the top 100 bits of the popularity of the film and the star are selected, the 100 bits are respectively subjected to convolutional training to obtain 50 face pictures, and the accuracy of recognition is improved through image enhancement.

Step S312: cutting a data source picture according to a certain resolution, carrying out image gray processing, normalizing pixel points, and carrying out one-hot coding on a corresponding name label;

the step S32 further includes:

step S321: the convolutional neural network changes the convolutional layer and the pooling layer by modifying the AlexNet model, and the number of the types of the final full-connection layer accords with classification. The generalization capability of recognition is improved by using a ReLU activation function, and the training time is greatly shortened by using massive data source data and GPU (graphics processing unit) for parallel processing;

the step S4 further includes:

step S41: and invoking OpenCV to determine the position of the face, and making a rectangular box mark.

Step S42; and calling the model stored by the convolutional neural network for classification, displaying the name in a rectangular box, and marking 'recognition failure' if the recognition is not successful or the name is not in a face library.

By adopting the technical scheme of the invention, the invention has the following beneficial effects:

1. different hosts in the whole local area network share the convolutional neural network model, the different hosts only need to push data to a forwarding server, and then the model obtains the data from the unique server for classification and identification, so that the universality of the model is greatly improved;

2. the model is prevented from being called among different host configurations, resources are saved, and recognition errors caused by improper environment configuration are avoided.

Drawings

Fig. 1 is an overall structure diagram of a method for sharing a convolutional neural network based on different host data sources in a local area network according to the present invention.

FIG. 2 is an interface diagram of an upper computer of a plug-flow client of the present invention.

FIG. 3 is a diagram of a pull streaming client host computer interface of the present invention.

FIG. 4 is a schematic diagram of FFmpeg decoding encoding and de-streaming a video file in accordance with the present invention.

FIG. 5 is a schematic diagram of the FFmpeg pull stream and decoding code for a video file according to the present invention.

Fig. 6 is a diagram of the forwarding process of the video stream at the Nginx server according to the present invention.

FIG. 7 is a schematic diagram of the convolutional neural network structure of the present invention.

Detailed Description

The technical solution provided by the present invention will be further explained with reference to the accompanying drawings.

The invention provides a method for sharing a convolutional neural network based on different host data sources in a local area network, which aims to improve the universality of a model and ensure that a fixed deep learning convolutional neural network model can be used for identifying data sources of the whole local area network, reduce the resource overhead and avoid the result that the model cannot be operated due to different configuration environments.

Fig. 1 is a general structural diagram of a method for sharing a convolutional neural network based on different host data sources in a local area network according to the present invention. Overall, the present invention includes 4 steps, step S1: pushing and pulling the upper computer interface by using a Qt layout data source; step S2: building a plurality of clients of local area network plug flow data sources and a receiving client of a unique pull flow data source, and using Nginx as a data source local area network forwarding server; step S3: training a convolutional neural network deep learning model; step S4: carrying out data source face classification and identification by using the model to realize face identification and name labeling on pictures or videos;

step S1 mainly provides a user interface for multiple plug flow clients and a single plug flow client, which greatly improves the efficiency of human-computer interaction. The Qt library is used as a main upper computer writing mode, and the method specifically comprises the following steps:

fig. 2 is a diagram of the interface of the upper computer of the stream pushing client in the step S11 in the method for sharing the convolutional neural network based on different host data sources in the local area network according to the present invention, which mainly completes the selection of the file, the display of the file, and the determination of the IP of the forwarding server to further achieve the stream pushing effect. Fig. 3 is a drawing of the interface diagram of the upper computer of the pull client in the step S12 in the method for sharing the convolutional neural network based on different host data sources in the local area network according to the present invention, which mainly completes the IP pull from a specific forwarding server and whether to select a model for face labeling using the convolutional network to finally display data or save data.

Step S2 is to decode the video data into the original data, and encode the original data into the data with a specific format that can be pushed for streaming transmission in the local area network, and it is necessary to ensure that Nginx forwards the rtmp protocol supporting video transmission as a server, and the method specifically includes the following steps:

Fig. 4 shows a specific encoding/decoding and stream pushing process in step S21 of the method for sharing a convolutional neural network based on different host data sources in a local area network according to the present invention, where an FFmpeg library function is used for operation, and the decoding is performed by reading local data, and then a transmission format is determined, so as to generate a new IP address of a server to which a transmission data stream is finally explicitly forwarded, and implement data source sending by a stream pushing function. Fig. 5 shows the forwarding process of the video server in the above step S22 in the method for sharing a convolutional neural network based on different host data sources in a local area network according to the present invention, where nginnx may receive multiple stream pushing ports, so that there is only one data source corresponding to a following model, and the previous actual data source is in a transparent state to the model, so as to implement a many-to-one function, so that it is as if data from one host is shared for a following recognition model. Fig. 6 shows that in the method for sharing a convolutional neural network based on different host data sources in a local area network according to the present invention, the step S23 uses a library function of FFmpeg to pull a stream of the stream data of the server, then decodes the stream of the stream data into a local data format, inputs the decoded data format into a subsequent recognition classification model, performs model recognition, and displays the model at the host.

Step S3 is mainly the building process of convolution nerve network, receiving gray image, through several convolution layers and several pooling layers, keeping characteristic point and finally outputting in the whole connection layer, through comparing with real result, continuously updating convolution parameter, through several iterations until loss function convergence, finally saving model. The method specifically comprises the following steps:

step S31: obtaining an image source;

step S32: constructing a convolutional neural network;

step S34: the accuracy of the model is improved by avoiding dropout and cross validation through overfitting; for model evaluation, as a film role may have an influence on face classification due to hairstyle makeup, the identification accuracy and recall rate are calculated, and if the identification accuracy and recall rate meet the accuracy threshold and recall rate threshold, classification is considered to be correct;

the step S31 further includes:

step S311: for a data source used in the convolutional neural network, selecting the top 100 bits of the star popularity of the film and television, performing convolutional training on fifty face pictures obtained from the 100 bits respectively, and improving the identification accuracy through image enhancement, and for the data source image, a crawler program can be written by using python, so that the preparation time is saved by crawling a hundred-degree picture library, and 5000 pictures are obtained in total and are divided into a training set and a testing set;

the step S32 further includes:

step S321: the convolutional neural network modifies the AlexNet model, adds convolutional layers and pooling layers, and accords with the classified number of types for the final full-connection layer. By using the ReLU activation function, the training time is greatly shortened due to the huge data source data and the parallel processing of the GPU;

fig. 7 shows that in the above step S32 of the method for sharing a convolutional neural network based on different host data sources in a local area network according to the present invention, by setting the dimension of the input data, the image data is preprocessed and cut into uniform resolution, the uniform resolution is pooled by multiple layers of rolling, and finally the image data is output by a full connection layer, where the output parameter is equal to the number of categories, here, the number of stars is 100, and thus the full connection layer has 100 parameters.

Step S4: the method comprises the steps of carrying out classification and identification on movie videos or star pictures by using a model, realizing actor identification and name labeling on the videos or the pictures, and clicking and using a convolutional neural network model on a pull stream upper computer, wherein the played videos are pictures with 100-bit star name labels. Therefore, different computers in the local area network can perform face labeling recognition. The step S4 further includes:

The above description of the embodiments is only intended to facilitate the understanding of the method of the invention and its core idea. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A method for sharing a convolutional neural network based on different host data sources in a local area network is characterized by comprising the following steps:

step S3: training a deep learning model by a convolutional neural network;

the step S1 further includes:

the step S2 further includes:

step S21: setting a plug flow data source client to realize plug flow of different hosts to the same forwarding server; firstly, judging whether a data source is a picture or a video, if so, directly pushing the stream, and if so, converting the picture into a video frame for pushing the stream; decoding a data source into an original data frame, and then coding into a format capable of being transmitted in a network; calling the FFmpeg to realize the decoding of the video source and send the frame data to a data forwarding server of the same local area network;

step S22: the method comprises the steps of building a video forwarding server, optionally building a forwarding server on one host in a local area network, determining an IP address and a port number of the server by using a Nginx proxy server, and transmitting data by adopting an rtmp protocol, wherein the Nginx proxy server is used for receiving a plurality of stream pushing terminals to realize a many-to-one transmission function, and data received by a rear-end identification model is equivalent to data from one host to realize sharing;

step S23: setting a unique pull stream data source client, calling FFmpeg to pull a data stream from an IP address and a port number of a forwarding server, decoding the data into a local picture format for a convolution network to use, and displaying an identified image on an upper computer or selectively storing the identified image to the local through a convolution neural network model;

the step S3 further includes:

step S31: obtaining an image source;

step S32: constructing a convolutional neural network;

step S34: the accuracy of the model is improved by avoiding dropout and cross validation through overfitting; and calculating the identification accuracy and the recall rate, and if the identification accuracy and the recall rate meet the accuracy threshold and the recall rate threshold, determining that the classification is correct.

2. The method according to claim 1, wherein the step S31 further comprises:

step S311: for the data source used in the convolutional neural network, the first 100 bits of the popularity of the film and the television are selected, the 100 bits are respectively subjected to convolutional training by obtaining 50 face pictures, and the recognition accuracy is improved through image enhancement;

the step S32 further includes:

step S321: the convolutional neural network changes the convolutional layer and the pooling layer by modifying the AlexNet model, and the number of the types of the final full-connection layer accords with classification.

3. The method according to claim 1, wherein the step S4 further comprises:

step S41: calling OpenCV to determine the position of the face, and marking a rectangular frame;