CN108319888A - Video type recognition method, apparatus, and terminal - Google Patents

Video type recognition method, apparatus, and terminal

Info

Publication number
CN108319888A
Authority
CN
China
Prior art keywords
characteristic information
video
identified
information
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710041149.8A
Other languages
Chinese (zh)
Other versions
CN108319888B (en)
Inventor
谢世鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201710041149.8A
Publication of CN108319888A
Application granted
Publication of CN108319888B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217: Validation; Performance evaluation; Active pattern learning techniques
    • G06F18/2193: Validation; Performance evaluation; Active pattern learning techniques based on specific statistical tests
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G06F18/253: Fusion techniques of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

This application discloses a video type recognition method, apparatus, and terminal. The method includes: obtaining multi-dimensional feature information of frame images in a video to be identified, where the multi-dimensional feature information includes feature information of multiple dimensions, and the feature information of different dimensions uses different image recognition rules; converting the unstructured data in the multi-dimensional feature information into structured data, and fusing the converted feature information of the multiple dimensions to obtain fused structured feature information; and determining the video type of the video to be identified according to the fused structured feature information.

Description

Video type recognition method, apparatus, and terminal
Technical field
This application relates to the field of video recognition, and in particular to a video type recognition method, apparatus, and computer terminal.
Background
Deep learning technologies based on neural networks are widely used in every field of artificial intelligence, including natural language processing, speech recognition, and image recognition. In illegal-picture recognition (for example of pornographic pictures), a deep neural network extracts high-level information from the picture content through a multi-layer network structure and can directly judge whether a picture is illegal. In illegal-video recognition, each frame image of a video is processed separately with such an illegal-picture recognition model, and the results of all frames are then fused to judge whether the video is illegal. Deep learning is also used for picture text recognition: by extracting high-level information from the picture content, the text in a picture can be recognised fairly accurately. Text recognition is performed on every frame image of a video to judge whether it contains illegal keywords, and the results of all frames are then fused to decide whether the video is illegal.
In addition, deep learning features can be extracted from pictures to build an index: an index of illegal pictures is built with an index-lookup (investigation) technique, and the indexed illegal pictures are used for prevention and control. Each frame image of a video is checked against the index library, and the results of all frames are then fused to judge whether the video is illegal.
However, the illegal-video recognition method based on an illegal-picture model cannot recognise illegal videos of unknown illegal types, which leads to low accuracy and poor stability and makes the method hard to extend to other application scenarios. The illegal-video recognition method based on picture text recognition can only recognise illegal videos that contain illegal keywords; in many scenarios an illegal video does not necessarily contain illegal keywords, so the method has a narrow scope of application, fails to cover most illegal videos, and leaves a risk of exposure. The illegal-video recognition method based on the investigation index technique can only recognise videos whose frame images are already in the index library, and cannot recognise unknown illegal videos.
Moreover, because the results of these methods are data of different types, the methods cannot be fused with one another. Furthermore, after each of these methods processes every frame image of a video, only the feature information of the current dimension can be considered when the results of all frames are fused; the feature dimensions that can be exploited are limited, so such illegal-video recognition methods have low accuracy and poor robustness.
No effective solution to the above problems has yet been proposed.
Summary
Embodiments of the present application provide a video type recognition method, apparatus, and terminal, to solve at least the technical problem that the recognition rate of illegal videos is relatively low.
According to one aspect of the embodiments of the present application, a video type recognition method is provided, including: obtaining multi-dimensional feature information of frame images in a video to be identified, where the multi-dimensional feature information includes feature information of multiple dimensions, and the feature information of different dimensions uses different image recognition rules; converting the unstructured data in the multi-dimensional feature information into structured data, and fusing the converted feature information of the multiple dimensions to obtain fused structured feature information; and determining the video type of the video to be identified according to the fused structured feature information.
According to another aspect of the embodiments of the present application, a video type recognition method is further provided, including: obtaining multi-dimensional feature information of frame images in a video to be identified, where the multi-dimensional feature information includes feature information of multiple dimensions, and the feature information of different dimensions uses different image recognition rules; inputting the multi-dimensional feature information into a pre-trained image recognition model, which outputs structured feature information obtained by fusing the multi-dimensional feature information; and identifying the video type of the video to be identified according to the structured feature information.
According to another aspect of the embodiments of the present application, a computer device is further provided, including: an input interface for receiving the multi-dimensional feature information of frame images in an input video to be identified, where the multi-dimensional feature information includes feature information of multiple dimensions, and the feature information of different dimensions uses different image recognition rules; a processor for converting the unstructured data in the multi-dimensional feature information into structured data, fusing the converted feature information of the multiple dimensions to obtain fused structured feature information, and determining the video type of the video to be identified according to the fused structured feature information; and an output interface for outputting the video type.
According to another aspect of the embodiments of the present application, a computer device is further provided for presenting an interactive interface to a user, the interactive interface including: a first control for displaying the multi-dimensional feature information of frame images in a video to be identified, where the multi-dimensional feature information includes feature information of multiple dimensions, and the feature information of different dimensions uses different image recognition rules; a second control for displaying the structured data converted from the unstructured data in the multi-dimensional feature information; a third control for displaying the structured feature information obtained by fusing the converted multi-dimensional feature information; and a fourth control for displaying the video type of the video to be identified, identified according to the fused structured feature information.
According to another aspect of the embodiments of the present application, a video type recognition apparatus is further provided, including: an obtaining module for obtaining the multi-dimensional feature information of frame images in a video to be identified, where the multi-dimensional feature information includes feature information of multiple dimensions, and the feature information of different dimensions uses different image recognition rules; a conversion module for converting the unstructured data in the multi-dimensional feature information into structured data and fusing the converted feature information of the multiple dimensions to obtain fused structured feature information; and an identification module for determining the video type of the video to be identified according to the fused structured feature information.
According to another aspect of the embodiments of the present application, a video type recognition method is further provided, including: extracting structured feature information of the frame images of a video to be identified, the structured feature information reflecting the degree of similarity between a frame image and images of a specified type; extracting the text information in the video frames with a picture text recognition technique, and converting the unstructured text information into first structured information with a Gaussian-weight dictionary transform; extracting an investigation index library feature of the video frame sequence with the investigation (index-lookup) technique against an index library built from pictures of the specified type, and converting the unstructured result into second structured information with the Gaussian-weight dictionary transform; fusing the extracted structured feature information, first structured feature information, and second structured feature information to obtain fused structured feature information; training with the fused structured feature information to obtain a final video recognition model; and inputting the video to be identified into the video recognition model and outputting the recognition result for the video to be identified.
In the embodiments of the present application, the type of the video to be identified is determined from the structured data of the fused multi-dimensional feature information, which improves the recognition accuracy for the video to be identified and thereby solves the technical problem that the recognition rate of illegal videos is relatively low.
Brief description of the drawings
The accompanying drawings described here are provided for further understanding of the present application and form a part of the present application. The illustrative embodiments of the present application and their description are used to explain the present application and do not constitute an improper limitation of it. In the drawings:
Fig. 1 is a schematic diagram of the network architecture of a pornographic video recognition system according to an embodiment of the present application;
Fig. 2 is a schematic diagram of the principle of pornographic video recognition by fusing multiple feature dimensions according to an embodiment of the present application;
Fig. 3 is a schematic training flowchart of an optional video recognition model according to an embodiment of the present application;
Fig. 4 is a schematic diagram of an optional Gaussian model according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a computer terminal according to an embodiment of the present application;
Fig. 6 is a schematic flowchart of a video type recognition method according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of a video type recognition apparatus according to an embodiment of the present application;
Fig. 8 is a schematic flowchart of a video type recognition method according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of another computer terminal according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of another computer terminal according to an embodiment of the present application.
Detailed description of embodiments
To help those skilled in the art better understand the solution of the present application, the technical solutions in the embodiments of the present application are described below clearly and completely in conjunction with the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present application rather than all of them. Based on the embodiments in the present application, all other embodiments obtained by persons of ordinary skill in the art without creative effort shall fall within the protection scope of the present application.
It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data used in this way may be interchanged where appropriate, so that the embodiments of the present application described here can be implemented in an order other than those illustrated or described. In addition, the terms "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units explicitly listed, and may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.
To facilitate understanding of the embodiments of the present application, the technical terms involved in them are explained as follows:
Multi-dimensional feature information: the feature information of an object to be identified obtained separately with multiple recognition rules;
Structured data: data that can be stored in a database and logically expressed with a two-dimensional table structure, such as numbers and symbols;
Unstructured data: data that cannot be represented with numbers or in a unified way, such as text, images, sound, and web pages;
Pornographic score: a value indicating the degree of similarity between an image to be identified and pornographic images.
Embodiment 1
Current illegal-video recognition methods mainly include: the illegal-video recognition method based on an illegal-picture model, the illegal-video recognition method based on picture text recognition, and the illegal-picture recognition method based on the investigation index technique. The method based on an illegal-picture model builds an illegal-classification model for video frame images, recognises every frame image of a video with this classification model, and then fuses the results of all frames to decide whether the video is illegal. This kind of method extracts only the illegal feature information contained in the picture content, so it cannot recognise illegal videos of unknown types; for example, in pornographic video recognition it can only use skin-colour features, human-body features, and similar information in the video images to recognise pornographic images, and it cannot recognise pornographic videos whose pornographic content lies in text.
Pornographic video is taken as an example below.
Current pornographic video recognition schemes mainly fall into the following categories: the pornographic video recognition method based on a pornographic-picture model, the pornographic video recognition method based on pornographic text recognition, and the pornographic-picture recognition method based on the investigation index technique. Specifically:
In the pornographic video recognition method based on picture text recognition, text recognition is performed on every frame image of the video with a picture text recognition technique to extract text information, and the results of all frames are then fused to decide whether the video is pornographic. However, such a method can only recognise pornographic videos that contain text, and cannot recognise pornographic videos without text.
In the pornographic-picture recognition method based on the investigation index technique, an index is built from the pornographic video frame images recognised in the past and stored in a pornographic video frame image library. For every frame image of a video, the investigation technique is used to search how many similar results exist in the index library, and the results of all frames are then fused to decide whether the video is pornographic. However, because such a method builds its index from previously recognised video frame images, it can only recognise pornographic videos similar to known ones and cannot recognise newly appearing pornographic videos.
Because each of the above pornographic video recognition methods works in a single feature dimension, fusing several of them could let their respective advantages make up for their respective shortcomings. However, the result of the method based on the pornographic-picture model is structured information, the result of the method based on picture text recognition is unstructured information, and the result of the method based on the investigation index technique is semi-structured information, so these methods cannot be fused directly. If only the final judgement results were fused, the information between the feature dimensions would be lost and the features of the other dimensions could not be exploited, so the fused pornographic video recognition result would still have low accuracy and poor robustness.
The embodiments of the present application are mainly used to recognise pornographic videos. By fusing multi-dimensional features such as the picture text recognition feature, the investigation index feature, and the pornographic-picture classification feature, they address the low accuracy, poor extensibility, and low stability of conventional pornographic video recognition. In the conventional method, a separate pornographic-picture classification model is trained to judge whether each frame image of a video is a pornographic picture, and the recognition results of all frames are then fused to decide whether the video is pornographic. However, in complex Internet scenarios the types of pornographic videos are diverse and change quickly, and a recognition method based on a single pornographic-picture classification model can hardly cope.
The embodiments of the present application therefore propose to use multi-dimensional features such as the picture text feature, the investigation index feature, and the pornographic-picture classification feature, and to recognise pornographic videos by mixing and fusing these multi-dimensional features both along the time dimension of the video frames and within the feature dimensions of each video frame. In this way, the method exploits the advantages of picture text recognition, the investigation index technique, and the pornographic-picture classification model while overcoming their respective disadvantages, and solves the problems of low accuracy and poor robustness in conventional pornographic video recognition.
Before the specific implementation of this embodiment is described, a suitable network architecture that can be used to implement the principles of the present application is described with reference to Fig. 1. Fig. 1 is a schematic diagram of the network architecture of a pornographic video recognition system according to an embodiment of the present application. In Fig. 1, 101 denotes the terminals (such as mobile phones, tablet computers, and computers) of a distributed cluster, and 103 denotes the host in the distributed network used to train the pornographic video recognition model (i.e., the pornographic video recognition model runs on this host, and the host uses the model to recognise pornographic videos). As can be seen from Fig. 1, the model training stage involves a large amount of data computation and therefore needs the support of a distributed computing cluster, while the prediction stage (i.e., the pornographic video recognition stage) can use ordinary CPU computing resources.
It should be noted that the host 103 may be a server used to monitor pornographic videos in the network, or a virtual cloud computing resource that provides the pornographic video recognition function, but is not limited thereto.
Based on the above business scenario, an embodiment of the present application provides a pornographic video recognition method that fuses multiple feature dimensions. The principle of the method is shown in Fig. 2: the multi-dimensional feature information based on the pornographic-picture model, the picture text recognition tool, and the investigation index tool is converted into structured data and fused, and the final recognition result is then output. As shown in Fig. 3, the workflow of the neural network model used in the following scenario is: step S302, input the multi-dimensional feature information; step S304, convert the feature information of the video frame images into structured data with the Gaussian-weight dictionary transform; step S306, fuse the structured data with the neural network model to obtain the final recognition result; step S308, output the recognition result. Specifically, the whole flow is divided into a preparation stage and a service stage, as follows:
Preparation stage:
Step 1: Collect and label pornographic and non-pornographic video data, marking them as positive and negative examples respectively.
Step 2: Prepare an available pornographic-picture model, picture text recognition tool, and investigation index tool.
It should be noted that because the volume of pornographic and non-pornographic video data collected at this stage is large, this stage can be implemented with the terminals of the distributed cluster shown in Fig. 1.
Service stage:
Step 3: Extract the feature information of multiple dimensions from the video frames, with the following processing steps (a code sketch of these feature-extraction steps is given after step 3):
1) Extract the structured feature information of the frame images of the video to be identified. This structured feature information reflects the degree of similarity between a frame image and images of the specified type. Optionally, it may be obtained as follows, but is not limited thereto: compute the pornographic score of each video frame with the pornographic-picture model, yielding the feature sequence S1 = {s_{i,1}, s_{i,2}, ..., s_{i,n}}, where n is the number of frames and i = 1.
2) Extract the picture text recognition feature: extract the text information in the video frames with the picture text recognition technique, and convert the unstructured text information into the first structured information with the Gaussian-weight dictionary transform; that is, the text information extracted from the video frames is converted into structured information with the Gaussian-weight dictionary transform so that it can be fused later. The Gaussian-weight dictionary transform is shown in formula 1-1:
s_{i,k} = Σ_j g_j · x_{k+j}    (1-1)
where g_j denotes the Gaussian weight assigned to frame offset j and x_k denotes the intermediate score of frame k, i.e., the intermediate comparison score obtained when the unstructured information of each frame is converted into structured information. Because text information cannot be quantified directly, and a traditional language model is computationally expensive, inaccurate, and unstable in performance, a pornographic keyword dictionary is used instead: the pornographic-degree score of each frame image is obtained from the number of pornographic keywords hit. The Gaussian weights are illustrated in Fig. 4. Using the contextual information between video frames, the pornographic-degree scores of the frames before and after the current frame are taken into account, and the Gaussian weights are introduced to obtain the final pornographic score of the frame image.
This finally yields the feature sequence S2 = {s_{i,1}, s_{i,2}, ..., s_{i,n}}, where n is the number of frames and i = 2.
3) Extract the investigation index library feature: use the index library built from pictures of the specified type and the investigation technique to extract the investigation index library feature of the video frame sequence, then convert the unstructured result into the second structured information with the Gaussian-weight dictionary transform; that is, the index library built from pornographic pictures is used with the investigation technique to extract the investigation index library feature of the video frame sequence, and the unstructured information is then converted into structured information with the Gaussian-weight dictionary transform of formula 1-1 so that it can be fused later.
Because the result of an investigation index lookup is unstructured information that cannot be quantified directly, the pornographic-degree score of each frame image is obtained from the number of index library hits. Likewise, the contextual information between video frames is reused: the pornographic-degree scores of the frames before and after the current frame are taken into account, and the Gaussian weights are introduced to obtain the final pornographic score of the frame image.
This finally yields the feature sequence S3 = {s_{i,1}, s_{i,2}, ..., s_{i,n}}, where n is the number of frames and i = 3.
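As a concrete illustration of step 3, the following Python sketch shows how the three per-frame score sequences S1-S3 and the Gaussian-weight context smoothing of formula 1-1 could be computed. The scoring callables (pornographic-picture model, OCR keyword scorer, index-library hit scorer), the window radius, and sigma are assumptions for illustration; the patent does not fix these interfaces.

```python
import math

def gaussian_weights(radius, sigma=1.0):
    """Normalised Gaussian weights g_j for frame offsets j in [-radius, radius] (formula 1-1)."""
    w = {j: math.exp(-j * j / (2.0 * sigma * sigma)) for j in range(-radius, radius + 1)}
    total = sum(w.values())
    return {j: v / total for j, v in w.items()}

def smooth_scores(raw_scores, radius=2, sigma=1.0):
    """Final per-frame score s_k = sum_j g_j * x_(k+j), clamping at the video boundaries."""
    g = gaussian_weights(radius, sigma)
    n = len(raw_scores)
    out = []
    for k in range(n):
        s = 0.0
        for j, weight in g.items():
            idx = min(max(k + j, 0), n - 1)
            s += weight * raw_scores[idx]
        out.append(s)
    return out

def extract_feature_sequences(frames, porn_model, ocr_keyword_score, index_hit_score):
    """Structured sequences S1, S2, S3 for one video; the three callables are hypothetical."""
    s1 = [porn_model(f) for f in frames]                        # 1) pornographic-picture model score
    s2 = smooth_scores([ocr_keyword_score(f) for f in frames])  # 2) OCR keyword hits, context-smoothed
    s3 = smooth_scores([index_hit_score(f) for f in frames])    # 3) index-library hits, context-smoothed
    return s1, s2, s3
```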
Step 4: Fuse the extracted structured feature information, first structured feature information, and second structured feature information to obtain fused structured feature information; train with the fused structured feature information to obtain the final video recognition model; then input the video to be identified into the video recognition model and output the recognition result for the video to be identified. In other words, the information of the multiple feature dimensions is used to train a neural network that fuses the features into the final pornographic video recognition model. Because the pornographic degree of the current video frame can exploit both the contextual information between video frames and the information of the different feature dimensions, the neural network technique uses convolution to take full account of both the information between video frames and the information of the different feature dimensions; training yields the final pornographic video recognition model, which outputs the final score of the video belonging to pornographic video.
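The patent states only that a neural network fuses the three structured sequences by convolution, exploiting both the frame axis and the feature-dimension axis; the concrete architecture below is an assumed sketch (a small 1-D convolutional classifier written with PyTorch, which the patent does not prescribe).

```python
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    """Fuses the 3 structured feature sequences (as channels) along the frame axis and
    outputs the probability that the video belongs to the specified type."""
    def __init__(self, channels=3, hidden=16):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(channels, hidden, kernel_size=5, padding=2),  # mixes feature dimensions and neighbouring frames
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):          # x: (batch, 3, num_frames), rows stacked from S1, S2, S3
        h = self.conv(x)           # (batch, hidden, num_frames)
        h = h.mean(dim=2)          # pool over frames into a video-level feature
        return torch.sigmoid(self.head(h)).squeeze(-1)
```

Training on the positive and negative labels of step 1 would then proceed with an ordinary binary cross-entropy loop over the stacked 3 x n inputs; that loop is omitted here.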
It should be noted that because the amount of data involved in the model training process of steps 1-4 is large, these steps can be implemented with the terminals of the distributed cluster shown in Fig. 1.
Step 5: Self-feedback and update of the model. For an input video, the method proposed by the present application is used to decide whether the video is pornographic, and the result flows into the update feedback module, which judges as required whether the video needs to be fed back into the pornographic-picture classification model, whether it needs to be fed back into the index library, and whether it needs to be fed back into the picture text recognition module. In practical applications, this step ensures that the method proposed in this embodiment keeps optimising itself through self-feedback.
After the trained pornographic video recognition model is obtained, the host 103 in Fig. 1 can run the model to recognise pornographic videos. For example, in an optional embodiment, a video to be identified is taken as input and fed into the pornographic video recognition model, which performs recognition internally according to the above principles and outputs the type of the video to be identified, i.e., judges whether the type of the video to be identified is pornographic video.
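A minimal serving sketch for host 103, assuming the two sketches above (the feature extractors and a trained FusionNet); the decision threshold and the glue code are assumptions, not taken from the patent.

```python
import torch

def identify_video(frames, model, porn_model, ocr_keyword_score, index_hit_score, threshold=0.5):
    """Score one video with the trained fusion model and return its predicted type."""
    s1, s2, s3 = extract_feature_sequences(frames, porn_model, ocr_keyword_score, index_hit_score)
    x = torch.tensor([s1, s2, s3], dtype=torch.float32).unsqueeze(0)  # (1, 3, num_frames)
    with torch.no_grad():
        score = model(x).item()
    return ("pornographic" if score > threshold else "normal"), score
```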
In practical application scenarios, the quality of the labelled data and the way the computation results are fused are particularly important to the accuracy and recall of the final discrimination model. To show the advantage of the invention on standard data, we compared it with the conventional method in an e-commerce application scenario; the results are shown in Table 1 below:
Table 1
Algorithm | Accuracy rate | Recall rate | Manual review volume
Conventional method | 0.6% | 62.2% | 42%
Method proposed in this application | 15% | 65% | 23%
The data in Table 1 were computed on video data. As can be seen from the table, the present application is substantially better than the conventional method: with the recall rate essentially unchanged (in fact slightly improved), the accuracy rate is markedly higher and the manual review volume is greatly reduced.
Optionally, this embodiment may be implemented in the Python language, but is not limited thereto. The model training stage involves a large amount of data computation and needs the support of a distributed computing cluster, while the prediction stage can use ordinary CPU computing resources. When the investigation index technique is used and the index library is large, a large amount of computing and storage resources is needed.
Based on the above principles, this embodiment provides a video type recognition method, including: extracting the structured feature information of the frame images of a video to be identified, the structured feature information reflecting the degree of similarity between a frame image and images of the specified type; fusing the extracted structured feature information, first structured feature information, and second structured feature information to obtain fused structured feature information; training with the fused structured feature information to obtain the final video recognition model; and inputting the video to be identified into the video recognition model and outputting the recognition result for the video to be identified.
It should be noted that the above is described with pornographic video recognition as an example but is not limited to it: embodiments of the present application can also recognise specific types of legal videos (such as advertisement videos) and other types of illegal videos (such as videos involving illegal activities like gambling), configured flexibly according to the actual situation.
In this embodiment, because the multi-dimensional feature information is converted from unstructured data into structured data and the type of the video to be identified is determined from the structured data of the fused multi-dimensional feature information, the recognition accuracy for the video to be identified is improved, which solves the technical problem that the recognition rate of illegal videos is relatively low.
Embodiment 2
According to an embodiment of the present application, a method embodiment of the video type recognition method is further provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system such as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from the one given here.
The method embodiment provided in embodiment 1 of the present application may be executed on a mobile terminal, a computer terminal, or a similar computing device. Fig. 5 shows a hardware block diagram of a computer terminal (or mobile device) for implementing the video type recognition method. As shown in Fig. 5, the computer terminal 50 (or mobile device 50) may include one or more processors 502 (shown in the figure as 502a, 502b, ..., 502n; a processor 502 may include but is not limited to a processing unit such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 505 for storing data, and a transmission module 506 for communication functions. In addition, it may also include: a display, an input/output interface (I/O interface), a universal serial bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power supply, and/or a camera. Persons of ordinary skill in the art will understand that the structure shown in Fig. 5 is only illustrative and does not limit the structure of the above electronic device. For example, the computer terminal 50 may include more or fewer components than shown in Fig. 5, or have a configuration different from that shown in Fig. 5.
It should be noted that the one or more processors 502 and/or other data processing circuits mentioned above may generally be referred to here as a "data processing circuit". The data processing circuit may be embodied wholly or partly as software, hardware, firmware, or any other combination. In addition, the data processing circuit may be a single independent processing module, or may be wholly or partly integrated into any of the other elements of the computer terminal 50 (or mobile device). As involved in the embodiments of the present application, the data processing circuit acts as a kind of processor control (for example, the selection of the variable-resistance terminal path connected to the interface).
The memory 505 may be used to store software programs and modules of application software, such as the program instructions/data storage device corresponding to the method in the embodiments of the present application. The processor 502 runs the software programs and modules stored in the memory 505 to execute various functional applications and data processing, i.e., to implement the video type recognition method described above. The memory 505 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 505 may further include memory remotely located with respect to the processor 502, and such remote memory may be connected to the computer terminal 50 through a network. Examples of the above network include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The transmission device 506 is used to receive or send data via a network. A specific example of the above network may include a wireless network provided by the communication provider of the computer terminal 50. In one example, the transmission device 506 includes a network interface controller (NIC), which can be connected to other network devices through a base station so as to communicate with the Internet. In one example, the transmission device 506 may be a radio frequency (RF) module used to communicate with the Internet wirelessly.
The display may be, for example, a touch-screen liquid crystal display (LCD), which enables the user to interact with the user interface of the computer terminal 50 (or mobile device).
In the above operating environment, the present application provides the video type recognition method shown in Fig. 6. Fig. 6 is a flowchart of a video type recognition method according to an embodiment of the present application. As shown in Fig. 6, the method includes steps S602 to S608:
Step S602: obtain the multi-dimensional feature information of frame images in a video to be identified, where the multi-dimensional feature information includes feature information of multiple dimensions, and the feature information of different dimensions uses different image recognition rules.
As an optional embodiment of the present application, the feature information of different dimensions may include, but is not limited to: feature information obtained with a picture recognition model; feature information obtained from text recognition in the picture; and the number of pictures in a preset-type database whose similarity to the frame image exceeds a preset threshold. Optionally, the feature information of these multiple dimensions corresponds respectively to the feature information used by the following image recognition rules: the pornographic-picture model, the picture text recognition tool, and the investigation index tool. An illustrative layout of such multi-dimensional feature information is sketched below.
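Purely as an illustration of what the multi-dimensional feature information of one frame might look like before conversion, under the three example rules just listed (the field names and values are hypothetical, not part of the patent):

```python
# One frame's multi-dimensional feature information: one entry per recognition rule.
frame_features = {
    "picture_model_score": 0.87,           # structured: score from the picture recognition model
    "ocr_text": "free live show tonight",  # unstructured: text recognised in the frame
    "index_hits": 12,                      # count of similar pictures found in the preset-type database
}
```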
Step S604: convert the unstructured data in the multi-dimensional feature information into structured data, and fuse the converted multi-dimensional feature information to obtain fused structured feature information.
In an optional embodiment, this step can be implemented as follows, but is not limited thereto: for the feature information of each dimension in the multi-dimensional feature information, convert the feature information of that dimension from unstructured data into a characteristic value of the frame image according to a preset rule, where the characteristic value reflects the degree of similarity between the frame image and images of the specified type. The characteristic value is a value determined according to a specific rule to express the degree of similarity to images of the specified type, for example a score assigned to the frame image according to that rule, and the rule can be determined from a large amount of statistical data.
Optionally, the characteristic value can be determined in at least one of the following ways: 1) determine the characteristic value according to a correspondence between the feature information and the characteristic value; 2) determine the characteristic value according to the number of characteristic parameters in the frame image; 3) determine the final characteristic value of the frame image according to the contextual information of the frame image and the current characteristic value of the frame image.
The first way can be implemented as follows, but is not limited thereto: a correspondence between characteristic values and specific feature information is set, and when the specific feature information appears in a frame image of the video to be identified, the value corresponding to that feature information is taken directly as the characteristic value of the frame image.
For the second way, the number itself may be taken directly as the characteristic value, or the number may first be processed according to a certain operation rule (for example taking its square root) and the result taken as the characteristic value of the frame image.
The third way can take the following form, but is not limited thereto: the final characteristic value of the frame image is determined from the characteristic values of the frames before and after it, the weights assigned to those preceding and following frames, and the current characteristic value of the frame image; the video type of the video to be identified is then determined according to the fused structured feature information. A sketch of these three conversion modes is given below.
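A sketch of the three conversion modes described above, under assumed rules (the lookup table, the square-root squash, and the neighbour weights are illustrative choices, not prescribed by the patent):

```python
import math

def value_from_lookup(feature, table, default=0.0):
    """Mode 1: map specific feature information to a pre-defined characteristic value."""
    return table.get(feature, default)

def value_from_count(num_hits):
    """Mode 2: derive the characteristic value from a count, here via a square-root squash."""
    return math.sqrt(num_hits)

def value_with_context(prev_vals, current, next_vals, weights):
    """Mode 3: combine the current value with the values of the preceding and following
    frames using the assigned weights (e.g. the Gaussian weights of formula 1-1)."""
    neighbours = list(prev_vals) + [current] + list(next_vals)
    return sum(w * v for w, v in zip(weights, neighbours))
```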
In an optional embodiment, because the feature information used by the different pornographic video recognition methods differs, it is hard to fuse them directly. This embodiment therefore proposes a Gaussian-weight dictionary transform for converting unstructured data into structured data, and then, combining the information of the different feature dimensions between video frames, fuses the multiple feature dimensions with a neural network learning technique to decide whether the video is pornographic. Because the Gaussian-weight dictionary transform converts unstructured data into structured data and turns the feature information of different dimensions into different feature information of the same dimension, it greatly helps to improve the robustness of the subsequent feature-fusion algorithm. In addition, because simply fusing the separate results would lose the information between video frames and between the different feature dimensions, this embodiment uses both the contextual information between frames and the information of the different feature dimensions, fusing the information of the multiple feature dimensions with the neural network technique to obtain the final judgement result, which improves the accuracy and robustness of the algorithm.
Step S606: identify the video type of the video to be identified according to the fused structured feature information.
Optionally, step S606 can take the following form, but is not limited thereto: determine the characteristic values of all the frame images in the video to be identified according to the fused structured feature information; determine the characteristic value of the video to be identified according to the characteristic values of all the frame images; and when the characteristic value of the video to be identified exceeds a preset threshold, determine that the video type of the video to be identified is the specified type.
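A sketch of this video-level decision, assuming a simple mean over the per-frame characteristic values (the patent does not specify the aggregation rule or the threshold):

```python
def classify_video(frame_values, threshold):
    """Aggregate the fused per-frame characteristic values and compare against the threshold."""
    video_value = sum(frame_values) / max(len(frame_values), 1)
    return ("specified type" if video_value > threshold else "other"), video_value
```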
In this embodiment, because the multi-dimensional feature information is converted from unstructured data into structured data and the type of the video to be identified is determined from the structured data of the fused multi-dimensional feature information, the recognition accuracy for the video to be identified is improved, which solves the technical problem that the recognition rate of illegal videos is relatively low.
It should be noted that, for the sake of simple description, each of the foregoing method embodiments is expressed as a series of action combinations; however, those skilled in the art should understand that the present application is not limited by the described action sequence, because according to the present application some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus the necessary general hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product that is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to execute the methods described in the embodiments of the present application.
Embodiment 3
According to an embodiment of the present application, an apparatus for implementing the above video type recognition method is further provided. As shown in Fig. 7, the apparatus includes:
an obtaining module 70 for obtaining the multi-dimensional feature information of frame images in a video to be identified, where the multi-dimensional feature information includes feature information of multiple dimensions, and the feature information of different dimensions uses different image recognition rules;
a conversion module 72 for converting the unstructured data in the multi-dimensional feature information into structured data and fusing the converted feature information of the multiple dimensions to obtain fused structured feature information; and
an identification module 74 for determining the video type of the video to be identified according to the fused structured feature information.
It should be noted that the modules in this embodiment can be implemented in the form of software or hardware. For the latter, this can be achieved as follows, but is not limited thereto: the above modules are all located in the same processor, or the above modules, in any combination, are located in different processors.
It should be noted that for preferred implementations of this embodiment, reference may be made to the related descriptions in embodiments 1-2, which are not repeated here.
Embodiment 4
This embodiment provides a video type recognition method. As shown in Fig. 8, the method includes:
Step S802: input a video to be identified into a pre-trained image recognition model, which outputs structured feature information obtained by fusing multi-dimensional feature information, where the image recognition model is a model trained on the multi-dimensional feature information of frame images in videos to be identified, the multi-dimensional feature information includes feature information of multiple dimensions, and the feature information of different dimensions uses different image recognition rules.
In an optional embodiment, the structured feature information is obtained as follows: for the feature information of each dimension in the multi-dimensional feature information, the feature information of that dimension is converted from unstructured data into a characteristic value of the frame image according to a preset rule, where the characteristic value reflects the degree of similarity between the frame image and images of the specified type. The characteristic value is a value determined according to a specific rule to express the degree of similarity to images of the specified type, for example a score assigned to the frame image according to that rule, and the rule can be determined from a large amount of statistical data.
Step S804: identify the video type of the video to be identified according to the fused structured feature information.
It should be noted that for preferred implementations of this embodiment, reference may be made to the related descriptions in embodiments 1-2, which are not repeated here.
Embodiment 5
This embodiment provides a computer device. Fig. 9 is a schematic structural diagram of another computer terminal according to an embodiment of the present application. As shown in Fig. 9, it includes: an input interface 90 for receiving the multi-dimensional feature information of frame images in an input video to be identified, where the multi-dimensional feature information includes feature information of multiple dimensions, and the feature information of different dimensions uses different image recognition rules; a processor 92 for converting the unstructured data in the multi-dimensional feature information into structured data, fusing the converted feature information of the multiple dimensions to obtain fused structured feature information, and determining the video type of the video to be identified according to the fused structured feature information; and an output interface 94 for outputting the video type.
Optionally, the computer device may be a server on the network, or the device where a virtual cloud computing resource resides in cloud computing. The computer device receives the video to be identified through the input interface 90, converts the unstructured data in the multi-dimensional feature information into structured data with the image recognition model running on the processor 92, fuses the converted feature information of the multiple dimensions to obtain fused structured feature information, and finally outputs the video type of the video to be identified through the output interface 94, for example whether it is an illegal video (such as a pornographic video).
It should be noted that the output interface 94 may be connected to a display, i.e., the video type is displayed on the computer device's local display, or may be connected to a network interface so that the video type is displayed over a network connection (for example in a browser), but the form of presentation is not limited to these.
It should be noted that for preferred implementations of this embodiment, reference may be made to the related descriptions in embodiments 1 and 2, which are not repeated here.
Embodiment 6
This embodiment provides a computer device for presenting an interactive interface to a user. As shown in Fig. 10, the interactive interface 100 includes: a first control 102 for displaying the multi-dimensional feature information of frame images in a video to be identified, where the multi-dimensional feature information includes feature information of multiple dimensions, and the feature information of different dimensions uses different image recognition rules; a second control 104 for displaying the structured data converted from the unstructured data in the multi-dimensional feature information; a third control 106 for displaying the structured feature information obtained by fusing the converted multi-dimensional feature information; and a fourth control 108 for displaying the video type of the video to be identified, identified according to the fused structured feature information.
It should be noted that for preferred implementations of this embodiment, reference may be made to the related descriptions in embodiments 1 and 2, which are not repeated here.
Embodiment 7
Embodiments herein can provide a kind of terminal, which can be in terminal group Any one computer terminal.Optionally, in the present embodiment, above computer terminal can also replace with mobile whole The terminal devices such as end.
Optionally, in the present embodiment, above computer terminal can be located in multiple network equipments of computer network At least one network equipment.
In the present embodiment, above computer terminal can be with following in the recognition methods of the video type of executing application The program code of step:Obtain the various dimensions characteristic information of frame image in video to be identified, wherein the various dimensions characteristic information Include the characteristic information of multiple dimensions, the image recognition rule that the characteristic information of different dimensions uses is different;By the various dimensions Unstructured data in characteristic information is converted to structural data, and merges the feature letter of transformed the multiple dimension Breath, obtains fusion structure characteristic information;The video of the video to be identified is determined according to the fusion structure characteristic information Type.
Optionally, for the specific structure of the computer terminal in this embodiment, reference may be made to the terminal shown in Fig. 5, but the structure is not limited thereto.
The memory may be used to store software programs and modules, such as the program instructions/modules corresponding to the video type recognition method and apparatus in the embodiments of the present application. By running the software programs and modules stored in the memory, the processor executes various functional applications and data processing, that is, implements the above video type recognition method. The memory may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more magnetic storage devices, flash memories, or other non-volatile solid-state memories. In some examples, the memory may further include memories remotely located relative to the processor, and these remote memories may be connected to the computer terminal through a network. Examples of the network include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
By means of a transmission device, the processor may call the information and application programs stored in the memory to perform the following steps: obtaining the multi-dimensional characteristic information of frame images in a video to be identified, where the multi-dimensional characteristic information includes characteristic information of multiple dimensions, and the characteristic information of different dimensions is obtained using different image recognition rules; converting the unstructured data in the multi-dimensional characteristic information into structured data, and fusing the converted characteristic information of the multiple dimensions to obtain fused structured characteristic information; and determining the video type of the video to be identified according to the fused structured characteristic information.
Those skilled in the art will appreciate that the computer terminal may also be a terminal device such as a smart phone (for example, an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), or a PAD.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be completed by instructing hardware related to a terminal device through a program. The program may be stored in a computer-readable storage medium, and the storage medium may include a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
Embodiment 8
An embodiment of the present application further provides a storage medium. Optionally, in this embodiment, the storage medium may be used to store the program code executed by the video type recognition method provided in Embodiment 1 above.
Optionally, in this embodiment, the storage medium may be located in any computer terminal in a computer terminal group in a computer network, or in any mobile terminal in a mobile terminal group.
Optionally, in this embodiment, the storage medium is configured to store program code for executing the following steps: obtaining the multi-dimensional characteristic information of frame images in a video to be identified, where the multi-dimensional characteristic information includes characteristic information of multiple dimensions, and the characteristic information of different dimensions is obtained using different image recognition rules; converting the unstructured data in the multi-dimensional characteristic information into structured data, and fusing the converted characteristic information of the multiple dimensions to obtain fused structured characteristic information; and determining the video type of the video to be identified according to the fused structured characteristic information.
Optionally, in this embodiment, the storage medium is configured to store program code for executing the following step: for the characteristic information of each dimension in the multi-dimensional characteristic information, converting the characteristic information of that dimension from unstructured data into a characteristic value of the frame image according to a preset rule, where the characteristic value reflects the degree of similarity between the frame image and an image of a specified type.
Optionally, in this embodiment, the storage medium is configured to store program code for executing at least one of the following steps: determining the characteristic value according to a correspondence between the characteristic information and characteristic values; determining the characteristic value according to the number of characteristic parameters in the frame image; or determining the final characteristic value of the frame image according to the context information of the frame image and the current characteristic value of the frame image.
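The three conversion rules listed above can be sketched as follows in Python. The lookup table, the parameter-count saturation value, and the neighbour weights are hypothetical example values, not values taken from this application.

from typing import Optional, Sequence

# Rule 1: a direct correspondence between characteristic information and a value.
CORRESPONDENCE = {"no match": 0.0, "partial match": 0.5, "strong match": 1.0}


def value_by_correspondence(info: str) -> float:
    return CORRESPONDENCE.get(info, 0.0)


# Rule 2: a value derived from the number of characteristic parameters in the frame.
def value_by_parameter_count(num_parameters: int, saturation: int = 10) -> float:
    return min(num_parameters, saturation) / saturation


# Rule 3: a final value combining the frame's current value with its context,
# i.e. the characteristic values of the preceding and following frames.
def value_with_context(current: float,
                       previous: Optional[float],
                       following: Optional[float],
                       weights: Sequence[float] = (0.25, 0.5, 0.25)) -> float:
    neighbours = (previous if previous is not None else current,
                  current,
                  following if following is not None else current)
    return sum(w * v for w, v in zip(weights, neighbours))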
An embodiment of the present application further provides another storage medium, which is configured to store program code for executing the following steps: inputting a video to be identified into an image recognition model obtained by training in advance, and outputting the fused structured characteristic information of the multi-dimensional characteristic information, where the image recognition model is a model trained on the multi-dimensional characteristic information of frame images in videos to be identified, the multi-dimensional characteristic information includes characteristic information of multiple dimensions, and the characteristic information of different dimensions is obtained using different image recognition rules; and identifying the video type of the video to be identified according to the fused structured characteristic information.
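A minimal sketch of this model-based variant is given below, assuming the per-frame multi-dimensional characteristic information has already been converted to numeric form. The choice of scikit-learn's LogisticRegression, the mean-pooling over frames, and the 0.5 threshold are illustrative assumptions, not the model disclosed in this application.

import numpy as np
from sklearn.linear_model import LogisticRegression


def train_image_recognition_model(frame_features: np.ndarray, labels: np.ndarray):
    """frame_features: one row of numeric multi-dimensional features per training frame;
    labels: 1 if the frame comes from a video of the specified type, otherwise 0."""
    model = LogisticRegression()
    model.fit(frame_features, labels)
    return model


def recognize_video(model, video_frame_features: np.ndarray, threshold: float = 0.5) -> str:
    # Per-frame probability of the specified type, treated here as the fused
    # structured characteristic information produced by the trained model.
    frame_scores = model.predict_proba(video_frame_features)[:, 1]
    video_score = float(frame_scores.mean())
    return "specified type" if video_score > threshold else "other"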
The serial numbers of the above embodiments of the present application are for description only and do not imply any ranking of the embodiments.
In the above embodiments of the present application, the description of each embodiment has its own emphasis. For parts not described in detail in a given embodiment, reference may be made to the related descriptions of the other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division of logical functions; in actual implementation there may be other division manners, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, units, or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present application. It should be noted that those of ordinary skill in the art may make several improvements and modifications without departing from the principles of the present application, and such improvements and modifications should also be regarded as falling within the protection scope of the present application.

Claims (12)

1. A method for recognizing a video type, comprising:
obtaining multi-dimensional characteristic information of frame images in a video to be identified, wherein the multi-dimensional characteristic information comprises characteristic information of multiple dimensions, and the characteristic information of different dimensions is obtained using different image recognition rules;
converting unstructured data in the multi-dimensional characteristic information into structured data, and fusing the converted characteristic information of the multiple dimensions to obtain fused structured characteristic information; and
determining the video type of the video to be identified according to the fused structured characteristic information.
2. The method according to claim 1, wherein converting the unstructured data in the multi-dimensional characteristic information into structured data comprises:
for the characteristic information of each dimension in the multi-dimensional characteristic information, converting the characteristic information of that dimension from unstructured data into a characteristic value of the frame image according to a preset rule, wherein the characteristic value reflects the degree of similarity between the frame image and an image of a specified type.
3. The method according to claim 2, wherein converting the characteristic information of each dimension from unstructured data into the characteristic value of the frame image according to the preset rule comprises at least one of the following:
determining the characteristic value according to a correspondence between the characteristic information and characteristic values;
determining the characteristic value according to the number of characteristic parameters in the frame image; and
determining the final characteristic value of the frame image according to context information of the frame image and the current characteristic value of the frame image.
4. The method according to claim 3, wherein determining the final characteristic value of the frame image according to the context information of the frame image and the current characteristic value of the frame image comprises:
determining the final characteristic value of the frame image according to the characteristic values of the frames preceding and following the frame in which the frame image is located, the weights assigned to the preceding and following frames, and the current characteristic value of the frame image.
5. The method according to claim 2, wherein determining the video type of the video to be identified according to the fused structured characteristic information comprises:
determining the characteristic values of all frame images in the video to be identified according to the fused structured characteristic information;
determining the characteristic value of the video to be identified according to the characteristic values of all the frame images; and
when the characteristic value of the video to be identified is greater than a predetermined threshold, determining that the video type of the video to be identified is the specified type.
6. The method according to any one of claims 1 to 5, wherein the multi-dimensional characteristic information comprises at least one of the following:
characteristic information obtained based on a picture recognition model; characteristic information obtained based on text recognition in the picture; and the number of pictures in a preset type database whose similarity to the frame image exceeds a predetermined threshold.
7. A method for recognizing a video type, comprising:
inputting a video to be identified into an image recognition model obtained by training in advance, and outputting fused structured characteristic information of multi-dimensional characteristic information, wherein the image recognition model is a model trained on the multi-dimensional characteristic information of frame images in videos to be identified, the multi-dimensional characteristic information comprises characteristic information of multiple dimensions, and the characteristic information of different dimensions is obtained using different image recognition rules; and
identifying the video type of the video to be identified according to the fused structured characteristic information.
8. The method according to claim 7, wherein the structured characteristic information is obtained in the following manner:
for the characteristic information of each dimension in the multi-dimensional characteristic information, converting the characteristic information of that dimension from unstructured data into a characteristic value of the frame image according to a preset rule, and obtaining the characteristic value of the video to be identified according to the characteristic values of the frame images in the video to be identified, wherein the characteristic value of the frame image reflects the degree of similarity between the frame image and an image of a specified type.
9. A computer device, comprising:
an input interface, configured to receive multi-dimensional characteristic information of frame images in an input video to be identified, wherein the multi-dimensional characteristic information comprises characteristic information of multiple dimensions, and the characteristic information of different dimensions is obtained using different image recognition rules;
a processor, configured to convert unstructured data in the multi-dimensional characteristic information into structured data, fuse the converted characteristic information of the multiple dimensions to obtain fused structured characteristic information, and determine the video type of the video to be identified according to the fused structured characteristic information; and
an output interface, configured to output the video type.
10. A computer device for providing an interactive interface for interacting with a user, wherein the interactive interface comprises:
a first control, configured to display multi-dimensional characteristic information of frame images in a video to be identified, wherein the multi-dimensional characteristic information comprises characteristic information of multiple dimensions, and the characteristic information of different dimensions is obtained using different image recognition rules;
a second control, configured to display structured data converted from unstructured data in the multi-dimensional characteristic information;
a third control, configured to display structured characteristic information obtained by fusing the converted multi-dimensional characteristic information; and
a fourth control, configured to display the video type of the video to be identified that is recognized according to the fused structured characteristic information.
11. An apparatus for recognizing a video type, comprising:
an acquisition module, configured to obtain multi-dimensional characteristic information of frame images in a video to be identified, wherein the multi-dimensional characteristic information comprises characteristic information of multiple dimensions, and the characteristic information of different dimensions is obtained using different image recognition rules;
a conversion module, configured to convert unstructured data in the multi-dimensional characteristic information into structured data, and fuse the converted characteristic information of the multiple dimensions to obtain fused structured characteristic information; and
an identification module, configured to determine the video type of the video to be identified according to the fused structured characteristic information.
12. A method for recognizing a video type, comprising:
extracting structured characteristic information of the frame images of a video to be identified, the structured characteristic information reflecting the degree of similarity between the frame images and images of a specified type;
extracting text information in the video frames using picture character recognition, and converting the unstructured text information into first structured information using a dictionary transform method based on Gaussian weights;
extracting retrieval-library characteristics of the video frame sequence, by means of a retrieval technique, from an index database built from pictures of the specified type, and then converting the unstructured information into second structured information using the dictionary transform method based on Gaussian weights;
fusing the extracted structured characteristic information, the first structured information, and the second structured information to obtain fused structured characteristic information, and training with the fused structured characteristic information to obtain a final video recognition model; and
inputting the video to be identified into the video recognition model, and outputting a recognition result for the video to be identified.
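To illustrate the dictionary transform step of claim 12, the following Python sketch interprets "dictionary transform method based on Gaussian weights" as weighting each dictionary keyword by a Gaussian of its edit distance to the recognized text. This reading, the example dictionary, and the sigma value are assumptions made for illustration; the claim itself does not define the transform at this level of detail.

import math
from typing import List

DICTIONARY: List[str] = ["keyword_a", "keyword_b", "keyword_c"]  # hypothetical keywords


def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]


def gaussian_weight_transform(recognized_text: str, sigma: float = 2.0) -> List[float]:
    """Map unstructured recognized text to a structured vector over the dictionary:
    each component is a Gaussian weight of the edit distance to one keyword."""
    return [math.exp(-edit_distance(recognized_text, word) ** 2 / (2 * sigma ** 2))
            for word in DICTIONARY]

For example, gaussian_weight_transform("keyword_a") yields a weight of 1.0 for the matching keyword and smaller weights for the others.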
CN201710041149.8A 2017-01-17 2017-01-17 Video type identification method and device and computer terminal Active CN108319888B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710041149.8A CN108319888B (en) 2017-01-17 2017-01-17 Video type identification method and device and computer terminal

Publications (2)

Publication Number Publication Date
CN108319888A true CN108319888A (en) 2018-07-24
CN108319888B CN108319888B (en) 2023-04-07

Family

ID=62891701

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710041149.8A Active CN108319888B (en) 2017-01-17 2017-01-17 Video type identification method and device and computer terminal

Country Status (1)

Country Link
CN (1) CN108319888B (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101517611A (en) * 2006-09-20 2009-08-26 微软公司 Difference analysis for electronic data interchange (EDI) data dictionary
US20090100004A1 (en) * 2007-10-11 2009-04-16 Sybase, Inc. System And Methodology For Automatic Tuning Of Database Query Optimizer
CN101616264A (en) * 2008-06-27 2009-12-30 中国科学院自动化研究所 News video categorization and system
CN102542066A (en) * 2011-11-11 2012-07-04 冉阳 Video clustering method, ordering method, video searching method and corresponding devices
US20140161354A1 (en) * 2012-12-06 2014-06-12 Nokia Corporation Method and apparatus for semantic extraction and video remix creation
CN103870507A (en) * 2012-12-17 2014-06-18 阿里巴巴集团控股有限公司 Method and device of searching based on category
CN104199627A (en) * 2014-07-11 2014-12-10 上海交通大学 Gradable video coding system based on multi-scale online dictionary learning
CN104166685A (en) * 2014-07-24 2014-11-26 北京捷成世纪科技股份有限公司 Video clip detecting method and device
CN104331442A (en) * 2014-10-24 2015-02-04 华为技术有限公司 Video classification method and device
US20160283858A1 (en) * 2015-03-24 2016-09-29 International Business Machines Corporation Multimodal Data Fusion by Hierarchical Multi-View Dictionary Learning
CN105957517A (en) * 2016-04-29 2016-09-21 中国南方电网有限责任公司电网技术研究中心 Voice data structural transformation method based on open source API and system thereof
CN106250837A (en) * 2016-07-27 2016-12-21 腾讯科技(深圳)有限公司 The recognition methods of a kind of video, device and system

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
GUO-SEN XIE 等: "Hybrid CNN and Dictionary-Based Models for Scene Recognition and Domain Adaptation", 《IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY》 *
YAN YAN 等: "Event Oriented Dictionary Learning for Complex Event Detection", 《IEEE TRANSACTIONS ON IMAGE PROCESSING》 *
侯素娟: "面向视频理解的视频表征模型及应用研究", 《中国博士学位论文全文数据库》 *
刘守群: "海量网络视频快速检索关键技术研究", 《中国博士学位论文全文数据库》 *
谢世鹏: "基于稀疏表示的帧率提升算法", 《中国优秀硕士学位论文全文数据库》 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110969066A (en) * 2018-09-30 2020-04-07 北京金山云网络技术有限公司 Live video identification method and device and electronic equipment
CN110969066B (en) * 2018-09-30 2023-10-10 北京金山云网络技术有限公司 Live video identification method and device and electronic equipment
CN111340051A (en) * 2018-12-18 2020-06-26 北京京东尚科信息技术有限公司 Picture processing method and device and storage medium
CN111382605A (en) * 2018-12-28 2020-07-07 广州市百果园信息技术有限公司 Video content auditing method and device, storage medium and computer equipment
CN111382605B (en) * 2018-12-28 2023-08-18 广州市百果园信息技术有限公司 Video content auditing method, device, storage medium and computer equipment
CN111832522A (en) * 2020-07-21 2020-10-27 深圳力维智联技术有限公司 Construction method and system of face data set and computer readable storage medium
CN111832522B (en) * 2020-07-21 2024-02-27 深圳力维智联技术有限公司 Face data set construction method, system and computer readable storage medium
CN112073391A (en) * 2020-08-25 2020-12-11 深圳市安络科技有限公司 Method and device for monitoring network traffic
WO2023274058A1 (en) * 2021-06-28 2023-01-05 International Business Machines Corporation Privacy-protecting multi-pass street-view photo-stitch
US11558550B1 (en) 2021-06-28 2023-01-17 International Business Machines Corporation Privacy-protecting multi-pass street-view photo-stitch
CN114092949A (en) * 2021-11-23 2022-02-25 支付宝(杭州)信息技术有限公司 Method and device for training class prediction model and identifying interface element class

Also Published As

Publication number Publication date
CN108319888B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN108319888A (en) The recognition methods of video type and device, terminal
CN110705301B (en) Entity relationship extraction method and device, storage medium and electronic equipment
CN108416370A (en) Image classification method, device based on semi-supervised deep learning and storage medium
WO2020253506A1 (en) Contract content extraction method and apparatus, and computer device and storage medium
CN112183672A (en) Image classification method, and training method and device of feature extraction network
CN113392641A (en) Text processing method, device, storage medium and equipment
CN111709398A (en) Image recognition method, and training method and device of image recognition model
CN109214333A (en) Convolutional neural networks structure, face character recognition methods, device and terminal device
CN115374189B (en) Block chain-based food safety tracing method, device and equipment
CN112988963A (en) User intention prediction method, device, equipment and medium based on multi-process node
CN109194689A (en) Abnormal behaviour recognition methods, device, server and storage medium
CN108573255A (en) The recognition methods of word composograph and device, image-recognizing method
CN109447129A (en) A kind of multi-mode Emotion identification method, apparatus and computer readable storage medium
CN110399547A (en) For updating the method, apparatus, equipment and storage medium of model parameter
CN110334185A (en) The treating method and apparatus of data in a kind of platform
CN112434746B (en) Pre-labeling method based on hierarchical migration learning and related equipment thereof
CN109597987A (en) A kind of text restoring method, device and electronic equipment
CN113204643A (en) Entity alignment method, device, equipment and medium
CN116881427A (en) Question-answering processing method and device, electronic equipment and storage medium
CN111931503A (en) Information extraction method and device, equipment and computer readable storage medium
CN115168609A (en) Text matching method and device, computer equipment and storage medium
CN111860606B (en) Image classification method, device and storage medium
CN113886547A (en) Client real-time conversation switching method and device based on artificial intelligence and electronic equipment
CN114238622A (en) Key information extraction method and device, storage medium and electronic device
CN109933969B (en) Verification code identification method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant