CN107622333A - Event prediction method, apparatus and system
- Publication number: CN107622333A (application CN201711064205.6A)
- Authority: CN (China)
- Legal status: Granted (as listed by Google Patents; the listed status is an assumption and not a legal conclusion)
Abstract
This application discloses an event prediction method, apparatus, and system. The method includes: obtaining text data from social network data; vectorizing the text data to obtain a feature vector corresponding to the text data; and inputting the feature vector into a pre-established classification model, which uses the feature vector of the text data as its feature to determine the probability that the text data corresponds to a suspicious event. By crawling massive amounts of social network data and applying natural language processing to the text data in it, the application finds the feature vectors that act as key influencing factors and makes predictions on the text data based on those features, with the aim of accurately predicting suspicious events.
Description
Technical field
The application relates to the field of computer technology, and in particular to an event prediction method, apparatus, and system.
Background
With the development of Internet technology, the means used in crime and terrorist attacks have become increasingly sophisticated. Many terrorist organizations are active on the Internet and use it to organize and plan terrorist attacks.
In the prior art, follow-up reassurance work is usually carried out only after an event such as a crime or terrorist attack has occurred, based on an analysis of netizens' sentiment. For example, after an event occurs, the relevant departments study the public opinion data centered on that event and analyze public sentiment based on it. However, this approach of dealing with an event only after it has occurred cannot prevent the event.
Accordingly, a scheme that can prevent such events from occurring is needed.
Summary of the invention
The embodiments of the present application provide an event prediction method, apparatus, and system to solve the problem that the prior art cannot predict the occurrence of events.
The embodiments of the present application provide an event prediction method, including:
obtaining text data from social network data;
vectorizing the text data to obtain a feature vector corresponding to the text data;
inputting the feature vector into a pre-established classification model, where the classification model uses the feature vector of the text data as its feature to determine the probability that the text data corresponds to a suspicious event.
Optionally, before obtaining the text data from the social network data, the method further includes:
obtaining social network data from a social network;
converting the non-text data among the unstructured data in the social network data into text data.
Optionally, vectorizing the text data to obtain the feature vector corresponding to the text data includes:
vectorizing the words in the text data to obtain the word vectors corresponding to the words;
determining the feature vector corresponding to the text data according to the word vectors corresponding to the words in the text data.
Optionally, vectorizing the words in the text data to obtain the word vectors corresponding to the words includes:
training on the words in the text data with a deep text representation model, and obtaining the word vectors output by the deep text representation model.
Optionally, obtaining the feature vector corresponding to the text data according to the word vectors corresponding to the words in the text data includes:
averaging the word vectors corresponding to the words in the text data, and using the result as the feature vector corresponding to the text data.
Optionally, before inputting the feature vector as a feature into the pre-established classification model, the method further includes:
obtaining user behavior data associated with the text data in the social network data;
performing feature selection on the user behavior data to obtain corresponding feature variables;
where inputting the feature vector as a feature into the pre-established classification model includes:
inputting the feature vector and the feature variables as features into the pre-established classification model.
Optionally, performing feature selection on the user behavior data to obtain the corresponding feature variables includes:
determining the variables in the user behavior data;
scoring the variables based on a predetermined feature selection method to determine the degree of influence of each variable on the event corresponding to the text data;
selecting, from the variables in the user behavior data, the variables whose degree of influence meets a predetermined condition as feature variables.
Optionally, the predetermined feature selection method is at least one of a filter feature selection method, a wrapper feature selection method, and an embedded feature selection method.
Optionally, the user behavior data includes entity data and/or label data, where the entity data represents the set of data related to the text data, and the label data represents the labels corresponding to the text data or to words in the text data, together with the data corresponding to those labels.
Optionally, after obtaining the prediction result output by the classification model, the method further includes:
determining, according to the prediction result, the suspicious probability of an entity related to the text data.
Optionally, before inputting the feature vector as a feature into the pre-established classification model, the method further includes:
obtaining sample data, the sample data including sample events and the text data and/or user behavior data corresponding to the sample events;
vectorizing the text data to obtain the feature vector corresponding to the text data; and/or performing feature selection on the user behavior data to obtain the feature variables corresponding to the user behavior data;
establishing the classification model with the feature vectors and/or feature variables corresponding to the sample events as features.
Optionally, the classification model is at least one of a Bayes-based classification model, a support vector machine (SVM)-based classification model, a convolutional neural network (CNN)-based classification model, and a recurrent neural network (RNN)-based classification model.
The embodiments of the present application also provide an event prediction apparatus, including:
a first acquisition unit, configured to obtain text data from social network data;
a first processing unit, configured to vectorize the text data to obtain a feature vector corresponding to the text data;
a second processing unit, configured to input the feature vector into a pre-established classification model, where the classification model uses the feature vector of the text data as its feature to determine the probability that the text data corresponds to a suspicious event.
Optionally, the apparatus further includes:
a second acquisition unit, configured to obtain user behavior data associated with the text data in the social network data;
where the first processing unit is further configured to perform feature selection on the user behavior data to obtain corresponding feature variables;
the second processing unit is further configured to input the feature vector and the feature variables as features into the pre-established classification model.
The embodiments of the present application also provide an event prediction system, including a data warehouse, a Kafka cluster, and a Storm cluster, where:
the data warehouse is configured to store social network data and to provide the social network data to the producers of the Kafka cluster;
the Kafka cluster is configured to preprocess the social network data to extract the text data and/or user behavior data in the social network data;
the Storm cluster is configured to invoke the event prediction apparatus of claim 13 or 14 to consume the text data and/or user behavior data in the Kafka cluster and to output the probability of corresponding to a suspicious event.
The embodiments of the present application also provide an event prediction apparatus, including a memory and a processor, where:
the memory is configured to store a program;
the processor is configured to execute the program stored in the memory, and specifically to perform:
obtaining text data from social network data;
vectorizing the text data to obtain a feature vector corresponding to the text data;
inputting the feature vector into a pre-established classification model, where the classification model uses the feature vector of the text data as its feature to determine the probability that the text data corresponds to a suspicious event.
The embodiments of the present application also provide a computer-readable storage medium storing one or more programs that, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the following method:
obtaining text data from social network data;
vectorizing the text data to obtain a feature vector corresponding to the text data;
inputting the feature vector into a pre-established classification model, where the classification model uses the feature vector of the text data as its feature to determine the probability that the text data corresponds to a suspicious event.
At least one of the above technical schemes adopted by the embodiments of the present application can achieve the following beneficial effects:
By crawling massive amounts of social network data and applying natural language processing to the text data in it, the feature vectors that act as key influencing factors are found and used as the input of the classification model to make predictions on the text data, with the aim of accurately predicting suspicious events.
Brief description of the drawings
The accompanying drawings described here are provided for further understanding of the present application and form a part of it. The illustrative embodiments of the present application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a schematic flowchart of an event prediction method provided by Embodiment 1 of the present application;
Fig. 2 is a schematic flowchart of an event prediction method provided by Embodiment 2 of the present application;
Fig. 3 is a schematic flowchart of an event prediction method provided by Embodiment 3 of the present application;
Fig. 4 is a schematic diagram of the deep text representation model word2vec provided by Embodiment 3 of the present application;
Fig. 5 is a schematic diagram of the recurrent neural network (RNN) provided by Embodiment 3 of the present application;
Fig. 6 is a schematic structural diagram of the event prediction apparatus provided by Embodiment 4 of the present application;
Fig. 7 is a schematic structural diagram of the event prediction apparatus provided by Embodiment 5 of the present application;
Fig. 8 is a schematic structural diagram of the event prediction system provided by Embodiment 6 of the present application;
Fig. 9 is a schematic structural diagram of an electronic device provided by Embodiment 7 of the present application.
Detailed description
To make the purpose, technical schemes, and advantages of the present application clearer, the technical schemes of the application are described clearly and completely below with reference to specific embodiments and the corresponding drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
The technical schemes provided by the embodiments of the application are described in detail below with reference to the drawings.
Embodiment 1
Fig. 1 is a schematic flowchart of an event prediction method provided by Embodiment 1 of the present application. Referring to Fig. 1, the method may specifically include the following steps:
Step 120: obtain text data from social network data.
It should be noted that one implementation of step 120 may be:
First, social network data is crawled from a social platform. The crawled social network data is then preprocessed (conversion, cleaning, parsing, classification, and so on) to distinguish the structured data from the unstructured data, and to identify the text data within the unstructured data. When event prediction is performed, the corresponding text data is obtained from it.
Another implementation of step 120 may be:
First, social network data is crawled from a social platform. The crawled data is then preprocessed (conversion, cleaning, parsing, classification, and so on) to distinguish the structured data from the unstructured data, and the text data from the non-text data within the unstructured data. Next, the non-text data in the unstructured data, such as pictures, audio, and video, is converted into text data. When event prediction is performed, the corresponding text data is obtained from it. The techniques used to convert non-text data into text data include existing techniques for recognizing WeChat audio as text, or for converting the subtitle file of a video into text data.
In addition, the social platform mentioned in the two implementations above may be WeChat, QQ, Twitter, Facebook, and so on; the tool for crawling social network data may be a web crawler or similar; and the text data may specifically be a dialogue, a document, a notice, and so on. Correspondingly, each piece of text data corresponds to an event.
Step 140: vectorize the text data to obtain the feature vector corresponding to the text data.
It should be noted that, because the feature vector is the input of the classification model, whether the determined feature vector can represent the text data directly affects the prediction result output by the model. Based on this, one implementation of step 140 may be:
First, the words in the text data are vectorized to obtain the word vector corresponding to each word. Then the feature vector corresponding to the text data is obtained from the word vectors corresponding to the words in the text data.
In this implementation, the vectorization may use the deep text representation model word2vec to train on the words in the text data and obtain the word vectors output by the model. The word vectors corresponding to the words in the text data are then averaged, and the resulting vector is used as the feature vector corresponding to the text data. The core idea of the deep text representation model is that, through training, the processing of text data is reduced to vector operations in a K-dimensional vector space, and similarity in the vector space can be used to represent similarity in text semantics.
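For illustration only, the idea that similarity in the vector space stands in for semantic similarity can be sketched as follows. This is a minimal sketch that assumes cosine similarity as the measure (the application does not name one); the function name and the three-dimensional vectors are hypothetical and are not real word2vec output.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

# Hypothetical 3-dimensional word vectors (K = 3), invented for this sketch.
vectors = {
    "pistol": [0.9, 0.1, 0.0],
    "rifle": [0.8, 0.2, 0.1],
    "water": [0.0, 0.1, 0.9],
}
sim_weapons = cosine_similarity(vectors["pistol"], vectors["rifle"])
sim_unrelated = cosine_similarity(vectors["pistol"], vectors["water"])
```

With these invented vectors, `sim_weapons` is larger than `sim_unrelated`, which is the behaviour expected of word vectors trained on a suitable corpus.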
Step 160: input the feature vector into the pre-established classification model, where the classification model uses the feature vector of the text data as its feature to determine the probability that the text data corresponds to a suspicious event.
It should be noted that, before steps 120 to 160 are performed, a model-establishing step also needs to be performed, which may specifically include the following:
First, sample data is obtained, including sample events and the text data corresponding to the sample events. The text data is vectorized to obtain the corresponding feature vectors, and the classification model is established with the feature vectors corresponding to the sample events as features. New text data is then predicted based on the established classification model.
It is understandable that the sample events include positive samples and negative samples. A positive sample is an event related to a suspicious event, such as a terrorist incident; correspondingly, its text data may be the conversation content of terrorists, action planning information, crime routes, and so on.
To improve the accuracy of the established model, the embodiment of the present application uses the feature vectors corresponding to the text data as features and, by way of example, establishes four classification models: a Bayes-based classification model, a support vector machine (SVM)-based classification model, a convolutional neural network (CNN)-based classification model, and a recurrent neural network (RNN)-based classification model.
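As one possible illustration of a Bayes-based classification model that maps a feature vector to a probability, the sketch below fits a Gaussian naive Bayes model in plain Python. It is an assumption-laden sketch and not the model of the application: the Gaussian likelihood, the function names, and the two-dimensional training vectors are all hypothetical.

```python
import math

def fit_gaussian_nb(X, y):
    """Estimate per-class prior, mean and variance for each feature."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        dim = len(rows[0])
        means = [sum(r[i] for r in rows) / len(rows) for i in range(dim)]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((r[i] - means[i]) ** 2 for r in rows) / len(rows) + 1e-6
            for i in range(dim)
        ]
        model[label] = (len(rows) / len(X), means, variances)
    return model

def predict_proba(model, x, positive=1):
    """Posterior probability that feature vector x belongs to the positive class."""
    log_scores = {}
    for label, (prior, means, variances) in model.items():
        log_p = math.log(prior)
        for xi, m, v in zip(x, means, variances):
            log_p += -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        log_scores[label] = log_p
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    exp_scores = {k: math.exp(s - top) for k, s in log_scores.items()}
    return exp_scores[positive] / sum(exp_scores.values())

# Hypothetical 2-dimensional text feature vectors; label 1 = suspicious event.
X_train = [[0.9, 0.8], [1.0, 0.9], [0.8, 1.0], [0.1, 0.2], [0.0, 0.1], [0.2, 0.0]]
y_train = [1, 1, 1, 0, 0, 0]
nb_model = fit_gaussian_nb(X_train, y_train)
p_suspicious = predict_proba(nb_model, [0.85, 0.9])
p_benign = predict_proba(nb_model, [0.1, 0.1])
```

The value returned by `predict_proba` plays the role of the probability that the text data corresponds to a suspicious event; in practice the feature vectors would be the averaged word vectors described in step 140.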
It can be seen that the embodiment of the present application crawls massive amounts of social network data and applies natural language processing to the text data in it based on a deep text representation model, in order to find the feature vectors that act as key influencing factors. The feature vectors are then used as the input of a model established based on deep learning to make predictions on the text data, with the aim of accurately predicting suspicious events.
Embodiment 2
Fig. 2 is a schematic flowchart of an event prediction method provided by Embodiment 2 of the present application. Referring to Fig. 2, the method may specifically include the following steps:
Step 220: obtain the text data in the social network data and the user behavior data associated with the text data.
It should be noted that social network data includes structured data and unstructured data. User behavior data belongs to the structured data and includes entity data and/or label data. The entity data represents the set of data related to the text data, and the label data represents the labels corresponding to the text data or to words in the text data, together with the data corresponding to those labels. Taking a dialogue as the text data, examples of the corresponding entity data are the entities participating in the dialogue, the scene and time of the dialogue, and behavioral data such as the web browsing, searches, and clicks performed by participating entity (person) A. Examples of label data are article B corresponding to a word that appears in the dialogue, the category to which article B belongs, and the properties of that category, such as whether it is contraband.
Step 240: vectorize the text data to obtain the feature vector corresponding to the text data.
It should be noted that step 240 is similar to step 140 in Embodiment 1, so step 240 is not described further here.
Step 260: perform feature selection on the user behavior data to obtain the corresponding feature variables.
It should be noted that one implementation of step 260 may be:
First, the variables in the user behavior data are determined; the variables may be the participants, the time, the place, the articles involved, and so on. Then the variables are scored based on a predetermined feature selection method to determine the degree of influence of each variable on the event corresponding to the text data. Finally, the variables whose degree of influence meets a predetermined condition are selected from the variables in the user behavior data as feature variables.
In this implementation, the smaller the degree of influence of a variable, the smaller its influence on the user behavior data is considered to be. For example, the article "water" appearing in the data is generally not considered to have any relation to a suspicious event, so its influence is small; the opposite holds for "pistol", "antitank grenade", "firearm model", and so on.
To improve the accuracy of the selected feature variables, the feature selection method proposed by the embodiment of the present application may specifically be at least one of the filter, wrapper, and embedded feature selection methods. The selection process includes: (1) using the filter method to score the variables by computing correlation coefficients and chi-square values; (2) using recursive forward selection based on a decision tree algorithm to score each variable; (3) performing variable selection by lasso regression plus a decision tree, introducing a penalty term that compresses the coefficients of some variables to 0. One or more of the results of these three methods are reasonably chosen to obtain the feature variables that the final model needs to adopt. For example, the variables in a piece of text data include sex, age, geographic location, Internet access device, online duration, number of message hops, and so on; with the feature selection methods above, the factors that significantly affect the suspicious probability of the text data can be selected from them as feature variables.
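The effect of the penalty term in item (3), which compresses some coefficients to exactly 0, can be sketched with a small lasso solver. This is a minimal sketch under stated assumptions: it uses cyclic coordinate descent on the objective 0.5 * ||y - Xw||^2 + lam * ||w||_1, omits the decision tree part, and the data, function names, and the value of `lam` are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise 0.5 * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            # Correlation of column j with the residual that ignores column j.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(w[k] * X[i][k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            z = sum(X[i][j] ** 2 for i in range(n))  # assumes a non-zero column
            w[j] = soft_threshold(rho, lam) / z
    return w

# Hypothetical data: the target depends only on the first variable.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
weights = lasso_coordinate_descent(X, y, lam=1.0)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
```

In this toy data the second variable is unrelated to the target, so its coefficient is driven to 0 and only the first variable remains in `selected`; variables with non-zero coefficients are the ones kept as feature variables.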
The principle of each feature selection method is as follows:
Filter method: for continuous variables, variance-based selection can be used to keep the variables whose variance exceeds a certain threshold, and the correlation coefficient between each feature variable and the target variable can be computed. When the feature variable and the target variable are both qualitative variables, the chi-square test or mutual information can be used to characterize the correlation between the variables.
Wrapper method: the quality of a feature subset is evaluated by the performance of a learning algorithm. The wrapper method needs to train a learner and selects the feature subset according to the learner's performance; the available algorithms include decision trees, neural networks, KNN, and so on.
Embedded method: the embedded method integrates the feature selection algorithm with the learning algorithm, for example variable selection based on lasso and variable selection based on tree models.
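The filter method for continuous variables can be sketched as follows. This is only an illustrative sketch: the variable names, the sample values, and both thresholds are hypothetical, and the Pearson correlation coefficient is assumed as the correlation measure.

```python
import math

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def filter_select(columns, target, min_variance, min_abs_corr):
    """Keep variables that pass both the variance and the correlation threshold."""
    kept = []
    for name, values in columns.items():
        if variance(values) > min_variance and abs(pearson(values, target)) >= min_abs_corr:
            kept.append(name)
    return kept

# Hypothetical continuous variables for six messages; target 1 = suspicious.
columns = {
    "message_hops": [9.0, 8.0, 7.0, 2.0, 1.0, 3.0],
    "online_hours": [4.0, 5.0, 4.0, 5.0, 4.0, 5.0],
    "constant_flag": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
}
target = [1, 1, 1, 0, 0, 0]
selected = filter_select(columns, target, min_variance=0.01, min_abs_corr=0.5)
```

Here `constant_flag` fails the variance threshold and `online_hours` is only weakly correlated with the target, so only `message_hops` is kept. For qualitative variables, a chi-square test or mutual information would replace the correlation coefficient.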
Step 280: input the feature vector and the feature variables as features into the pre-established classification model, and obtain the prediction result output by the classification model, where the prediction result represents the probability that the text data corresponds to a suspicious event.
It should be noted that, similar to the description of step 160 in Embodiment 1, a step of establishing the classification model also needs to be performed before step 280, which may specifically be:
Sample data is obtained, including sample events and the text data and/or user behavior data corresponding to the sample events. The text data is vectorized to obtain the feature vector corresponding to the text data; and/or feature selection is performed on the user behavior data to obtain the feature variables corresponding to the user behavior data. The classification model is established with the feature vectors and/or feature variables corresponding to the sample events as features.
In addition, after the prediction of the suspicious event is completed, the application can further predict the suspicious entities corresponding to the suspicious event, specifically as follows:
After the probability that the text data corresponds to a suspicious event is determined, if the probability meets a predetermined standard, the related data of the entities (persons) involved in the text data is predicted, so as to further uncover criminal groups and the like and further improve the effect of preventing events. The related data of an entity may be its basic information, its related social data, its domestic and overseas activities, and so on. The analysis of a suspicious entity can build on the prediction over its social data and further analyze its activities, whereabouts, and so on, in order to predict how suspicious the entity is from multiple dimensions.
It can be seen that the embodiment of the present application considers features from two angles, the feature vector of the text information and the feature variables of the corresponding user behavior data, to predict suspicious events, further improving prediction accuracy on the basis of Embodiment 1.
Embodiment 3
Fig. 3 is a schematic flowchart of an event prediction method provided by Embodiment 3 of the present application. Referring to Fig. 3, the application is described in detail below from the perspective of an example:
Step 320: crawl social network data from a social platform.
The social network data is preprocessed to obtain training data, which includes message text (text data), entity data, and label data. The preprocessing process has been described in Embodiments 1 and 2 and is not repeated here.
Step 340: vectorized representation of the message text.
The representation of word vectors runs through the most important tasks in natural language processing. To better complete most natural language processing tasks, the similarity and difference between words need to be defined. This embodiment uses word2vec to train word vectors. word2vec has two basic models, the CBOW word vector model and the Skip-gram word vector model. Referring to Fig. 4, the process of computing word vectors is explained below using Skip-gram as an example:
The Skip-gram model is a three-layer neural network. A single word w(t) is the input of the model; it passes through the hidden layer and finally reaches the softmax layer, which yields the probabilities of the context words w(t-2), w(t-1), w(t+1), and w(t+2) of that word. The corresponding hidden-layer weights are taken as the word vector of word w(t).
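The training pairs that a Skip-gram model learns from, a center word w(t) paired with each of its context words w(t-2) to w(t+2), can be sketched as below. The function name and the placeholder tokens are hypothetical; the window of 2 follows the description above.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token and its neighbours."""
    pairs = []
    for t, center in enumerate(tokens):
        lo = max(0, t - window)
        hi = min(len(tokens), t + window + 1)
        for j in range(lo, hi):
            if j != t:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

# Placeholder tokens standing in for the words of one message text.
tokens = ["w1", "w2", "w3", "w4", "w5"]
pairs = skipgram_pairs(tokens, window=2)
context_of_w3 = [context for center, context in pairs if center == "w3"]
```

For the middle token `w3`, the context words are `w1`, `w2`, `w4`, and `w5`. The network is then trained to assign high softmax probability to these context words given the center word, and the learned hidden-layer weights serve as the word vectors.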
Word vectors are trained with word2vec on a terrorism-related corpus. However, the number of words differs from one message text to another. For example, the sentence "I have a book." has 4 words, each represented by its own word vector, so there are 4 word vectors. To represent the sentence with a single vector, the simple average of the 4 word vectors can be taken. In this way, each message text is represented by one vector, which makes it easier to build the classification model afterwards.
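The averaging of the four word vectors can be sketched as follows. The two-dimensional vectors in `lookup` are hypothetical values chosen to keep the arithmetic easy to follow; real word2vec vectors would typically have many more dimensions.

```python
def average_vectors(word_vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    if not word_vectors:
        raise ValueError("need at least one word vector")
    dim = len(word_vectors[0])
    return [sum(vec[i] for vec in word_vectors) / len(word_vectors) for i in range(dim)]

# Hypothetical 2-dimensional vectors for the four words of "I have a book".
lookup = {
    "I": [1.0, 0.0],
    "have": [0.0, 1.0],
    "a": [1.0, 1.0],
    "book": [2.0, 2.0],
}
message = "I have a book".split()
message_vector = average_vectors([lookup[w] for w in message])
print(message_vector)  # [1.0, 1.0]
```

Whatever the number of words in a message, the result has the same dimension as a single word vector, which is what allows every message text to be fed to the same classification model.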
Step 360: variable selection for the entity data and label data.
Because the entity data and label data have high dimensionality, feature engineering methods are needed to choose the variables that have a significant impact on whether an event is suspicious, so that the model performs as well as possible. The entity data and label data form the feature matrix to be input, and whether the event is suspicious forms the target vector to be input. Continuous variables need to be standardized and normalized; categorical variables need dummy-variable encoding; and some missing values need to be handled by interpolation. Common feature selection methods include the filter method, the wrapper method, the embedded method, and dimensionality reduction. Because this patent involves a large amount of social and user behavior data, feature selection is required.
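The preprocessing mentioned above (filling missing values, standardizing continuous variables, dummy-encoding categorical variables) can be sketched as follows. This is a minimal sketch: mean imputation is assumed as the fill-in strategy, min-max normalization is omitted for brevity, and the column names and values are hypothetical.

```python
import math

def impute_mean(values):
    """Replace None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def standardize(values):
    """Rescale to zero mean and unit standard deviation (assumes a non-constant column)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]

def one_hot(values):
    """Dummy-encode a categorical column; categories are sorted for a stable order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values], categories

# Hypothetical raw columns; None marks a missing value.
ages = impute_mean([20.0, None, 40.0])
ages_scaled = standardize(ages)
weapon_rows, weapon_categories = one_hot(["firearm", "explosive", "firearm"])
# One feature-matrix row per event: scaled age followed by the dummy variables.
feature_matrix = [[a] + row for a, row in zip(ages_scaled, weapon_rows)]
```

Each row of `feature_matrix` is the structured part of the model input for one event; the target vector would hold, for each row, whether the event is suspicious.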
In addition, the variable selection process corresponds to the description of step 260 in Embodiment 2 and is not repeated here.
Step 380: establish the classification model.
After the message text vectors and the selected features are obtained, a binary classification model can be established. Because the positive and negative classes of the target variable are extremely imbalanced, the imbalanced data needs to be handled; common methods include oversampling, SMOTE, and so on. Next, models such as random forest, logistic regression, and SVM are tried: the data set is first split into a training set and a test set, the models are trained with sklearn, and the accuracy obtained under the different models is compared. Model evaluation methods include hold-out, cross validation, TPR, TNR, and so on.
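The idea behind SMOTE, creating synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours, can be sketched in plain Python. This is a simplified illustration and not the reference SMOTE implementation; the function name, the parameter defaults, and the sample points are hypothetical.

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (squared Euclidean distance).
        others = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(others)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical minority-class (positive) feature vectors.
positives = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]
new_points = smote_like(positives, n_new=5)
balanced_positives = positives + new_points
```

Because every synthetic point lies on a segment between two real minority samples, the positive class grows without simply duplicating existing rows. Oversampling of this kind would normally be applied to the training set only, so that the test set still reflects the original class distribution.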
Besides conventional machine learning classification models, this patent also tries classifying the text with deep learning models. Unlike a traditional feed-forward neural network, an RNN introduces directed cycles and can handle dependencies between successive inputs.
Referring to Fig. 5, an RNN is used to process sequence data. In a traditional neural network, the nodes within each layer are not connected to one another, but in natural language processing the words in a sentence are not independent of those before and after them. An RNN can remember preceding information and apply it to the computation of the current output: the nodes of the hidden layer are no longer unconnected but connected, and the input of the hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous time step. Whether a message text is suspicious is determined based on the RNN.
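The recurrence described above, in which the hidden layer receives both the current input and its own output from the previous time step, can be sketched as a forward pass. This is a minimal sketch of a simple (Elman-style) RNN with a sigmoid output; the tanh activation, the weight values, and all names are assumptions, and the weights are untrained.

```python
import math

def rnn_forward(inputs, W_x, W_h, b, w_out, b_out):
    """Run a simple RNN over a sequence of vectors and return a probability."""
    hidden_size = len(b)
    h = [0.0] * hidden_size  # initial hidden state
    for x in inputs:
        # New hidden state depends on the current input and the previous hidden state.
        h = [
            math.tanh(
                sum(W_x[i][j] * x[j] for j in range(len(x)))
                + sum(W_h[i][j] * h[j] for j in range(hidden_size))
                + b[i]
            )
            for i in range(hidden_size)
        ]
    logit = sum(w * hi for w, hi in zip(w_out, h)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid

# Hypothetical, untrained weights: 2-dimensional word vectors, 2 hidden units.
W_x = [[0.5, -0.2], [0.1, 0.4]]
W_h = [[0.3, 0.0], [0.0, 0.3]]
b = [0.0, 0.0]
w_out, b_out = [1.0, -1.0], 0.0
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
probability = rnn_forward(sequence, W_x, W_h, b, w_out, b_out)
```

Feeding the same word vectors in a different order generally yields a different output, which is the property that distinguishes this model from one that only sees the averaged vector.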
The established model is used to make predictions on new message texts, helping intelligence analysts make decisions so that terrorist attacks can be guarded against in time.
Step 3100, modelling effect analysis
First, selected using above-mentioned feature selection approach to the whether suspicious variable having a significant impact of event, it is main
Including:Predominantly structural data, including spot, weapon type, target, the spot historical events number
Amount etc..Term vector is trained by word2vec, and text classification is carried out with the term vector trained.
Totally 60000, model training sample, wherein terrorist incident 120, are consequently belonging to classification height unbalanced data.
In modeling process, class imbalance problem is adjusted using SMOTE algorithms.Two scenes are attempted respectively:1. merely with term vector
Message-text is classified as aspect of model variable.2. some structural data variables are additionally included, with term vector in the lump
As feature input model.Above-mentioned two scene is attempted respectively to establish machine learning and deep learning model, uses cross validation
Method carries out model selection, and computation model overall accuracy, accuracy rate, recall rate are as a result as follows respectively:
Using only word vectors as model input features:
First, short messages are classified using only word vectors as feature variables. Two machine learning models (naive Bayes and support vector machine, SVM) and two deep learning models (convolutional neural network, CNN, and recurrent neural network, RNN) are tried. One third of the positive-class samples and one third of the negative-class samples are randomly selected as a test set for model evaluation. The positive class corresponds to terrorist events and their related message text data, and the negative class corresponds to non-terrorist events and their related message text data. The model accuracy, true positive rate (TPR) and true negative rate (TNR) are computed. Because the positive and negative classes are highly imbalanced, TPR and TNR are considered together, and the G-means is computed as the final evaluation criterion.
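The evaluation criterion above can be sketched as follows, assuming the usual definition of the G-mean as the geometric mean of TPR and TNR. The function name and the label encoding (1 for the positive class, 0 for the negative class) are illustrative assumptions.

```python
import math

def g_mean(y_true, y_pred):
    """Return (TPR, TNR, G-mean) for binary labels, 1 = positive, 0 = negative."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # recall on the positive class
    tnr = tn / (tn + fp) if (tn + fp) else 0.0  # recall on the negative class
    return tpr, tnr, math.sqrt(tpr * tnr)

# A classifier that predicts "negative" for everything looks accurate on
# imbalanced data, yet its G-mean is zero because TPR is zero.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * len(labels)
tpr, tnr, g = g_mean(labels, always_negative)
```

This illustrates why plain accuracy is a poor criterion here: with 120 positive samples among 60000, a model that never flags an event would still score very high accuracy.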
Table 1: Model accuracy using word vectors as features
As the above results show, when the classification accuracy of both the positive and negative classes is considered, the SVM has poorer accuracy, while the other three models perform well.
Combining feature variables as model input features:
Next, some structured data, such as place of occurrence, weapon type and target, are combined with the word vectors to classify the short text, in order to further improve model accuracy. The results are shown in the following table:
Table 2: Model accuracy after adding event features
The above results show that adding variables that describe event features improves model performance slightly. After an overall comparison, the RNN model that includes the event feature variables was chosen as the final classification model.
Step 3120: predicting new events based on the classification model
The prediction of new events is similar to the description in Embodiments 1 and 2 and is therefore not repeated here.
It should be noted that the steps of the methods provided in Embodiments 1-3 may all be executed by the same device, or the method may be executed by different devices. For example, steps 120 and 140 may be executed by device 1 and step 160 by device 2; as another example, step 120 may be executed by device 1 and steps 140 and 160 by device 2; and so on.
In addition, for brevity, the above method embodiments are each described as a series of combined actions, but those skilled in the art should understand that embodiments of the present invention are not limited by the described order of actions, because according to embodiments of the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions involved are not necessarily required by embodiments of the present invention.
Embodiment 4
Fig. 6 is a structural diagram of the event prediction device provided in Embodiment 4 of the present application. Referring to Fig. 6, the device includes: a first acquisition unit 61, a first processing unit 62 and a second processing unit 63, wherein:
the first acquisition unit 61 is configured to obtain text data from social network data;
the first processing unit 62 is configured to vectorize the text data to obtain the feature vector corresponding to the text data;
the second processing unit 63 is configured to input the feature vector to a pre-established classification model, the classification model being used to determine, with the feature vector of the text data as the feature, the probability that the text data corresponds to a suspicious event.
The working principle of the first processing unit 62 is briefly described as follows:
The first processing unit 62 is configured to vectorize the words in the text data to obtain the word vector corresponding to each word, and to obtain the feature vector corresponding to the text data from the word vectors of the words in the text data. Specifically, the words in the text data are trained with a deep text representation model, and the word vectors output by the model are obtained. The word vectors corresponding to the words in the text data are then averaged, and the resulting vector is used as the feature vector corresponding to the text data.
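The averaging step can be sketched as below. The sketch assumes the word vectors have already been trained and are available as a plain dictionary; the function name, the `dim` parameter, and the choice to skip out-of-vocabulary words and return a zero vector for empty input are assumptions made for illustration only.

```python
def text_feature_vector(tokens, word_vectors, dim):
    """Average the word vectors of the tokens in one message.

    tokens       -- list of words from the segmented message text
    word_vectors -- dict mapping a word to its vector (list of floats)
    dim          -- dimensionality of the word vectors
    """
    # Assumption: words without a trained vector are skipped.
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return [0.0] * dim  # assumption: zero vector when nothing is known
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

# Toy vectors; real ones would come from the trained representation model.
vectors = {"a": [1.0, 2.0], "b": [3.0, 4.0]}
feature = text_feature_vector(["a", "b", "unseen"], vectors, dim=2)
```

Averaging produces a fixed-length feature vector regardless of message length, which is what allows it to be fed to a classifier that expects a fixed input size.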
It can be seen that this embodiment of the present application captures massive social network data and performs natural language processing on the text data therein based on a deep text representation model, so as to find the feature vectors that act as key influencing factors. The feature vectors are used as the input of a model built with deep learning to make predictions on the text data, thereby achieving accurate prediction of suspicious events.
Embodiment 5
Fig. 7 is a structural diagram of the event prediction device provided in Embodiment 5 of the present application. Referring to Fig. 7, the device includes: a first acquisition unit 71, a second acquisition unit 72, a first processing unit 73 and a second processing unit 74, wherein:
the first acquisition unit 71 is configured to obtain text data from social network data;
the second acquisition unit 72 is configured to obtain the user behavior data associated with the text data in the social network data;
the first processing unit 73 is configured to vectorize the text data to obtain the feature vector corresponding to the text data, and to perform feature selection on the user behavior data to obtain the corresponding feature variables;
the second processing unit 74 is configured to input the feature vector and the feature variables as features to a pre-established classification model.
The first processing unit 73 is configured to determine the variables in the user behavior data; to score the variables based on a predetermined feature selection method, so as to determine each variable's degree of influence on the event corresponding to the text data; and to select, from the variables in the user behavior data, those whose degree of influence meets a predetermined condition as the feature variables.
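The score-then-select procedure can be sketched with a simple filter-style method. The sketch assumes the absolute Pearson correlation with the event label as the score and a fixed threshold as the predetermined condition; both choices, and all names below, are illustrative assumptions, since the application does not fix a particular scoring function.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def select_features(columns, labels, threshold):
    """Score every variable and keep those whose score meets the threshold.

    columns   -- dict mapping a variable name to its list of numeric values
    labels    -- list of 0/1 event labels aligned with the values
    threshold -- assumed form of the 'predetermined condition'
    """
    scores = {name: abs(pearson(vals, labels)) for name, vals in columns.items()}
    selected = [name for name, score in scores.items() if score >= threshold]
    return selected, scores

# Hypothetical variables: one tracks the label, the other is unrelated to it.
columns = {"weapon_flag": [1, 1, 0, 0], "noise": [1, 0, 1, 0]}
selected, scores = select_features(columns, [1, 1, 0, 0], threshold=0.5)
```

A wrapper-style method would replace the per-variable score with the performance of a model trained on candidate subsets; the selection step itself would stay the same.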
It can be seen that this embodiment of the present application predicts suspicious events using features from two perspectives, namely the feature vector of the text message and the feature variables of the corresponding user behavior data, which can further improve prediction accuracy.
Embodiment 6
Fig. 8 is a structural diagram of the event prediction system provided in Embodiment 6 of the present application. Referring to Fig. 8, the system includes: a data warehouse 81, a Kafka cluster 82 and a Storm cluster 83, wherein:
the data warehouse 81 is configured to store social network data and to provide the social network data to the producers of the Kafka cluster;
the Kafka cluster 82 is configured to preprocess the social network data to extract the text data and/or user behavior data from the social network data;
the Storm cluster 83 is configured to call the event prediction device corresponding to Embodiment 5 or 6 to consume the text data and/or user behavior data in the Kafka cluster, and to output the probability of corresponding to a suspicious event.
It should be noted that the system works as follows:
The full volume of social network data (Twitter and Facebook) is captured and subjected to ETL processing, and the processed data are loaded into the data warehouse according to the predefined data warehouse model. Newly added messages are processed by the Kafka cluster. The corresponding entity and label data are obtained based on the message content. The message text is converted into structured word vectors, and feature selection is performed on them together with the entity and label data to find the key factors that influence whether an event is suspicious. Machine learning and deep learning models are built to predict whether unknown information is suspicious.
External data (social media data) are parsed and cleaned into the data warehouse and then enter Kafka from the data warehouse. A consumer pulls the message data added in real time from the Broker and, combining them with the entity data and label data in Hive (a Hadoop-based data warehouse tool that maps structured data files to database tables, provides simple SQL query functions, and converts SQL statements into MapReduce jobs to run), calls the packaged algorithm package to compute the probability that an event is suspicious. Because the message data are added in real time, the suspicious probability of a message is computed as a stream with the Storm framework: external data flow from Kafka through a Spout into the Storm real-time computing cluster in the form of Tuples and are handed to a Topology in the cluster for processing. Each bolt node in the Topology acts as a specific task and calls the packaged algorithm package in parallel to compute the suspicious probability of the event, and the last bolt finally stores the result in MySQL.
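The data flow described above can be sketched in-process as follows. This is only a stand-in that mirrors the roles of the Broker, Spout, scoring bolt and storage bolt using Python generators and a dictionary; it does not use the Kafka or Storm APIs, and every name in it, including the placeholder scoring function, is hypothetical.

```python
from collections import deque

def spout(broker):
    """Stand-in for a Spout: drain queued messages and emit them as tuples."""
    while broker:
        message = broker.popleft()
        yield (message["id"], message["text"])

def scoring_bolt(tuples, score_fn):
    """Stand-in for a bolt that calls the packaged algorithm on each tuple."""
    for message_id, text in tuples:
        yield (message_id, score_fn(text))

def storage_bolt(scored, store):
    """Stand-in for the last bolt, which persists results (here, a dict
    takes the place of the MySQL table)."""
    for message_id, probability in scored:
        store[message_id] = probability
    return store

def placeholder_score(text):
    # Hypothetical placeholder for the packaged classification model.
    return 1.0 if "flagged" in text.lower() else 0.0

broker = deque([
    {"id": 1, "text": "flagged sample message"},
    {"id": 2, "text": "ordinary sample message"},
])
results = storage_bolt(scoring_bolt(spout(broker), placeholder_score), {})
```

In a real deployment each stage would run as a separate distributed task, so the scoring stage could be parallelized across messages without changing the shape of the flow.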
The Kafka used in this embodiment is a high-throughput distributed publish-subscribe messaging system that can handle all the action stream data of a consumer-scale website. Such actions (web page browsing, searches and other user actions) are a key factor in many social functions on the modern web. Because of throughput requirements, these data are usually handled through log processing and log aggregation. This is a feasible solution for log data and offline analysis systems such as Hadoop, but it is limited when real-time processing is required. The purpose of Kafka is to unify online and offline message processing through Hadoop's parallel loading mechanism, and also to provide real-time consumption through a cluster.
In addition, the system uses the Storm real-time computing framework, which features low latency, high performance and distributed computation, so recognition results can be provided in time for intelligence personnel to analyze. Furthermore, this patent uses large-scale social network data and internet behavior data to analyze the characteristics of terrorist attacks and, combined with natural language processing techniques, can achieve automatic identification of suspicious events.
Because the above device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for related details, refer to the description of the method embodiments. It should be noted that the components of the device of the present invention are divided logically according to the functions they implement; however, the present invention is not limited to this, and the components may be re-divided or combined as needed.
Embodiment 7
Fig. 9 is a structural diagram of an electronic device provided in Embodiment 7 of the present application. Referring to Fig. 9, the electronic device includes a processor, an internal bus, a network interface, memory and non-volatile storage, and may of course also include hardware required by other services. The processor reads the corresponding computer program from the non-volatile storage into memory and runs it, forming the event prediction device at the logical level. Of course, besides a software implementation, the present application does not exclude other implementations, such as logic devices or a combination of software and hardware; that is, the executing entity of the following processing flow is not limited to logic units and may also be hardware or a logic device.
The network interface, the processor and the memory may be interconnected by a bus system. The bus may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one double-headed arrow is shown in Fig. 9, but this does not mean that there is only one bus or one type of bus.
The memory is used to store a program. Specifically, the program may include program code, and the program code includes computer operation instructions. The memory may include read-only memory and random access memory, and provides instructions and data to the processor. The memory may include high-speed random access memory (Random-Access Memory, RAM) and may also include non-volatile memory, for example at least one disk storage.
The processor is configured to execute the program stored in the memory, and specifically to:
obtain text data from social network data;
vectorize the text data to obtain the feature vector corresponding to the text data;
input the feature vector to a pre-established classification model, the classification model being used to determine, with the feature vector of the text data as the feature, the probability that the text data corresponds to a suspicious event.
The method performed by the event prediction device or the manager (Master) node disclosed in the embodiments shown in Figs. 1-2 and Fig. 6 of the present application may be applied in a processor or implemented by a processor. The processor may be an integrated circuit chip with signal processing capability. During implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; it may also be a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It may implement or perform the methods, steps and logic block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may reside in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory or a register. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
The event prediction device may also perform the method of Fig. 1 and implement the method performed by the manager node.
Based on the same inventive concept, an embodiment of the present application further provides a computer-readable storage medium storing one or more programs which, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the following method:
obtaining text data from social network data;
vectorizing the text data to obtain the feature vector corresponding to the text data;
inputting the feature vector to a pre-established classification model, the classification model being used to determine, with the feature vector of the text data as the feature, the probability that the text data corresponds to a suspicious event.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces and memory.
Memory may include volatile memory in computer-readable media, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity or device. Without further restrictions, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, commodity or device that includes the element.
Those skilled in the art will understand that embodiments of the present application may be provided as a method, a system or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) that contain computer-usable program code.
The foregoing describes only embodiments of the present application and is not intended to limit the present application. For those skilled in the art, the present application may have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principle of the present application shall fall within the scope of the claims of the present application.
Claims (11)
- 1. An event prediction method, characterized by comprising: obtaining text data from social network data; vectorizing the text data to obtain the feature vector corresponding to the text data; and inputting the feature vector to a pre-established classification model, the classification model being used to determine, with the feature vector of the text data as the feature, the probability that the text data corresponds to a suspicious event.
- 2. The method according to claim 1, characterized in that, before obtaining the text data from the social network data, the method further comprises: obtaining social network data from a social network; and converting non-text data among the unstructured data in the social network data into text data.
- 3. The method according to claim 1, characterized in that vectorizing the text data to obtain the feature vector corresponding to the text data comprises: training on the words in the text data based on a deep text representation model, and obtaining the word vectors output by the deep text representation model; and averaging the word vectors corresponding to the words in the text data, and using the result as the feature vector corresponding to the text data.
- 4. The method according to claim 1, characterized in that, before inputting the feature vector as a feature to the pre-established classification model, the method further comprises: obtaining the user behavior data associated with the text data in the social network data; and performing feature selection on the user behavior data to obtain the corresponding feature variables; wherein inputting the feature vector as a feature to the pre-established classification model comprises: inputting the feature vector and the feature variables as features to the pre-established classification model.
- 5. The method according to claim 4, characterized in that performing feature selection on the user behavior data to obtain the relevant variables comprises: determining the variables in the user behavior data; scoring the variables based on a predetermined feature selection method to determine each variable's degree of influence on the event corresponding to the text data; and selecting, from the variables in the user behavior data, the variables whose degree of influence meets a predetermined condition as the feature variables.
- 6. The method according to claim 5, characterized in that the predetermined feature selection method is at least one of a filter feature selection method, a wrapper feature selection method and an integrated feature selection method.
- 7. The method according to claim 6, characterized in that the user behavior data comprises entity data and/or label data, the entity data representing the set of data related to the text data, and the label data representing the labels corresponding to the words in the text data or to the text data, and the data corresponding to the labels.
- 8. The method according to claim 1, characterized in that, after obtaining the prediction result output by the classification model, the method further comprises: determining the suspicious probability of the entity related to the text data according to the prediction result.
- 9. The method according to any one of claims 1-8, characterized in that, before inputting the feature vector as a feature to the pre-established classification model, the method further comprises: obtaining sample data, the sample data comprising sample events and the text data and/or user behavior data corresponding to the sample events; vectorizing the text data to obtain the feature vector corresponding to the text data, and/or performing feature selection on the user behavior data to obtain the feature variables corresponding to the user behavior data; and establishing the classification model with the feature vectors and/or feature variables corresponding to the sample events as features.
- 10. An event prediction device, characterized by comprising: a first acquisition unit, configured to obtain text data from social network data; a first processing unit, configured to vectorize the text data to obtain the feature vector corresponding to the text data; and a second processing unit, configured to input the feature vector to a pre-established classification model, the classification model being used to determine, with the feature vector of the text data as the feature, the probability that the text data corresponds to a suspicious event.
- 11. An event prediction system, characterized by comprising: a data warehouse, a Kafka cluster and a Storm cluster, wherein: the data warehouse is configured to store social network data and to provide the social network data to the producers of the Kafka cluster; the Kafka cluster is configured to preprocess the social network data to extract the text data and/or user behavior data from the social network data; and the Storm cluster is configured to call the event prediction device according to claim 10 to consume the text data and/or user behavior data in the Kafka cluster, and to output the probability of corresponding to a suspicious event.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711064205.6A CN107622333B (en) | 2017-11-02 | 2017-11-02 | Event prediction method, device and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711064205.6A CN107622333B (en) | 2017-11-02 | 2017-11-02 | Event prediction method, device and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107622333A true CN107622333A (en) | 2018-01-23 |
CN107622333B CN107622333B (en) | 2020-08-18 |
Family
ID=61092921
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711064205.6A Active CN107622333B (en) | 2017-11-02 | 2017-11-02 | Event prediction method, device and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107622333B (en) |
Cited By (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108182279A (en) * | 2018-01-26 | 2018-06-19 | 有米科技股份有限公司 | Object classification method, device and computer equipment based on text feature |
CN108932530A (en) * | 2018-06-29 | 2018-12-04 | 新华三大数据技术有限公司 | The construction method and device of label system |
CN108960291A (en) * | 2018-06-08 | 2018-12-07 | 武汉科技大学 | A kind of image processing method and system based on parallelization Softmax classification |
CN109409529A (en) * | 2018-09-13 | 2019-03-01 | 北京中科闻歌科技股份有限公司 | A kind of event cognitive analysis method, system and storage medium |
CN109543153A (en) * | 2018-11-13 | 2019-03-29 | 成都数联铭品科技有限公司 | A kind of sequence labelling system and method |
CN109614541A (en) * | 2018-12-04 | 2019-04-12 | 北京艾漫数据科技股份有限公司 | A kind of event recognition method, medium, device and calculate equipment |
CN109766429A (en) * | 2019-02-19 | 2019-05-17 | 北京奇艺世纪科技有限公司 | A kind of sentence retrieval method and device |
CN109815415A (en) * | 2019-01-23 | 2019-05-28 | 四川易诚智讯科技有限公司 | Social media user interest recognition methods based on card side's word frequency analysis |
CN109871889A (en) * | 2019-01-31 | 2019-06-11 | 内蒙古工业大学 | Mass psychology appraisal procedure under emergency event |
CN110162558A (en) * | 2019-04-01 | 2019-08-23 | 阿里巴巴集团控股有限公司 | Structural data processing method and processing device |
CN110210559A (en) * | 2019-05-31 | 2019-09-06 | 北京小米移动软件有限公司 | Object screening technique and device, storage medium |
CN110491145A (en) * | 2018-10-29 | 2019-11-22 | 魏天舒 | A kind of traffic signal optimization control method and device |
WO2020063071A1 (en) * | 2018-09-27 | 2020-04-02 | 厦门快商通信息技术有限公司 | Sentence vector calculation method based on chi-square test, and text classification method and system |
CN111046179A (en) * | 2019-12-03 | 2020-04-21 | 哈尔滨工程大学 | Text classification method for open network question in specific field |
CN111159166A (en) * | 2019-12-27 | 2020-05-15 | 沃民高新科技(北京)股份有限公司 | Event prediction method and device, storage medium and processor |
WO2020124026A1 (en) * | 2018-12-13 | 2020-06-18 | SparkCognition, Inc. | Security systems and methods |
CN111459959A (en) * | 2020-03-31 | 2020-07-28 | 北京百度网讯科技有限公司 | Method and apparatus for updating event set |
CN111477328A (en) * | 2020-03-31 | 2020-07-31 | 北京智能工场科技有限公司 | Non-contact psychological state prediction method |
CN111626783A (en) * | 2020-04-30 | 2020-09-04 | 贝壳技术有限公司 | Offline information setting method and device for realizing event conversion probability prediction |
CN111770097A (en) * | 2020-06-29 | 2020-10-13 | 中国科学院计算技术研究所 | Content lock firewall method and system based on white list |
CN112101950A (en) * | 2020-09-27 | 2020-12-18 | 中国建设银行股份有限公司 | Suspicious transaction monitoring model feature extraction method and device |
CN112233381A (en) * | 2020-10-14 | 2021-01-15 | 中国科学院、水利部成都山地灾害与环境研究所 | Debris flow early warning method and system based on mechanism and machine learning coupling |
CN112487406A (en) * | 2020-12-02 | 2021-03-12 | 中国电子科技集团公司第三十研究所 | Network behavior analysis method based on machine learning |
CN113190682A (en) * | 2021-06-30 | 2021-07-30 | 平安科技(深圳)有限公司 | Method and device for acquiring event influence degree based on tree model and computer equipment |
CN114169325A (en) * | 2021-11-30 | 2022-03-11 | 西安理工大学 | Web page new word discovering and analyzing method based on word vector representation |
CN114707685A (en) * | 2021-12-17 | 2022-07-05 | 武汉烽火众智智慧之星科技有限公司 | Event prediction method and device based on big data modeling analysis |
CN114169325B (en) * | 2021-11-30 | 2024-09-27 | 西安理工大学 | Webpage new word discovery and analysis method based on word vector representation |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103116605A (en) * | 2013-01-17 | 2013-05-22 | 上海交通大学 | Method and system for real-time detection of microblog hot events based on detection subnet |
CN103853841A (en) * | 2014-03-19 | 2014-06-11 | 北京邮电大学 | Method for analyzing abnormal behavior of user in social networking site |
CN104281607A (en) * | 2013-07-08 | 2015-01-14 | 上海锐英软件技术有限公司 | Microblog hot topic analyzing method |
CN107169629A (en) * | 2017-04-17 | 2017-09-15 | 四川九洲电器集团有限责任公司 | Telecommunication fraud identification method and data processing device |
Family application: 2017-11-02, CN, CN201711064205.6A (patent CN107622333B), status: Active
Non-Patent Citations (1)
Title |
---|
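董坚峰 (Dong Jianfeng): "Research on Network Public Opinion Analysis for Public Crisis Early Warning", China Doctoral Dissertations Full-text Database, Information Science and Technology * |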
Cited By (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108182279A (en) * | 2018-01-26 | 2018-06-19 | 有米科技股份有限公司 | Object classification method, device and computer equipment based on text feature |
CN108960291A (en) * | 2018-06-08 | 2018-12-07 | 武汉科技大学 | Image processing method and system based on parallelized Softmax classification |
CN108932530A (en) * | 2018-06-29 | 2018-12-04 | 新华三大数据技术有限公司 | Construction method and device of label system |
CN109409529A (en) * | 2018-09-13 | 2019-03-01 | 北京中科闻歌科技股份有限公司 | Event cognitive analysis method, system and storage medium |
CN109409529B (en) * | 2018-09-13 | 2020-12-08 | 北京中科闻歌科技股份有限公司 | Event cognitive analysis method, system and storage medium |
WO2020063071A1 (en) * | 2018-09-27 | 2020-04-02 | 厦门快商通信息技术有限公司 | Sentence vector calculation method based on chi-square test, and text classification method and system |
CN110491145A (en) * | 2018-10-29 | 2019-11-22 | 魏天舒 | Traffic signal optimization control method and device |
CN109543153A (en) * | 2018-11-13 | 2019-03-29 | 成都数联铭品科技有限公司 | Sequence labeling system and method |
CN109543153B (en) * | 2018-11-13 | 2023-08-18 | 成都数联铭品科技有限公司 | Sequence labeling system and method |
CN109614541A (en) * | 2018-12-04 | 2019-04-12 | 北京艾漫数据科技股份有限公司 | Event recognition method, medium, device and computing equipment |
WO2020124026A1 (en) * | 2018-12-13 | 2020-06-18 | SparkCognition, Inc. | Security systems and methods |
GB2595088A (en) * | 2018-12-13 | 2021-11-17 | Sparkcognition Inc | Security systems and methods |
CN109815415A (en) * | 2019-01-23 | 2019-05-28 | 四川易诚智讯科技有限公司 | Social media user interest identification method based on chi-square word frequency analysis |
CN109871889A (en) * | 2019-01-31 | 2019-06-11 | 内蒙古工业大学 | Public psychological assessment method under emergency |
CN109871889B (en) * | 2019-01-31 | 2019-12-24 | 内蒙古工业大学 | Public psychological assessment method under emergency |
CN109766429A (en) * | 2019-02-19 | 2019-05-17 | 北京奇艺世纪科技有限公司 | Sentence retrieval method and device |
CN110162558A (en) * | 2019-04-01 | 2019-08-23 | 阿里巴巴集团控股有限公司 | Structured data processing method and device |
CN110210559A (en) * | 2019-05-31 | 2019-09-06 | 北京小米移动软件有限公司 | Object screening method and device and storage medium |
CN110210559B (en) * | 2019-05-31 | 2021-10-08 | 北京小米移动软件有限公司 | Object screening method and device and storage medium |
CN111046179B (en) * | 2019-12-03 | 2022-07-15 | 哈尔滨工程大学 | Text classification method for open network question in specific field |
CN111046179A (en) * | 2019-12-03 | 2020-04-21 | 哈尔滨工程大学 | Text classification method for open network question in specific field |
CN111159166A (en) * | 2019-12-27 | 2020-05-15 | 沃民高新科技(北京)股份有限公司 | Event prediction method and device, storage medium and processor |
CN111477328A (en) * | 2020-03-31 | 2020-07-31 | 北京智能工场科技有限公司 | Non-contact psychological state prediction method |
CN111459959A (en) * | 2020-03-31 | 2020-07-28 | 北京百度网讯科技有限公司 | Method and apparatus for updating event set |
CN111626783B (en) * | 2020-04-30 | 2021-08-31 | 贝壳找房(北京)科技有限公司 | Offline information setting method and device for realizing event conversion probability prediction |
CN111626783A (en) * | 2020-04-30 | 2020-09-04 | 贝壳技术有限公司 | Offline information setting method and device for realizing event conversion probability prediction |
CN111770097A (en) * | 2020-06-29 | 2020-10-13 | 中国科学院计算技术研究所 | Content lock firewall method and system based on white list |
CN111770097B (en) * | 2020-06-29 | 2021-04-23 | 中国科学院计算技术研究所 | Content lock firewall method and system based on white list |
CN112101950A (en) * | 2020-09-27 | 2020-12-18 | 中国建设银行股份有限公司 | Suspicious transaction monitoring model feature extraction method and device |
CN112101950B (en) * | 2020-09-27 | 2024-05-10 | 中国建设银行股份有限公司 | Suspicious transaction monitoring model feature extraction method and suspicious transaction monitoring model feature extraction device |
CN112233381A (en) * | 2020-10-14 | 2021-01-15 | 中国科学院、水利部成都山地灾害与环境研究所 | Debris flow early warning method and system based on mechanism and machine learning coupling |
CN112487406A (en) * | 2020-12-02 | 2021-03-12 | 中国电子科技集团公司第三十研究所 | Network behavior analysis method based on machine learning |
CN113190682A (en) * | 2021-06-30 | 2021-07-30 | 平安科技(深圳)有限公司 | Method and device for acquiring event influence degree based on tree model and computer equipment |
CN114169325A (en) * | 2021-11-30 | 2022-03-11 | 西安理工大学 | Web page new word discovering and analyzing method based on word vector representation |
CN114169325B (en) * | 2021-11-30 | 2024-09-27 | 西安理工大学 | Webpage new word discovery and analysis method based on word vector representation |
CN114707685A (en) * | 2021-12-17 | 2022-07-05 | 武汉烽火众智智慧之星科技有限公司 | Event prediction method and device based on big data modeling analysis |
Also Published As
Publication number | Publication date |
---|---|
CN107622333B (en) | 2020-08-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107622333A (en) | A kind of event prediction method, apparatus and system | |
Banerjee et al. | Detection of cyberbullying using deep neural network | |
CN112221156B (en) | Data abnormality recognition method, data abnormality recognition device, storage medium, and electronic device | |
Nagamanjula et al. | A novel framework based on bi-objective optimization and LAN2FIS for Twitter sentiment analysis | |
ALRashdi et al. | Deep learning and word embeddings for tweet classification for crisis response | |
CN108920654A (en) | Method and apparatus for question-and-answer text semantic matching | |
US11200381B2 (en) | Social content risk identification | |
CN108961032A (en) | Loan processing method, device and server | |
CN113761359B (en) | Data packet recommendation method, device, electronic equipment and storage medium | |
Bogle et al. | SentAMaL-a sentiment analysis machine learning stock predictive model | |
CN113139052B (en) | Rumor detection method and device based on graph neural network feature aggregation | |
CN107392311A (en) | Method and apparatus for sequence cutting | |
Kumar et al. | Content based bot detection using bot language model and bert embeddings | |
Hossain et al. | A study towards Bangla fake news detection using machine learning and deep learning | |
Lin et al. | Social rumor detection based on multilayer transformer encoding blocks | |
Rama et al. | Deep learning to address candidate generation and cold start challenges in recommender systems: A research survey | |
CN111611409B (en) | Case analysis method integrated with scene knowledge and related equipment | |
CN116484105B (en) | Service processing method, device, computer equipment, storage medium and program product | |
Ali et al. | Identifying and Profiling User Interest over time using Social Data | |
Goldani et al. | X-CapsNet For Fake News Detection | |
Dong et al. | Rumor Detection with Adversarial Training and Supervised Contrastive Learning | |
Narayan et al. | Fake news detection using hybrid of deep neural network and stacked lstm | |
Siddiqui et al. | An ensemble approach for the identification and classification of crime tweets in the English language | |
CN116860952B (en) | RPA intelligent response processing method and system based on artificial intelligence | |
Sree et al. | Implementation of Text-Based Sentiment Analysis Using LSTM Model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address | Address after: 100081 101/F, building 14, 27 Jiancai Middle Road, Haidian District, Beijing; Patentee after: Beijing PERCENT Technology Group Co., Ltd.; Address before: 100081 16/F, block a, Beichen Century Center, building 2, courtyard 8, Beichen West Road, Chaoyang District, Beijing; Patentee before: BEIJING BAIFENDIAN INFORMATION SCIENCE & TECHNOLOGY Co., Ltd. |