CN116975300A - Information mining method and system based on big data set - Google Patents

Information mining method and system based on big data set

Info

Publication number
CN116975300A
Authority
CN
China
Prior art keywords
behavior
network
network behavior
features
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311228862.5A
Other languages
Chinese (zh)
Other versions
CN116975300B (en)
Inventor
张占峰
于东志
赵斌
王玉廷
王勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Tower Co ltd Jilin Branch
Original Assignee
China Tower Co ltd Jilin Branch
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Tower Co ltd Jilin Branch
Priority to CN202311228862.5A
Publication of CN116975300A
Application granted
Publication of CN116975300B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides an information mining method and system based on a big data set, and relates to the technical field of artificial intelligence. In the invention, a first number of network behavior combinations and a first number of behavior category identification combinations are extracted from a network big data set; a first number of behavior relation identifiers of the first number of network behavior combinations are analyzed based on the first number of first network behavior features, the first number of first network behavior type features, the first number of second network behavior features and the first number of second network behavior type features; and if a group of behavior relation identifiers in the first number of behavior relation identifiers reflects that the network behavior information of the target network user has changed, corresponding network behavior information change data is generated. On this basis, the efficiency of network behavior information mining can be improved.

Description

Information mining method and system based on big data set
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an information mining method and system based on a big data set.
Background
In the current digital age, mining and analyzing network behavior information is important for individuals, organizations and society. As the number and activity of internet users grow, vast amounts of network behavior information accumulate in various forms of data. However, efficiently extracting valuable information from such data and analyzing it accurately is becoming increasingly challenging. Although the maturing and application of artificial intelligence technology have gradually improved the precision of network behavior information mining, mining efficiency remains relatively low because the volume of network behavior information is generally huge.
Disclosure of Invention
In view of the above, the present invention aims to provide an information mining method and system based on big data set, so as to improve the efficiency of network behavior information mining.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical scheme:
an information mining method based on big data sets, comprising:
extracting a first number of network behavior combinations and a first number of behavior type identification combinations from a network big data set, wherein each network behavior combination in the first number of network behavior combinations comprises a first network behavior and a second network behavior whose behavior relation identification needs to be identified, the first number of behavior type identification combinations comprise the behavior type identifications of each first network behavior and each second network behavior in the first number of network behavior combinations, each network behavior combination comprises a first network behavior distributed in the front and a second network behavior distributed in the back, each first network behavior and each second network behavior belongs to one network behavior in the network big data set, and each network behavior in the network big data set is obtained by carrying out network monitoring on a target network user;
Analyzing a first number of behavior relation identifications of the first number of network behavior combinations based on a first number of first network behavior features, a first number of first network behavior type features, a first number of second network behavior features, and a first number of second network behavior type features, wherein the first number of first network behavior features are used for characterizing a first number of first network behaviors in the first number of network behavior combinations, the first number of first network behavior type features are used for characterizing behavior type identifications of the first number of first network behaviors, the first number of second network behavior features are used for characterizing a first number of second network behaviors in the first number of network behavior combinations, and the first number of second network behavior type features are used for characterizing behavior type identifications of the first number of second network behaviors;
and if a group of behavior relation identifications in the first number of behavior relation identifications reflects that the network behavior information of the target network user has changed, generating network behavior information change data of the target network user.
In some preferred embodiments, in the big data set-based information mining method, the step of extracting a first number of network behavior combinations and a first number of behavior category identification combinations in the network big data set includes:
Extracting a second number of network behaviors and a second number of behavior type identifiers which are in one-to-one correspondence from a network big data set, wherein the second number of behavior type identifiers comprise behavior type identifiers of each network behavior in the second number of network behaviors, and each network behavior in the second number of network behaviors belongs to one network behavior in the network big data set;
combining paired network behaviors in the second number of network behaviors to form a first number of network behavior combinations, wherein each network behavior combination in the first number of network behavior combinations comprises two paired network behaviors in the second number of network behaviors, the network behavior distributed in the front of the two paired network behaviors serves as the first network behavior, and the network behavior distributed in the back serves as the second network behavior;
and extracting the behavior type identifications of the first network behaviors and the second network behaviors included in each network behavior combination in the first number of network behavior combinations from the second number of behavior type identifications to form a first number of behavior type identification combinations.
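The three extraction steps above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how two behaviors are chosen to form a pair, so the sketch assumes that each behavior is paired with the one that immediately follows it in time order. The names `Behavior` and `build_combinations` and the sample records are hypothetical.

```python
from typing import List, Tuple

# Hypothetical record layout: (behavior description, behavior category identifier).
# The list is assumed to be sorted by time of occurrence already.
Behavior = Tuple[str, str]


def build_combinations(behaviors: List[Behavior]):
    """Pair each behavior with the next one (an assumed pairing rule).

    Returns the behavior combinations and the matching category
    identification combinations, in the same order.
    """
    behavior_pairs = []
    category_pairs = []
    for (first, first_cat), (second, second_cat) in zip(behaviors, behaviors[1:]):
        behavior_pairs.append((first, second))          # earlier behavior is "first"
        category_pairs.append((first_cat, second_cat))  # same order as the pair
    return behavior_pairs, category_pairs


sample = [
    ("publish media, browse related media", "publish-browse"),
    ("publish media, exit platform", "publish-exit"),
    ("search topic, browse results", "search-browse"),
]
pairs, cats = build_combinations(sample)
```

With this pairing rule, a "second number" of N behaviors yields a "first number" of N - 1 combinations; other pairing rules would give a different count.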
In some preferred embodiments, in the big data set based information mining method, the step of analyzing the first number of behavior relation identifications of the first number of network behavior combinations based on the first number of first network behavior features, the first number of first network behavior category features, the first number of second network behavior features, and the first number of second network behavior category features includes:
mining network behavior characteristics of a first number of first network behaviors in the first number of network behavior combinations to obtain first number of first network behavior characteristics, and mining network behavior characteristics of a first number of second network behaviors in the first number of network behavior combinations to obtain first number of second network behavior characteristics;
mining the category identification features of the behavior category identifications of the first number of first network behaviors in the first number of behavior category identification combinations to obtain a first number of first network behavior category features, and mining the category identification features of the behavior category identifications of the first number of second network behaviors in the first number of behavior category identification combinations to obtain a first number of second network behavior category features;
Aggregating the first number of first network behavior features, the first number of first network behavior category features, the first number of second network behavior features, and the first number of second network behavior category features to form a first number of network behavior combination features corresponding to the first number of network behavior combinations;
based on the first number of network behavior combination characteristics, a first number of behavior relation identifications of the first number of network behavior combinations are analyzed, and each behavior relation identification in the first number of behavior relation identifications is used for reflecting behavior relation content between the first network behavior and the second network behavior in a corresponding network behavior combination in the first number of network behavior combinations.
In some preferred embodiments, in the big data set based information mining method, each behavior relationship identifier in the first number of behavior relationship identifiers reflects one of a third number of predetermined behavior relationships, where the third number of behavior relationships includes a fourth number of behavior association relationships and no relationship, and a difference between the third number and the fourth number is equal to 1.
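As a small illustration of the label set described above, the sketch below builds a list of association relations plus one "no relation" entry, so that the third number exceeds the fourth number by exactly one. The concrete label strings are hypothetical; the text does not enumerate the predetermined relations.

```python
# Hypothetical association relations (the "fourth number" of them).
ASSOCIATION_RELATIONS = [
    "persistent-persistent",
    "persistent-non-persistent",
    "non-persistent-persistent",
]

# The single extra entry that stands for "no relation".
NO_RELATION = "no-relation"

# The full set of predetermined behavior relations (the "third number" of them).
RELATION_LABELS = ASSOCIATION_RELATIONS + [NO_RELATION]

fourth_number = len(ASSOCIATION_RELATIONS)
third_number = len(RELATION_LABELS)
```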
In some preferred embodiments, in the foregoing big data set based information mining method, the step of mining out the network behavior features of the first number of first network behaviors in the first number of network behavior combinations to obtain first number of first network behavior features, and mining out the network behavior features of the first number of second network behaviors in the first number of network behavior combinations to obtain first number of second network behavior features includes:
mining network behavior characteristics of an a-th first network behavior in the first number of first network behaviors based on the following operation to obtain the a-th first network behavior characteristics, wherein the a-th first network behavior belongs to one of the first number of first network behaviors:
if the a-th first network behavior comprises a fifth number of behavior description words, determining word embedding characteristics of each behavior description word in the fifth number of behavior description words to obtain a fifth number of word embedding characteristics;
fusing the fifth number of word embedding features and outputting the a-th first network behavior feature;
and mining out the network behavior characteristics of the a-th second network behavior in the first number of second network behaviors based on the following operation, so as to obtain the a-th second network behavior characteristics, wherein the a-th second network behavior belongs to one of the first number of second network behaviors:
if the a-th second network behavior comprises a sixth number of behavior description words, determining word embedding characteristics of each behavior description word in the sixth number of behavior description words to obtain a sixth number of word embedding characteristics;
and fusing the sixth number of word embedding features and outputting the a-th second network behavior feature.
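The per-behavior feature mining described above (embed each behavior description word, then fuse the embeddings) can be sketched as follows. The text names neither the embedding model nor the fusion operation, so this sketch makes two assumptions: `embed_word` is a toy hash-based stand-in for a trained embedding table, and fusion is an element-wise mean. All names and the dimension `DIM` are hypothetical.

```python
import hashlib
from typing import List

DIM = 8  # assumed embedding width


def embed_word(word: str, dim: int = DIM) -> List[float]:
    """Toy deterministic embedding; a stand-in for a trained embedding table."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def fuse(embeddings: List[List[float]]) -> List[float]:
    """Fuse word embeddings by element-wise mean (an assumed fusion rule)."""
    if not embeddings:
        raise ValueError("a behavior needs at least one description word")
    count = len(embeddings)
    return [sum(column) / count for column in zip(*embeddings)]


def behavior_feature(description_words: List[str]) -> List[float]:
    """Map the description words of one network behavior to one feature vector."""
    return fuse([embed_word(word) for word in description_words])


feature = behavior_feature(["publish", "media", "browse"])
```

The same function would serve for both the first and the second network behavior of a combination, since the two operations in the text differ only in which behavior they are applied to.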
In some preferred embodiments, in the foregoing big data set based information mining method, the step of mining out category identification features of the category identification of the first number of first network behaviors in the first number of behavior category identification combinations to obtain a first number of first network behavior category features, and mining out category identification features of the category identification of the first number of second network behaviors in the first number of behavior category identification combinations to obtain a first number of second network behavior category features includes:
mining the category identification feature of the behavior category identification of the a-th first network behavior in the first number of first network behaviors based on the following operation to obtain the a-th first network behavior category feature, wherein the a-th first network behavior belongs to one of the first number of first network behaviors:
If the behavior type identifier of the a-th first network behavior comprises an a-th behavior type description text, determining word embedding characteristics of the a-th behavior type description text, and determining the a-th first network behavior type characteristics as the word embedding characteristics of the a-th behavior type description text;
and mining the category identification feature of the behavior category identification of the a-th second network behavior in the first number of second network behaviors based on the following operation to obtain the a-th second network behavior category feature, wherein the a-th second network behavior belongs to one of the first number of second network behaviors:
and if the behavior type identifier of the a-th second network behavior comprises an a-th behavior type description text, determining word embedding characteristics of the a-th behavior type description text, and determining the a-th second network behavior type characteristics as the word embedding characteristics of the a-th behavior type description text.
In some preferred embodiments, in the foregoing big data set based information mining method, the step of aggregating the first number of first network behavior features, the first number of first network behavior category features, the first number of second network behavior features, and the first number of second network behavior category features to form a first number of network behavior combination features corresponding to the first number of network behavior combinations includes:
Forming an a-th network behavior combination feature corresponding to an a-th network behavior combination in the first number of network behavior combinations based on aggregation of an a-th first network behavior and an a-th second network behavior, the a-th network behavior combination belonging to one of the first number of network behavior combinations:
aggregating the a-th first network behavior feature, the a-th first network behavior type feature, the a-th second network behavior feature and the a-th second network behavior type feature to obtain the a-th network behavior combination feature, wherein the a-th first network behavior feature is used for representing the a-th first network behavior, the a-th first network behavior type feature is used for representing the behavior type identification of the a-th first network behavior, the a-th second network behavior feature is used for representing the a-th second network behavior, and the a-th second network behavior type feature is used for representing the behavior type identification of the a-th second network behavior.
In some preferred embodiments, in the big data set based information mining method, the step of aggregating the a-th first network behavior feature, the a-th first network behavior category feature, the a-th second network behavior feature, and the a-th second network behavior category feature to obtain the a-th network behavior combination feature includes:
Determining a first set of first coordinate embedding features for characterizing a first set of coordinates of the network big data set;
and aggregating the first coordinate embedding feature, the a-th first network behavior feature, the a-th first network behavior type feature, the a-th second network behavior feature and the a-th second network behavior type feature along the feature dimension to form the a-th network behavior combination feature, wherein the network behaviors in the first number of network behavior combinations are network behaviors extracted from the network big data set.
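Aggregation "in the direction of feature dimension" is read here as concatenation of the component vectors into one longer vector. The sketch below illustrates that reading. How the coordinate embedding is computed is not specified in the text, so `coordinate_embedding` assumes a sinusoidal encoding of the combination's position in the data set; that function, its `dim` parameter and the sample values are hypothetical.

```python
import math
from typing import List, Sequence


def coordinate_embedding(position: int, dim: int = 4) -> List[float]:
    """Assumed sinusoidal encoding of a combination's position in the data set."""
    values = []
    for i in range(dim // 2):
        angle = position / (10000 ** (2 * i / dim))
        values.extend([math.sin(angle), math.cos(angle)])
    return values


def combine_features(
    coord: Sequence[float],
    first_feature: Sequence[float],
    first_category: Sequence[float],
    second_feature: Sequence[float],
    second_category: Sequence[float],
) -> List[float]:
    """Concatenate the five component vectors along the feature dimension."""
    combined: List[float] = []
    for part in (coord, first_feature, first_category, second_feature, second_category):
        combined.extend(part)
    return combined


combo = combine_features(
    coordinate_embedding(0),
    [0.1, 0.2],  # a-th first network behavior feature
    [0.3],       # a-th first network behavior category feature
    [0.4, 0.5],  # a-th second network behavior feature
    [0.6],       # a-th second network behavior category feature
)
```

The length of the combination feature is the sum of the component lengths, which fixes the input width of whatever network consumes it.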
In some preferred embodiments, in the big data set based information mining method, the step of analyzing a first number of behavior relation identifications of the first number of network behavior combinations based on the first number of network behavior combination features includes:
analyzing an a-th behavior relation identification of an a-th network behavior combination of the first number of network behavior combinations according to an a-th network behavior combination feature of the first number of network behavior combination features, wherein the a-th network behavior combination feature belongs to one of the first number of network behavior combination features, based on the following operations:
Loading the a-th network behavior combination characteristic into a behavior relation identification network, and identifying a third quantity of behavior relation identification data corresponding to a third quantity of behavior relation identifiers, wherein the third quantity of behavior relation identification data is used for reflecting the possibility that the a-th behavior relation identifier is each behavior relation identifier in a third quantity of predetermined behavior relation identifiers;
marking the a-th behavior relation identifier as a target behavior relation identifier in the third number of predetermined behavior relation identifiers, wherein, according to the third number of behavior relation identification data, the target behavior relation identifier is the predetermined identifier that the a-th behavior relation identifier is most likely to be.
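The two operations above (score every predetermined relation, then keep the most likely one) can be sketched as below. The architecture of the behavior relation identification network is not given in the text, so a single linear layer followed by softmax is used purely as a stand-in; the weights, bias and labels are hypothetical values chosen for the example.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax: turns raw scores into likelihoods."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_scores(feature, weights, bias) -> List[float]:
    """One raw score per predetermined relation (stand-in for the real network)."""
    return [
        sum(w * x for w, x in zip(row, feature)) + b
        for row, b in zip(weights, bias)
    ]


def identify_relation(feature, weights, bias, labels) -> Tuple[str, List[float]]:
    """Return the most likely relation label and the full likelihood vector."""
    probs = softmax(linear_scores(feature, weights, bias))
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs


labels = ["persistent-persistent", "persistent-non-persistent", "no-relation"]
weights = [[0.1, 0.0], [2.0, 0.0], [-1.0, 0.0]]  # hypothetical parameters
bias = [0.0, 0.0, 0.0]
label, probs = identify_relation([1.0, 0.0], weights, bias, labels)
```

Here the likelihood vector plays the role of the "behavior relation identification data", and the argmax step corresponds to marking the target behavior relation identifier.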
The embodiment of the invention also provides an information mining system based on the big data set, which comprises a processor and a memory, wherein the memory is used for storing a computer program, and the processor is used for executing the computer program so as to realize the information mining method based on the big data set.
According to the information mining method and system based on the big data set, a first number of network behavior combinations and a first number of behavior type identification combinations are extracted from the network big data set; a first number of behavior relation identifiers of the first number of network behavior combinations are analyzed based on the first number of first network behavior features, the first number of first network behavior type features, the first number of second network behavior features and the first number of second network behavior type features; and if a group of behavior relation identifiers in the first number of behavior relation identifiers reflects that the network behavior information of the target network user has changed, corresponding network behavior information change data is generated. On this basis, since the analysis and recognition of behavior relations are performed on network behavior combinations, the data size or complexity of the object to be analyzed and recognized is relatively reduced, so that the efficiency of mining network behavior information can be improved to a certain extent, which addresses the relatively low mining efficiency of the prior art.
In order to make the above objects, features and advantages of the present invention more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
Fig. 1 is a block diagram of a big data set-based information mining system according to an embodiment of the present invention.
Fig. 2 is a flowchart illustrating steps included in the big data set-based information mining method according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of each module included in the big data set based information mining apparatus according to the embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, an embodiment of the present invention provides an information mining system based on a big data set. The big data set based information mining system may include a memory and a processor.
Optionally, in some embodiments, the memory and the processor are electrically connected directly or indirectly to enable transmission or interaction of data. For example, electrical connection may be made to each other via one or more communication buses or signal lines. The memory may store at least one software functional module (computer program) that may exist in the form of software or firmware. The processor may be configured to execute the executable computer program stored in the memory, so as to implement the big data set-based information mining method provided by the embodiment of the present invention.
Optionally, in some embodiments, the memory may be, but is not limited to, random access memory (Random Access Memory, RAM), read-only memory (Read Only Memory, ROM), programmable read-only memory (Programmable Read-Only Memory, PROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), and the like.
Optionally, in some embodiments, the processor may be a general purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), a system on chip (System on Chip, SoC), etc.; it may also be a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Optionally, in some embodiments, the big data set based information mining system may be a server with data processing capabilities.
With reference to fig. 2, the embodiment of the invention further provides an information mining method based on a big data set, which can be applied to the information mining system based on the big data set. The method steps defined by the flow related to the big data set-based information mining method can be realized by the big data set-based information mining system. The specific flow shown in fig. 2 will be described in detail.
Step S110, a first number of network behavior combinations and a first number of behavior category identification combinations in the network big data set are extracted.
In the embodiment of the invention, the big data set-based information mining system can extract a first number of network behavior combinations and a first number of behavior category identification combinations from the network big data set. Each network behavior combination in the first number of network behavior combinations comprises a first network behavior and a second network behavior whose behavior relation identification needs to be identified, and the first number of behavior type identification combinations comprise the behavior type identifications of each first network behavior and each second network behavior in the first number of network behavior combinations. Each network behavior combination comprises a first network behavior distributed in front and a second network behavior distributed behind, and both the first network behavior and the second network behavior belong to network behaviors in the network big data set. Each network behavior included in the network big data set is obtained by carrying out network monitoring on a target network user. For example, the network behaviors of the target network user in the most recent period (such as a week, a month, a quarter or a year) can be counted and ordered by time of occurrence to form the network big data set. In addition, all counted network behaviors are behaviors whose collection the target network user has authorized; unauthorized network behaviors are outside the statistical range.
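The construction of the network big data set described above (keep only authorized behaviors from the most recent period, ordered by time of occurrence) can be sketched as follows. The record layout `BehaviorRecord`, the function `build_dataset` and the 30-day default window are assumptions made for illustration; the text only gives a week, a month, a quarter or a year as example periods.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class BehaviorRecord:
    """Hypothetical record of one monitored network behavior."""
    timestamp: datetime
    description: str
    category: str
    authorized: bool  # whether the user authorized collection of this behavior


def build_dataset(
    records: List[BehaviorRecord],
    now: datetime,
    window: timedelta = timedelta(days=30),  # assumed "most recent period"
) -> List[BehaviorRecord]:
    """Keep authorized behaviors inside the window, sorted by occurrence time."""
    cutoff = now - window
    kept = [r for r in records if r.authorized and cutoff <= r.timestamp <= now]
    return sorted(kept, key=lambda r: r.timestamp)


now = datetime(2023, 9, 22)
records = [
    BehaviorRecord(datetime(2023, 9, 20), "publish media, exit platform", "publish-exit", True),
    BehaviorRecord(datetime(2023, 9, 10), "publish media, browse related media", "publish-browse", True),
    BehaviorRecord(datetime(2023, 9, 15), "private browsing", "browse", False),
    BehaviorRecord(datetime(2023, 1, 1), "old search", "search", True),
]
dataset = build_dataset(records, now)
```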
Step S120, analyzing a first number of behavior relation identifiers of the first number of network behavior combinations based on the first number of first network behavior features, the first number of first network behavior category features, the first number of second network behavior features, and the first number of second network behavior category features.
In the embodiment of the invention, the big data set-based information mining system may analyze a first number of behavior relation identifications of the first number of network behavior combinations based on a first number of first network behavior features, a first number of first network behavior category features, a first number of second network behavior features, and a first number of second network behavior category features. The first number of first network behavior features are used for characterizing the first number of first network behaviors in the first number of network behavior combinations (for example, obtained by feature mining of the first network behaviors), the first number of first network behavior type features are used for characterizing the behavior type identifications of the first number of first network behaviors (for example, obtained by feature mining of the behavior type identifications), the first number of second network behavior features are used for characterizing the first number of second network behaviors in the first number of network behavior combinations (for example, obtained by feature mining of the second network behaviors), and the first number of second network behavior type features are used for characterizing the behavior type identifications of the first number of second network behaviors (for example, obtained by feature mining of the behavior type identifications).
Step S130, if a set of behavior relation identifiers in the first number of behavior relation identifiers reflects that the network behavior information of the target network user has changed, network behavior information change data of the target network user is generated.
In the embodiment of the present invention, the big data set-based information mining system may generate the network behavior information change data of the target network user when a set of behavior relation identifiers in the first number of behavior relation identifiers reflects that the network behavior information of the target network user has changed. The network behavior information change data may describe a change in network behavior tendency. For example, suppose network behavior A has an A behavior tendency and network behavior B has a B behavior tendency. If the first number of network behavior combinations include a combination of network behavior A and network behavior B, the corresponding set of behavior relation identifiers reflects that the network behavior information of the target network user has changed in that "the A behavior tendency changes to the B behavior tendency". That is, the network behavior information change data of the target network user may be "the A behavior tendency changes to the B behavior tendency", and may further include information such as the time and duration of that change.
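Step S130 can be sketched as a function that emits a change record only when the identified relation indicates a change. The set `CHANGE_RELATIONS`, the `ChangeRecord` layout and the label strings are hypothetical; the text states only that the change data may carry the tendency change and related information such as its time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ChangeRecord:
    """Hypothetical layout of network behavior information change data."""
    from_tendency: str
    to_tendency: str
    changed_at: datetime


# Hypothetical relation identifiers that indicate a change in tendency.
CHANGE_RELATIONS = {"persistent-to-non-persistent", "non-persistent-to-persistent"}


def generate_change_data(
    relation: str,
    first_tendency: str,
    second_tendency: str,
    second_time: datetime,
) -> Optional[ChangeRecord]:
    """Return change data if the relation reflects a change, otherwise None."""
    if relation not in CHANGE_RELATIONS:
        return None
    return ChangeRecord(first_tendency, second_tendency, second_time)


record = generate_change_data(
    "persistent-to-non-persistent",
    "A behavior tendency",
    "B behavior tendency",
    datetime(2023, 9, 22, 10, 30),
)
unchanged = generate_change_data(
    "persistent-to-persistent",
    "A behavior tendency",
    "A behavior tendency",
    datetime(2023, 9, 22, 11, 0),
)
```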
For example:
assume that a large data set is provided that contains network behavior data of the user. A number of network behavior combinations and corresponding behavior class identification combinations may be extracted from the data set. For example, the first 100 network behavior combinations may be selected, each consisting of two network behaviors, and behavior class identifications for each of these combinations are extracted. Based on the extracted network behavior combinations and behavior category identifications, an analysis may be performed to derive behavior relationship identifications. By analyzing the first network behavior feature, the first network behavior category feature, the second network behavior feature, and the second network behavior category feature in each network behavior combination, a corresponding behavior relationship identification can be derived. For example, if a first network action in a certain network action combination is publishing network media data-browsing related network media data (i.e., a network action may include a plurality of network actions or network operations, and in particular, whether it belongs to an action may be determined based on continuity between actions or operations), the corresponding action category identification may be "publish-browse", and a second network action is publishing network media data-exit platform, and the corresponding action category identification may be "publish-exit". As such, the corresponding behavior relationship identification may be "information concerning release-information not concerning release" or "persistent behavior-non-persistent behavior". 
That is, the behavior relation identifier can reflect the change in the user's behavior tendency toward the published information, and the user can be subjected to corresponding information management and control on that basis. For example, related information may be pushed to the user before the change and no longer pushed after it, which reduces, to a certain extent, the resource waste caused by pushing related information. Conversely, if the corresponding behavior relation is identified as "persistent behavior-persistent behavior", the behavior tendency has not changed; if the corresponding behavior relation is identified as "non-persistent behavior", the behavior tendency has changed, and the pushing of related information can be adjusted relative to the previous pushing, and so on.
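The change detection and push control described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `TendencyChange`, `detect_change` and `should_push`, the representation of a behavior relation identifier as a pair of tendency labels, and the push policy itself are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical tendency labels, taken from the example in the text.
PERSISTENT = "persistent behavior"
NON_PERSISTENT = "non-persistent behavior"


@dataclass
class TendencyChange:
    """Illustrative change record for one network behavior combination."""
    before: str
    after: str
    timestamp: Optional[str] = None  # optional time of the change, if known


def detect_change(relation: Tuple[str, str],
                  timestamp: Optional[str] = None) -> Optional[TendencyChange]:
    """Return a change record when the two tendencies differ, else None."""
    before, after = relation
    if before == after:
        return None  # e.g. persistent -> persistent: tendency unchanged
    return TendencyChange(before, after, timestamp)


def should_push(relation: Tuple[str, str]) -> bool:
    """Assumed policy: push related information only while the latest
    tendency is persistent."""
    return relation[1] == PERSISTENT
```

Under this assumed policy, a relation of (persistent, non-persistent) yields a change record and stops further pushing, while (persistent, persistent) yields no change record.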
Based on the foregoing, that is, the foregoing steps S110 to S130, since the analysis and recognition of the behavior relationship are performed on the network behavior combination, the data size or complexity of the object to be analyzed and recognized is relatively reduced, so that the efficiency of mining network behavior information can be improved to a certain extent, and the problem that the mining efficiency in the prior art is relatively low is solved.
Optionally, in some embodiments, the step S110 may further include the following specific implementable contents:
extracting a second number of network behaviors and a corresponding second number of behavior type identifications from the network big data set, wherein the second number of behavior type identifications comprises the behavior type identification of each network behavior in the second number of network behaviors, and each network behavior in the second number of network behaviors is one network behavior in the network big data set; for example, network behavior 1 corresponds to behavior type identification 1, network behavior 2 corresponds to behavior type identification 2, network behavior 3 corresponds to behavior type identification 3, and so on;
combining the second number of network behaviors in pairs to form a first number of network behavior combinations, wherein each network behavior combination in the first number of network behavior combinations comprises two paired network behaviors from the second number of network behaviors, the network behavior distributed in front of the two paired network behaviors serves as the first network behavior, and the network behavior distributed behind serves as the second network behavior; for example, network behavior 1 and network behavior 2 can form network behavior combination 1, in which network behavior 1 is the first network behavior and network behavior 2 is the second network behavior, and network behavior 1 and network behavior 3 can form network behavior combination 2, in which network behavior 1 is the first network behavior and network behavior 3 is the second network behavior;
extracting, from the second number of behavior type identifications, the behavior type identifications of the first network behavior and the second network behavior included in each of the first number of network behavior combinations to form a first number of behavior type identification combinations; continuing the foregoing example, network behavior combination 1 corresponds to behavior type identification combination 1, which includes behavior type identification 1 of network behavior 1 and behavior type identification 2 of network behavior 2, and network behavior combination 2 corresponds to behavior type identification combination 2, which includes behavior type identification 1 of network behavior 1 and behavior type identification 3 of network behavior 3.
For example:
assume that the following network behavior and behavior class identification are selected from the network big data set:
network behavior data:
publishing network media data-browsing related network media data;
praising network media data-sharing network media data;
commenting on network media data-replying to comments;
behavior type identification:
publish-browse;
praise-share;
comment-reply;
combining the second number of network behaviors in pairs to form a first number of network behavior combinations:
Based on the above network behavior data, the following network behavior combinations may be formed:
network behavior combination 1: publishing network media data-browsing related network media data VS praising network media data-sharing network media data;
network behavior combination 2: publishing network media data-browsing related network media data VS commenting on network media data-replying to comments;
in network behavior combination 1, "publishing network media data-browsing related network media data" serves as the first network behavior, and "praising network media data-sharing network media data" serves as the second network behavior. In network behavior combination 2, "publishing network media data-browsing related network media data" serves as the first network behavior, and "commenting on network media data-replying to comments" serves as the second network behavior.
Extracting behavior type identifiers of network behaviors included in each network behavior combination in the first number of network behavior combinations from the second number of behavior type identifiers to form a first number of behavior type identifier combinations:
based on the behavior category identification described above, the following behavior category identification combinations may be formed:
behavior category identification combination 1: publish-browse VS praise-share;
behavior category identification combination 2: publish-browse VS comment-reply;
in network behavior combination 1, "publishing network media data-browsing related network media data" corresponds to "publish-browse", and "praising network media data-sharing network media data" corresponds to "praise-share". In network behavior combination 2, "publishing network media data-browsing related network media data" corresponds to "publish-browse", and "commenting on network media data-replying to comments" corresponds to "comment-reply".
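The pairing in this example can be sketched as follows. The function name `build_combinations` and the choice to take the first pairs in index order are assumptions made for illustration; the text only requires that the earlier behavior of a pair be the first network behavior.

```python
from itertools import combinations, islice

# Second number of network behaviors and their type identifications,
# copied from the example above.
behaviors = [
    "publishing network media data-browsing related network media data",
    "praising network media data-sharing network media data",
    "commenting on network media data-replying to comments",
]
type_ids = ["publish-browse", "praise-share", "comment-reply"]


def build_combinations(behaviors, type_ids, first_number):
    """Pair behaviors in order; the earlier one is the first network behavior.

    Returns the network behavior combinations and the matching behavior
    type identification combinations, both limited to first_number pairs.
    """
    index_pairs = islice(combinations(range(len(behaviors)), 2), first_number)
    behavior_combos, type_id_combos = [], []
    for i, j in index_pairs:
        behavior_combos.append((behaviors[i], behaviors[j]))
        type_id_combos.append((type_ids[i], type_ids[j]))
    return behavior_combos, type_id_combos


behavior_combos, type_id_combos = build_combinations(behaviors, type_ids, 2)
```

With a first number of 2, this yields the pairs (behavior 1, behavior 2) and (behavior 1, behavior 3), matching combinations 1 and 2 of the example.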
Optionally, in some embodiments, the step S120 may further include the following specific implementable contents:
mining out network behavior characteristics of a first number of first network behaviors in the first number of network behavior combinations to obtain first number of first network behavior characteristics, and mining out network behavior characteristics of a first number of second network behaviors in the first number of network behavior combinations to obtain first number of second network behavior characteristics, for example, a first network behavior 1 corresponds to a first network behavior characteristic 1, and a first network behavior 2 corresponds to a first network behavior characteristic 2;
mining out category identification features of the category identification of the first number of first network behaviors in the first number of behavior category identification combinations to obtain a first number of first network behavior category features, and mining out category identification features of the category identification of the first number of second network behaviors in the first number of behavior category identification combinations to obtain a first number of second network behavior category features, for example, the category identification of the first network behavior 1 corresponds to the first network behavior category feature 1, and the category identification of the first network behavior 2 corresponds to the first network behavior category feature 2;
The first number of first network behavior features, the first number of first network behavior type features, the first number of second network behavior features and the first number of second network behavior type features are aggregated to form a first number of network behavior combination features corresponding to the first number of network behavior combinations, for example, the first network behavior features, the first network behavior type features, the second network behavior features and the second network behavior type features corresponding to the network behavior combination 1 can be aggregated to obtain a network behavior combination feature 1, and the first network behavior features, the first network behavior type features, the second network behavior features and the second network behavior type features corresponding to the network behavior combination 2 are aggregated to obtain a network behavior combination feature 2, so that the network behavior combination features can carry more information and have rich semantics;
analyzing a first number of behavior relation identifications of the first number of network behavior combinations based on the first number of network behavior combination characteristics, wherein each behavior relation identification of the first number of behavior relation identifications is used for reflecting behavior relation content between the first network behavior and the second network behavior in a corresponding one of the first number of network behavior combinations, for example, analyzing a behavior relation identification 1 corresponding to the network behavior combination 1 based on the network behavior combination characteristics 1, and analyzing a behavior relation identification 2 corresponding to the network behavior combination 2 based on the network behavior combination characteristics 2.
Optionally, in some embodiments, each behavior relation identifier in the first number of behavior relation identifiers reflects that the behavior relation between the first network behavior and the second network behavior is one of a predetermined third number of behavior relations, where the third number of behavior relations includes a fourth number of behavior association relationships and a fourth number of behavior non-association relationships, and the difference between the third number of behavior relationships and the fourth number of behavior non-association relationships is equal to 1. The behavior association relationships may be the foregoing "persistent behavior-non-persistent behavior", "non-persistent behavior", or "persistent behavior-persistent behavior". A behavior non-association relationship may be, for example, the relationship between the network behavior "publishing network media data-browsing related network media data" and the network behavior "performing instant interaction (such as text or voice chat) with user A": there is no association between the two network behaviors, so the behavior relation identifier may be determined to be a behavior non-association relationship. In the training process of the corresponding neural network, the actual behavior relations of the training network behavior combinations cover each behavior relation in the third number of behavior relations, so that the neural network can learn the mapping between each training network behavior combination and the corresponding behavior relation, and behavior relations can be identified based on this mapping in application.
Optionally, in some embodiments, the step of mining out the network behavior features of the first number of first network behaviors in the first number of network behavior combinations to obtain the first number of first network behavior features, and mining out the network behavior features of the first number of second network behaviors in the first number of network behavior combinations to obtain the first number of second network behavior features may further include the following specific implementable contents:
mining out, based on the following operations, the network behavior feature of the a-th first network behavior in the first number of first network behaviors to obtain the a-th first network behavior feature, wherein the a-th first network behavior is one of the first number of first network behaviors (that is, each first network behavior in the first number of first network behaviors can be processed sequentially or in parallel):
if the a-th first network behavior comprises a fifth number of behavior description words, determining the word embedding feature of each of the fifth number of behavior description words to obtain a fifth number of word embedding features, wherein the fifth number of behavior description words can be formed by performing word segmentation on the behavior description text of the a-th first network behavior, and the word embedding feature of each behavior description word can be obtained by performing word embedding processing on that behavior description word;
fusing the fifth number of word embedding features and outputting the a-th first network behavior feature; for example, the fifth number of word embedding features may be cascaded (spliced) to obtain the a-th first network behavior feature, such as {word embedding feature 1, word embedding feature 2, word embedding feature 3, word embedding feature 4, word embedding feature 5}.
And mining out, based on the following operations, the network behavior feature of the a-th second network behavior in the first number of second network behaviors to obtain the a-th second network behavior feature, wherein the a-th second network behavior is one of the first number of second network behaviors:
if the a-th second network behavior comprises a sixth number of behavior description words, determining the word embedding feature of each of the sixth number of behavior description words to obtain a sixth number of word embedding features, as described above;
and fusing the sixth number of word embedding features and outputting the a-th second network behavior feature, as described above.
For example:
assume that there are four word embedding feature vectors, respectively:
word embedding feature of "time": [0.123, 0.456, 0.789, ..., 0.987];
word embedding feature of "release": [0.246, 0.852, 0.394, ..., 0.761];
word embedding feature of "network media data": [0.753, 0.159, 0.468, ..., 0.624];
word embedding feature of "browsed": [0.632, 0.971, 0.185, ..., 0.524];
the embedded features of these words are spliced, that is, connected in sequence into a new feature vector, as follows:
first network behavior feature = [0.123, 0.456, 0.789, ..., 0.987, 0.246, 0.852, 0.394, ..., 0.761, 0.753, 0.159, 0.468, ..., 0.624, 0.632, 0.971, 0.185, ..., 0.524];
thus, by stitching the embedded features of the words, a feature representation of the first network behavior may be obtained. The feature vector contains embedded features for each word and the order relationship between them, which can be used for further data analysis.
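The splicing in this example can be sketched in a few lines. The embedding table below keeps only the four components printed for each word (the elided middle components are dropped), and the names `EMBEDDINGS` and `concat_embeddings` are assumptions for illustration.

```python
from typing import Dict, List

# Toy table: only the four printed components of each example vector.
EMBEDDINGS: Dict[str, List[float]] = {
    "time": [0.123, 0.456, 0.789, 0.987],
    "release": [0.246, 0.852, 0.394, 0.761],
    "network media data": [0.753, 0.159, 0.468, 0.624],
    "browsed": [0.632, 0.971, 0.185, 0.524],
}


def concat_embeddings(words: List[str],
                      table: Dict[str, List[float]]) -> List[float]:
    """Splice the word embeddings in word order into one feature vector."""
    feature: List[float] = []
    for word in words:
        feature.extend(table[word])
    return feature


first_behavior_feature = concat_embeddings(
    ["time", "release", "network media data", "browsed"], EMBEDDINGS
)
```

Because the vectors are appended in word order, the position of each block in the result encodes the order of the behavior description words.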
Optionally, in some embodiments, the step of fusing the fifth number of word embedding features and outputting the a-th first network behavior feature includes:
sorting the fifth number of word embedding features according to the order in which the corresponding behavior description words appear in the behavior description text of the a-th first network behavior, to form a word embedding feature sequence;
Traversing each word embedding feature in the word embedding feature sequence in sequence;
for the currently traversed word embedding feature, when it is any word embedding feature other than the first word embedding feature, acquiring the target word feature corresponding to each other word embedding feature located before it in the word embedding feature sequence, and combining them to form the related word embedding feature sequence corresponding to the currently traversed word embedding feature, wherein the related word embedding feature sequence of the first word embedding feature comprises a single word embedding feature, namely the first word embedding feature itself;
splicing all relevant word embedding features in the relevant word embedding feature sequence to form spliced word embedding features corresponding to the word embedding features traversed currently, and carrying out pooling processing on the spliced word embedding features to obtain pooled word embedding features, wherein the pooled word embedding features are consistent with the feature sizes and dimensions of the word embedding features traversed currently;
respectively carrying out convolution processing on the currently traversed word embedding feature and the pooled word embedding feature, and outputting a corresponding first word convolution feature and a corresponding second word convolution feature, wherein the first word convolution feature corresponds to the currently traversed word embedding feature and the second word convolution feature corresponds to the pooled word embedding feature; the convolution processing can be realized through a convolutional neural network that has been trained to convergence, and in this embodiment the other neural networks are likewise trained to convergence, the specific training process not being specifically limited;
Multiplying the transposed feature of the first word convolution feature and the second word convolution feature to output a corresponding similarity parameter distribution, weighting the first word convolution feature based on the similarity parameter distribution, and outputting a corresponding correlation feature;
fusing, for example by splicing, the correlation feature and the first word convolution feature, outputting the corresponding associated mining feature, superposing the pooling result of the associated mining feature on the currently traversed word embedding feature, and outputting the target word feature corresponding to the currently traversed word embedding feature, wherein the pooling result of the associated mining feature and the currently traversed word embedding feature have the same feature dimension and feature size;
and after obtaining the target word feature corresponding to the last word embedding feature, taking that target word feature as the a-th first network behavior feature.
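The traversal above can be sketched in plain Python under several simplifying assumptions that are not taken from the text: a fixed smoothing kernel stands in for the trained convolutional network, average pooling is used for both pooling steps, the similarity distribution is a row-wise softmax, and the related sequence of a non-first word is read as the target word features of the earlier words only. All function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def adaptive_avg_pool(x: Vector, out_len: int) -> Vector:
    """Average-pool x down to out_len values (len(x) must be a multiple)."""
    if len(x) % out_len != 0:
        raise ValueError("input length must be a multiple of out_len")
    step = len(x) // out_len
    return [sum(x[i * step:(i + 1) * step]) / step for i in range(out_len)]


def conv1d_same(x: Vector, kernel: Sequence[float]) -> Vector:
    """Sliding-window filter with zero padding; output keeps len(x)."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(x) + [0.0] * half
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(x))]


def softmax(row: Vector) -> Vector:
    """Numerically stable softmax over one row."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_word_embeddings(embeddings: List[Vector],
                         kernel: Sequence[float] = (0.25, 0.5, 0.25)) -> Vector:
    """Return the target word feature of the last word embedding."""
    dim = len(embeddings[0])
    targets: List[Vector] = []
    for current in embeddings:
        # Related sequence: earlier target features, or the first embedding itself.
        related = targets if targets else [current]
        spliced = [v for vec in related for v in vec]
        pooled = adaptive_avg_pool(spliced, dim)
        q = conv1d_same(current, kernel)  # first word convolution feature
        k = conv1d_same(pooled, kernel)   # second word convolution feature
        # Outer product q^T k is a dim x dim similarity matrix; normalise rows.
        sim = [softmax([qi * kj for kj in k]) for qi in q]
        # Weight the first word convolution feature by the similarity rows.
        corr = [sum(w * qj for w, qj in zip(row, q)) for row in sim]
        mined = adaptive_avg_pool(corr + q, dim)  # splice, then pool back to dim
        targets.append([c + m for c, m in zip(current, mined)])
    return targets[-1]
```

Unlike plain splicing, the output here has the same size as a single word embedding regardless of how many behavior description words there are, because each step pools back to the embedding size before the residual addition.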
Optionally, in some embodiments, the step of mining out the category identification feature of the category identification of the first number of first network behaviors in the first number of behavior category identification combinations to obtain a first number of first network behavior category features, and mining out the category identification feature of the category identification of the first number of second network behaviors in the first number of behavior category identification combinations to obtain a first number of second network behavior category features may further include the following specific implementable contents:
Mining the category identification feature of the behavior category identification of the a-th first network behavior in the first number of first network behaviors to obtain the a-th first network behavior category feature, wherein the a-th first network behavior belongs to one of the first number of first network behaviors (namely, each first network behavior is processed sequentially or in parallel):
if the behavior type identifier of the a-th first network behavior includes an a-th behavior type description text (such as "persistent behavior" or "non-persistent behavior"), determining the word embedding feature of the a-th behavior type description text (for example, word segmentation is performed on "persistent behavior" or "non-persistent behavior", the obtained words are each embedded, and the results are finally cascaded to obtain the corresponding word embedding feature), and determining the a-th first network behavior type feature to be the word embedding feature of the a-th behavior type description text;
and mining the category identification feature of the behavior category identification of the a-th second network behavior in the first number of second network behaviors to obtain the a-th second network behavior category feature, wherein the a-th second network behavior belongs to one of the first number of second network behaviors (namely, each second network behavior is processed sequentially or in parallel):
and if the behavior type identifier of the a-th second network behavior comprises an a-th behavior type description text, determining the word embedding feature of the a-th behavior type description text, and determining the a-th second network behavior type feature to be the word embedding feature of the a-th behavior type description text, as described above.
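As a rough illustration of turning a behavior type description text into a type feature, the sketch below segments the text, embeds each word and cascades the results. Both the whitespace split and the hash-based `toy_embedding` are stand-ins chosen for this example; a real system would use a proper word segmenter and a trained embedding model.

```python
import hashlib
from typing import List


def toy_embedding(word: str, dim: int = 4) -> List[float]:
    """Deterministic stand-in for a trained embedding lookup (dim <= 32)."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def category_feature(description: str, dim: int = 4) -> List[float]:
    """Segment the description text, embed each word, cascade the results."""
    words = description.split()  # whitespace split stands in for word segmentation
    feature: List[float] = []
    for word in words:
        feature.extend(toy_embedding(word, dim))
    return feature


first_type_feature = category_feature("persistent behavior")
```

The same function can be applied to the type description text of the second network behavior, so both type features are produced by one code path.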
Optionally, in some embodiments, the step of aggregating the first number of first network behavior features, the first number of first network behavior category features, the first number of second network behavior features, and the first number of second network behavior category features to form a first number of network behavior combination features corresponding to the first number of network behavior combinations may further include the following specific implementable contents:
forming an a-th network behavior combination feature corresponding to an a-th network behavior combination in the first number of network behavior combinations based on aggregation of an a-th first network behavior and an a-th second network behavior, wherein the a-th network behavior combination belongs to one of the first number of network behavior combinations (namely, each network behavior combination in the first number of network behavior combinations is sequentially or parallelly processed):
aggregating the a-th first network behavior feature, the a-th first network behavior type feature, the a-th second network behavior feature and the a-th second network behavior type feature to obtain the a-th network behavior combination feature, wherein the a-th first network behavior feature is used for representing the a-th first network behavior, the a-th first network behavior type feature is used for representing the behavior type identifier of the a-th first network behavior, the a-th second network behavior feature is used for representing the a-th second network behavior, and the a-th second network behavior type feature is used for representing the behavior type identifier of the a-th second network behavior; that is, the network behavior combination feature includes both the specific behavior semantic features and the behavior type features of the two network behaviors.
Optionally, in some embodiments, the step of aggregating the a-th first network behavior feature, the a-th first network behavior type feature, the a-th second network behavior feature and the a-th second network behavior type feature to obtain the a-th network behavior combination feature may further include the following specific implementable contents:
aggregating the a-th first network behavior feature, the a-th first network behavior type feature, the a-th second network behavior feature and the a-th second network behavior type feature in the direction of feature dimension to form an a-th network behavior combination feature; or alternatively
determining a set first coordinate embedding feature, which is used for characterizing the set first coordinate (position) of the network big data set; and aggregating the set first coordinate embedding feature, the a-th first network behavior feature, the a-th first network behavior type feature, the a-th second network behavior feature and the a-th second network behavior type feature in the direction of the feature dimension to form the a-th network behavior combination feature, wherein the network behaviors in the first number of network behavior combinations are network behaviors extracted from the network big data set.
To facilitate understanding of the above "aggregation in the direction of the feature dimension", examples are:
the feature dimension of the a-th first network behavior feature is 1×654, the feature dimension of the a-th first network behavior type feature is 1×654, the feature dimension of the a-th second network behavior feature is 1×654, and the feature dimension of the a-th second network behavior type feature is 1×654; the a-th first network behavior feature, the a-th first network behavior type feature, the a-th second network behavior feature and the a-th second network behavior type feature are cascaded (aggregated), and the feature dimension of the resulting a-th network behavior combination feature is 4×654; where the set first coordinate embedding feature also needs to be aggregated, it may likewise be mapped to 1×654.
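The dimension bookkeeping in this example can be checked with a short sketch. The helper name `aggregate_features` and the placeholder vectors are assumptions; real features would come from the mining steps described earlier.

```python
from typing import List, Sequence

FEATURE_DIM = 654  # width of every feature in the example


def aggregate_features(*features: Sequence[float]) -> List[List[float]]:
    """Stack equally sized 1 x FEATURE_DIM features into a k x FEATURE_DIM matrix."""
    for feature in features:
        if len(feature) != FEATURE_DIM:
            raise ValueError(f"expected {FEATURE_DIM} values, got {len(feature)}")
    return [list(feature) for feature in features]


# Placeholder vectors standing in for the four mined features.
first_behavior = [0.1] * FEATURE_DIM
first_type = [0.2] * FEATURE_DIM
second_behavior = [0.3] * FEATURE_DIM
second_type = [0.4] * FEATURE_DIM

combo_feature = aggregate_features(
    first_behavior, first_type, second_behavior, second_type
)
print(len(combo_feature), len(combo_feature[0]))  # prints: 4 654
```

Rejecting features of a different width up front keeps the stacked matrix rectangular, which is why any additional feature has to be mapped to 1×654 before it is aggregated.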
Optionally, in some embodiments, the step of analyzing the first number of behavior relation identifications of the first number of network behavior combinations based on the first number of network behavior combination features may further include the following specific implementable contents:
analyzing an a-th behavior relation identification of an a-th network behavior combination in the first number of network behavior combinations according to an a-th network behavior combination feature in the first number of network behavior combination features, wherein the a-th network behavior combination feature belongs to one of the first number of network behavior combination features (namely, each network behavior combination feature in the first number of network behavior combination features is sequentially or parallelly subjected to subsequent identification processing):
loading the a-th network behavior combination feature into a behavior relation identification network and identifying a third number of behavior relation identification data corresponding to the third number of predetermined behavior relation identifications, wherein the third number of behavior relation identification data reflects the possibility that the a-th behavior relation identification is each of the third number of predetermined behavior relation identifications; illustratively, the behavior relation identification network can apply a fully connected layer to the a-th network behavior combination feature to obtain a corresponding fully connected feature, and the fully connected feature can then be processed by a classification function such as the softmax function to obtain a parameter distribution, which can include the possibility (for example, a specific value between 0 and 1) that the a-th behavior relation identification is each of the third number of predetermined behavior relation identifications;
marking the a-th behavior relation identifier as a target behavior relation identifier among the third number of predetermined behavior relation identifiers, wherein, in the third number of behavior relation identification data, the possibility that the a-th behavior relation identifier is the target behavior relation identifier is the highest.
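The identification step (fully connected layer, softmax, then selection of the most likely identifier) can be sketched as below. The label set, the class name `RelationClassifier` and the randomly initialised weights are all hypothetical; in the described method the weights would come from a network trained to convergence.

```python
import math
import random
from typing import List, Sequence, Tuple

# Hypothetical label set standing in for the "third number" of relations.
RELATION_LABELS = [
    "persistent behavior-persistent behavior",
    "persistent behavior-non-persistent behavior",
    "non-persistent behavior-persistent behavior",
    "no association",
]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class RelationClassifier:
    """One fully connected layer followed by softmax (untrained sketch)."""

    def __init__(self, in_dim: int, labels: Sequence[str], seed: int = 0):
        rng = random.Random(seed)
        self.labels = list(labels)
        self.in_dim = in_dim
        # Random weights are placeholders for trained parameters.
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                        for _ in self.labels]
        self.bias = [0.0] * len(self.labels)

    def predict(self, combo_feature: Sequence[Sequence[float]]
                ) -> Tuple[str, List[float]]:
        """Return the most likely relation label and the full distribution."""
        flat = [v for row in combo_feature for v in row]
        if len(flat) != self.in_dim:
            raise ValueError("combination feature has unexpected size")
        logits = [sum(w * x for w, x in zip(ws, flat)) + b
                  for ws, b in zip(self.weights, self.bias)]
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        return self.labels[best], probs
```

For a 4×654 combination feature, `in_dim` would be 4 * 654 = 2616; the returned distribution holds one value between 0 and 1 per predetermined relation identifier, and the label with the highest value is taken as the target.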
With reference to fig. 3, the embodiment of the invention further provides an information mining device based on a big data set, which can be applied to the information mining system based on the big data set. Wherein, the big data set based information mining apparatus may include:
the network behavior data extraction module is used for extracting a first number of network behavior combinations and a first number of behavior type identification combinations from a network big data set, wherein each network behavior combination in the first number of network behavior combinations comprises a first network behavior and a second network behavior whose behavior relation identification needs to be identified, the first number of behavior type identification combinations comprise the behavior type identification of each first network behavior and of each second network behavior in the first number of network behavior combinations, each network behavior combination comprises a first network behavior distributed in front and a second network behavior distributed behind, each first network behavior and each second network behavior belongs to one network behavior in the network big data set, and each network behavior in the network big data set is obtained by carrying out network monitoring on the target network user;
The behavior relation analysis module is used for analyzing a first number of behavior relation identifications of the first number of network behavior combinations based on a first number of first network behavior characteristics, a first number of first network behavior type characteristics, a first number of second network behavior characteristics and a first number of second network behavior type characteristics, wherein the first number of first network behavior characteristics are used for representing the first number of first network behaviors in the first number of network behavior combinations, the first number of first network behavior type characteristics are used for representing behavior type identifications of the first number of first network behaviors, the first number of second network behavior characteristics are used for representing the first number of second network behaviors in the first number of network behavior combinations, and the first number of second network behavior type characteristics are used for representing the behavior type identifications of the first number of second network behaviors;
the network behavior change monitoring module is configured to generate network behavior information change data of the target network user if a set of behavior relation identifiers in the first number of behavior relation identifiers reflects that the network behavior information of the target network user has changed.
In summary, according to the information mining method and system based on the big data set provided by the invention, the first number of network behavior combinations and the first number of behavior type identification combinations in the network big data set are extracted; the first number of behavior relation identifiers of the first number of network behavior combinations are analyzed based on the first number of first network behavior features, the first number of first network behavior type features, the first number of second network behavior features and the first number of second network behavior type features; and if a set of behavior relation identifiers in the first number of behavior relation identifiers reflects that the network behavior information of the target network user has changed, corresponding network behavior information change data is generated. Because the behavior relation is analyzed and identified on network behavior combinations, the data size or complexity of the object to be analyzed and identified is relatively reduced, so that the efficiency of mining network behavior information can be improved to a certain extent, and the problem of relatively low mining efficiency in the prior art is alleviated.
The above description covers only the preferred embodiments of the present invention and is not intended to limit the present invention; various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An information mining method based on a big data set, characterized by comprising the following steps:
extracting a first number of network behavior combinations and a first number of behavior type identification combinations from a network big data set, wherein each network behavior combination in the first number of network behavior combinations comprises a first network behavior and a second network behavior between which a behavior relation identification is to be identified, the first number of behavior type identification combinations comprise the behavior type identification of each first network behavior and the behavior type identification of each second network behavior in the first number of network behavior combinations, each network behavior combination comprises a first network behavior distributed in the front and a second network behavior distributed in the back, each first network behavior belongs to one network behavior in the network big data set, and each network behavior in the network big data set is obtained by carrying out network monitoring on a target network user;
analyzing a first number of behavior relation identifications of the first number of network behavior combinations based on a first number of first network behavior features, a first number of first network behavior type features, a first number of second network behavior features, and a first number of second network behavior type features, wherein the first number of first network behavior features are used for characterizing a first number of first network behaviors in the first number of network behavior combinations, the first number of first network behavior type features are used for characterizing behavior type identifications of the first number of first network behaviors, the first number of second network behavior features are used for characterizing a first number of second network behaviors in the first number of network behavior combinations, and the first number of second network behavior type features are used for characterizing behavior type identifications of the first number of second network behaviors;
and, if it is determined according to a group of behavior relation identifications in the first number of behavior relation identifications that the network behavior information of the target network user has changed, generating network behavior information change data of the target network user.
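The final step of claim 1 can be illustrated with a minimal sketch. The claim does not state how a change is determined from a group of behavior relation identifications, so the comparison against a previously recorded group, the `ChangeRecord` type, the `detect_changes` function and all label names below are assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical label used when a behavior pair had no recorded relation before.
NO_ASSOCIATION = "no_association"


@dataclass(frozen=True)
class ChangeRecord:
    """One item of network behavior information change data (assumed shape)."""
    user_id: str
    pair: tuple          # (first network behavior, second network behavior)
    old_relation: str
    new_relation: str


def detect_changes(user_id, previous, current):
    """Compare two {pair: relation identifier} maps and list the differences.

    Assumption: a "change" means that the relation identifier of a behavior
    pair differs from the one recorded in an earlier group.
    """
    changes = []
    for pair, new_relation in current.items():
        old_relation = previous.get(pair, NO_ASSOCIATION)
        if old_relation != new_relation:
            changes.append(ChangeRecord(user_id, pair, old_relation, new_relation))
    return changes


previous = {("login", "download"): "sequential"}
current = {
    ("login", "download"): "causal",
    ("download", "logout"): "sequential",
}
changes = detect_changes("user-1", previous, current)
```

In this sketch an unchanged group yields an empty list, so change data is generated only when at least one identifier differs.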
2. The big data set based information mining method of claim 1, wherein the step of extracting a first number of network behavior combinations and a first number of behavior type identification combinations from the network big data set comprises:
extracting a second number of network behaviors and a second number of behavior type identifiers which are in one-to-one correspondence from a network big data set, wherein the second number of behavior type identifiers comprise behavior type identifiers of each network behavior in the second number of network behaviors, and each network behavior in the second number of network behaviors belongs to one network behavior in the network big data set;
combining pairs of network behaviors among the second number of network behaviors to form a first number of network behavior combinations, wherein each network behavior combination in the first number of network behavior combinations comprises two paired network behaviors in the second number of network behaviors, the network behavior distributed in the front of the two paired network behaviors serves as the first network behavior, and the network behavior distributed in the back serves as the second network behavior;
and extracting the behavior type identifications of the first network behaviors and the second network behaviors included in each network behavior combination in the first number of network behavior combinations from the second number of behavior type identifications to form a first number of behavior type identification combinations.
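The pairing described in claim 2 can be sketched as follows. The claim does not say which pairs are selected, so forming every ordered pair (earlier behavior first, later behavior second) is an assumption, and `build_combinations` and the sample behaviors are hypothetical names used only for illustration.

```python
from itertools import combinations


def build_combinations(behaviors, type_ids):
    """Pair behaviors and their type identifications.

    behaviors: network behaviors in the order they occur in the data set.
    type_ids:  behavior type identifications, one per behavior (same order).
    Returns (behavior pairs, type identification pairs), where the element
    distributed in the front is always the first member of each pair.
    """
    if len(behaviors) != len(type_ids):
        raise ValueError("behaviors and type_ids must correspond one-to-one")
    pairs, type_pairs = [], []
    # combinations() yields index pairs (i, j) with i < j, preserving order.
    for i, j in combinations(range(len(behaviors)), 2):
        pairs.append((behaviors[i], behaviors[j]))
        type_pairs.append((type_ids[i], type_ids[j]))
    return pairs, type_pairs


behaviors = ["open login page", "submit password", "download file"]
type_ids = ["browse", "authenticate", "transfer"]
pairs, type_pairs = build_combinations(behaviors, type_ids)
```

With three behaviors (the second number), this sketch produces three combinations (the first number); a real system might instead keep only adjacent pairs or pairs within a time window.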
3. The big data set based information mining method according to claim 1 or 2, wherein the step of analyzing the first number of behavior relation identifications of the first number of network behavior combinations based on the first number of first network behavior features, the first number of first network behavior category features, the first number of second network behavior features and the first number of second network behavior category features comprises:
mining network behavior characteristics of a first number of first network behaviors in the first number of network behavior combinations to obtain first number of first network behavior characteristics, and mining network behavior characteristics of a first number of second network behaviors in the first number of network behavior combinations to obtain first number of second network behavior characteristics;
mining out the category identification features of the behavior category identifications of the first number of first network behaviors in the first number of behavior category identification combinations to obtain the first number of first network behavior category features, and mining out the category identification features of the behavior category identifications of the first number of second network behaviors in the first number of behavior category identification combinations to obtain the first number of second network behavior category features;
aggregating the first number of first network behavior features, the first number of first network behavior category features, the first number of second network behavior features, and the first number of second network behavior category features to form a first number of network behavior combination features corresponding to the first number of network behavior combinations;
based on the first number of network behavior combination characteristics, a first number of behavior relation identifications of the first number of network behavior combinations are analyzed, and each behavior relation identification in the first number of behavior relation identifications is used for reflecting behavior relation content between the first network behavior and the second network behavior in a corresponding network behavior combination in the first number of network behavior combinations.
4. The big data set based information mining method of claim 3, wherein each of the first number of behavior relation identifications reflects one of a third number of behavior relations between the first network behavior and the second network behavior, the third number of behavior relations comprising a fourth number of behavior associations and a no-behavior-association relation, a difference between the third number and the fourth number being equal to 1.
5. The method for mining information based on big data set as recited in claim 3, wherein the step of mining out the network behavior features of the first number of first network behaviors in the first number of network behavior combinations to obtain the first number of first network behavior features, and mining out the network behavior features of the first number of second network behaviors in the first number of network behavior combinations to obtain the first number of second network behavior features, comprises:
mining network behavior characteristics of an a-th first network behavior in the first number of first network behaviors based on the following operation to obtain the a-th first network behavior characteristics, wherein the a-th first network behavior belongs to one of the first number of first network behaviors:
if the a-th first network behavior comprises a fifth number of behavior description words, determining word embedding characteristics of each behavior description word in the fifth number of behavior description words to obtain a fifth number of word embedding characteristics;
fusing the fifth number of word embedding features and outputting the a-th first network behavior feature;
and mining out the network behavior characteristics of the a-th second network behavior in the first number of second network behaviors based on the following operation, so as to obtain the a-th second network behavior characteristics, wherein the a-th second network behavior belongs to one of the first number of second network behaviors:
if the a-th second network behavior comprises a sixth number of behavior description words, determining word embedding characteristics of each behavior description word in the sixth number of behavior description words to obtain a sixth number of word embedding features;
and fusing the sixth number of word embedding features and outputting the a-th second network behavior feature.
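The word embedding and fusion operations of claim 5 can be sketched as follows. The claim names neither an embedding model nor a fusion rule, so the hash-based toy embedding (`word_embedding`), the mean-pooling fusion (`behavior_feature`) and the dimension `DIM` are assumptions chosen only to keep the example self-contained.

```python
import hashlib

DIM = 8  # assumed embedding dimension


def word_embedding(word, dim=DIM):
    """Toy deterministic embedding: the first `dim` bytes of a SHA-256 digest,
    scaled to [0, 1]. A stand-in for a trained word embedding model."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def behavior_feature(description, dim=DIM):
    """Embed each behavior description word, then fuse by mean pooling."""
    words = description.split()
    if not words:
        return [0.0] * dim
    vectors = [word_embedding(w, dim) for w in words]
    # zip(*vectors) groups the values of each dimension across all words.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


first_feature = behavior_feature("open login page")
second_feature = behavior_feature("download file")
```

Mean pooling keeps the output dimension fixed regardless of how many description words a behavior has (the fifth or sixth number), which is why it is used here; attention-based or recurrent fusion would be equally consistent with the claim wording.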
6. The method for mining information based on a big data set according to claim 3, wherein the step of mining out category identification features of the category identification of the first number of first network behaviors in the first number of category identification combinations to obtain a first number of first network behavior category features, and mining out category identification features of the category identification of the first number of second network behaviors in the first number of category identification combinations to obtain a first number of second network behavior category features, comprises:
mining the category identification feature of the behavior category identification of the a-th first network behavior in the first number of first network behaviors based on the following operation to obtain the a-th first network behavior category feature, wherein the a-th first network behavior belongs to one of the first number of first network behaviors:
if the behavior category identification of the a-th first network behavior comprises an a-th behavior category description text, determining word embedding characteristics of the a-th behavior category description text, and determining the a-th first network behavior category feature as the word embedding characteristics of the a-th behavior category description text;
and mining the category identification feature of the behavior category identification of the a-th second network behavior in the first number of second network behaviors based on the following operation to obtain the a-th second network behavior category feature, wherein the a-th second network behavior belongs to one of the first number of second network behaviors:
and if the behavior type identifier of the a-th second network behavior comprises an a-th behavior type description text, determining word embedding characteristics of the a-th behavior type description text, and determining the a-th second network behavior type feature as the word embedding characteristics of the a-th behavior type description text.
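In claim 6 the type feature is taken directly as the embedding of the type description text, which can be pictured as a lookup table. The following sketch assumes such a table; `TypeEmbeddingTable`, its randomly initialised vectors and the sample type descriptions are hypothetical and stand in for whatever trained embedding the method would actually use.

```python
import random


class TypeEmbeddingTable:
    """Maps each distinct behavior type description text to one vector."""

    def __init__(self, dim=4, seed=0):
        self.dim = dim
        self._rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
        self._table = {}

    def lookup(self, type_text):
        """Return the embedding of a type description, creating it on first use."""
        if type_text not in self._table:
            self._table[type_text] = [
                self._rng.uniform(-1.0, 1.0) for _ in range(self.dim)
            ]
        return self._table[type_text]


table = TypeEmbeddingTable()
first_type_feature = table.lookup("authenticate")   # type of the first behavior
second_type_feature = table.lookup("transfer")      # type of the second behavior
```

Because the same table serves both the first and the second network behavior, two behaviors with the same type description receive the same type feature.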
7. The method for mining information based on big data set as recited in claim 3, wherein the step of aggregating the first number of first network behavior features, the first number of first network behavior category features, the first number of second network behavior features and the first number of second network behavior category features to form a first number of network behavior combination features corresponding to the first number of network behavior combinations comprises:
forming, based on the following operation, an a-th network behavior combination feature corresponding to an a-th network behavior combination in the first number of network behavior combinations, wherein the a-th network behavior combination comprises an a-th first network behavior and an a-th second network behavior and belongs to one of the first number of network behavior combinations:
aggregating the a-th first network behavior feature, the a-th first network behavior type feature, the a-th second network behavior feature and the a-th second network behavior type feature to obtain the a-th network behavior combination feature, wherein the a-th first network behavior feature is used for representing the a-th first network behavior, the a-th first network behavior type feature is used for representing the behavior type identification of the a-th first network behavior, the a-th second network behavior feature is used for representing the a-th second network behavior, and the a-th second network behavior type feature is used for representing the behavior type identification of the a-th second network behavior.
8. The method for mining information based on big data set according to claim 7, wherein the step of aggregating the a-th first network behavior feature, the a-th first network behavior category feature, the a-th second network behavior feature and the a-th second network behavior category feature to obtain the a-th network behavior combination feature comprises:
determining a first set coordinate embedding feature, wherein the first set coordinate embedding feature is used for characterizing the set coordinates of the network big data set;
and aggregating the first set coordinate embedding feature, the a-th first network behavior feature, the a-th first network behavior type feature, the a-th second network behavior feature and the a-th second network behavior type feature in the direction of the feature dimension to form the a-th network behavior combination feature, wherein the network behaviors in the first number of network behavior combinations are network behaviors extracted from the network big data set.
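Aggregation "in the direction of the feature dimension" in claim 8 is read here as concatenation along the feature axis. That reading, the function name `combine_features` and the sample vectors are assumptions for illustration only.

```python
def combine_features(set_coordinate, first_feature, first_type_feature,
                     second_feature, second_type_feature):
    """Concatenate the five feature vectors along the feature dimension."""
    combined = []
    for part in (set_coordinate, first_feature, first_type_feature,
                 second_feature, second_type_feature):
        combined.extend(part)
    return combined


set_coordinate = [0.5]             # assumed embedding of the data set coordinate
first_feature = [0.1, 0.2]         # a-th first network behavior feature
first_type_feature = [1.0]         # a-th first network behavior type feature
second_feature = [0.3, 0.4]        # a-th second network behavior feature
second_type_feature = [0.0]        # a-th second network behavior type feature

combination_feature = combine_features(
    set_coordinate, first_feature, first_type_feature,
    second_feature, second_type_feature,
)
```

The length of the combination feature is the sum of the lengths of its parts, so a downstream network sees each part at a fixed offset.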
9. The method for mining information based on a large data set as recited in claim 3, wherein said step of analyzing a first number of behavior relation identifications of said first number of network behavior combinations based on said first number of network behavior combination features comprises:
analyzing, based on the following operations, an a-th behavior relation identification of an a-th network behavior combination of the first number of network behavior combinations according to an a-th network behavior combination feature of the first number of network behavior combination features, wherein the a-th network behavior combination feature belongs to one of the first number of network behavior combination features:
loading the a-th network behavior combination feature into a behavior relation identification network, and identifying a third number of behavior relation identification data corresponding to a third number of predetermined behavior relation identifiers, wherein the third number of behavior relation identification data are used for reflecting the possibility that the a-th behavior relation identifier is each behavior relation identifier in the third number of predetermined behavior relation identifiers;
marking the a-th behavior relation identifier as a target behavior relation identifier in the third number of predetermined behavior relation identifiers, wherein, among the third number of behavior relation identification data, the possibility that the a-th behavior relation identifier is the target behavior relation identifier is the highest.
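The identification step of claim 9 amounts to scoring every predetermined identifier and choosing the most likely one. The sketch below assumes a single linear layer followed by softmax as the behavior relation identification network; the claim does not specify the architecture, and `RELATION_LABELS`, `identify_relation`, the weights and the biases are hypothetical.

```python
import math

# Third number = 3 here: two assumed association kinds plus "no association".
RELATION_LABELS = ["no_association", "sequential", "causal"]


def softmax(scores):
    """Turn raw scores into possibilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def identify_relation(feature, weights, biases):
    """Score each predetermined identifier and pick the most likely one."""
    scores = [
        sum(w * x for w, x in zip(row, feature)) + b
        for row, b in zip(weights, biases)
    ]
    possibilities = softmax(scores)
    best = max(range(len(possibilities)), key=possibilities.__getitem__)
    return RELATION_LABELS[best], possibilities


feature = [1.0, 0.0]                      # toy network behavior combination feature
weights = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
biases = [0.0, 0.0, 0.0]
label, possibilities = identify_relation(feature, weights, biases)
```

The returned possibilities play the role of the third number of behavior relation identification data, and the selected label is the target behavior relation identifier.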
10. A big data set based information mining system, comprising a processor and a memory, the memory for storing a computer program, the processor for executing the computer program to implement the big data set based information mining method of any of claims 1-9.
CN202311228862.5A 2023-09-22 2023-09-22 Information mining method and system based on big data set Active CN116975300B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311228862.5A CN116975300B (en) 2023-09-22 2023-09-22 Information mining method and system based on big data set

Publications (2)

Publication Number Publication Date
CN116975300A CN116975300A (en) 2023-10-31
CN116975300B CN116975300B (en) 2024-01-26

Family

ID=88483528

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311228862.5A Active CN116975300B (en) 2023-09-22 2023-09-22 Information mining method and system based on big data set

Country Status (1)

Country Link
CN (1) CN116975300B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107426040A (en) * 2017-09-20 2017-12-01 华中科技大学 A kind of Forecasting Methodology of network behavior
CN108075944A (en) * 2016-11-16 2018-05-25 腾讯科技(深圳)有限公司 A kind of method for monitoring network and device
CN110113368A (en) * 2019-06-27 2019-08-09 电子科技大学 A kind of network behavior method for detecting abnormality based on sub-trajectory mode
CN112380299A (en) * 2020-12-08 2021-02-19 腾讯科技(深圳)有限公司 Relational network construction method, device and storage medium
CN115134329A (en) * 2022-06-29 2022-09-30 中国银行股份有限公司 Network behavior control method and device, electronic equipment and storage medium
CN115964461A (en) * 2022-12-29 2023-04-14 江苏永硕舟钰数据科技有限公司 Network data matching method and platform based on artificial intelligence and big data analysis
CN116304341A (en) * 2023-03-22 2023-06-23 国涛(菏泽牡丹区)网络科技有限责任公司 Fraud discrimination method and system based on user network big data

Also Published As

Publication number Publication date
CN116975300B (en) 2024-01-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant