CN114493937A - Analytical model construction method and device based on communication data - Google Patents


Info

Publication number: CN114493937A
Authority: CN (China)
Prior art keywords: data, communication, event, model, calculation
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Application number: CN202111553872.7A
Other languages: Chinese (zh)
Inventors: 蔡淋强, 谢伟超, 曾超, 林文彬, 叶礼让
Current Assignee: Xiamen Meiya Pico Information Co Ltd (the listed assignees may be inaccurate)
Original Assignee: Xiamen Meiya Pico Information Co Ltd
Application filed by: Xiamen Meiya Pico Information Co Ltd
Priority to: CN202111553872.7A
Publication of: CN114493937A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services; Handling legal documents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Abstract

The invention provides a method and a device for constructing an analysis model based on communication data. The method comprises: collecting communication data related to the clues of an event to be analyzed, extracting core attribute data from the communication data, classifying the core attribute data into multiple types, and taking each type as a calculation factor; presetting, for each calculation factor and according to the event clues, a corresponding model condition and proportional operation rule, and converting the data in each calculation factor that satisfy the model condition into a corresponding score through the corresponding proportional operation rule; performing a weighted accumulation of the scores corresponding to the calculation factors to obtain the total score of the analysis; and constructing an analysis model based on the communication data according to the above steps, so as to analyze the organization related to the event and the members of the organization. The invention can be applied to the communication-centered analysis of clues to the abnormal entry and exit of carried goods and to the study of other events; it greatly improves the analysis capability and working efficiency of staff and has broad application prospects.

Description

Analytical model construction method and device based on communication data
Technical Field
The invention relates to the technical field of electronic evidence obtaining, in particular to a method and a device for constructing an analysis model based on communication data.
Background
The abnormal entry and exit of carried goods is an international social phenomenon: wherever a country administers foreign trade and a difference exists between domestic and foreign markets, it inevitably occurs. The main technical means currently applied to it is to analyze communication counterparts based on call detail record data. However, the patterns and characteristics of the abnormal entry and exit of carried goods have not been mined and no data processing model has been established, so the quality of the analysis data is low, a large number of staff must be assigned to manual analysis and judgment, and working efficiency is greatly reduced.
The prior art has the following defects:
1. The technical means are insufficient and rely mainly on experience. Existing technical means all carry out analysis and judgment with analysis tools, for example using call detail record analysis software to analyze the communication relationships between numbers, or using bill analysis software to analyze the transfer relationships between bank cards. Judgment is made manually and depends mainly on the experience of intelligence analysts. The call record patterns of the abnormal entry and exit of carried goods have not been studied so as to extract key core data and establish a data analysis model that would improve the quality and efficiency of the analysts' work.
2. Although data can circulate routinely, its value is not mined and the analysis and judgment model is rigid. Existing analysis is aimed mainly at the entities of the call counterparts; the value of the data is not effectively mined, and the core attributes of the data are not abstracted to form an analysis and judgment model.
Based on a study of call record data relating to the abnormal entry and exit of carried goods, the invention provides a model construction method and device based on communication data.
Disclosure of Invention
The invention provides an analytical model construction method and device based on communication data, and aims to overcome the defects in the prior art.
In one aspect, the invention provides an analytical model construction method based on communication data, which comprises the following steps:
S1: collecting and parsing communication data related to the clues of an event to be analyzed, extracting core attribute data from the communication data, classifying the core attribute data into multiple types, and taking each type as a calculation factor;
S2: presetting, for each calculation factor and according to the event clues, a corresponding model condition and proportional operation rule, and converting the data in each calculation factor that satisfy the model condition into a corresponding score through the corresponding proportional operation rule;
S3: performing a weighted accumulation of the scores corresponding to the calculation factors to obtain the total score of the analysis;
S4: constructing an analysis model based on the communication data according to S1 to S3, and performing multiple analyses with the analysis model, so that the organization related to the event and its members are analyzed based on the per-factor scores and the total scores obtained from the multiple analyses.
The method parses communication record data and extracts core attribute data as elements, including time data, position data, main number data, object number data, communication type data, time continuation, and position relationship. Combining these with the patterns of activity in the abnormal entry and exit of carried goods, it establishes an automated analysis model for the abnormal entry and exit of carried goods. The model uses the elements of the communication record data as calculation factors. Each calculation factor is given its own model calculation and its own model condition percentage setting; the results for the model conditions are weighted and matched one by one against the organization role conditions set in the system, and role labels are automatically attached, according to the model conditions, to the clue numbers fed into the system. The analysis model constructed by the method greatly improves the analysis capability and working efficiency of intelligence analysts.
In a specific embodiment, the core attribute data is classified into multiple types, which include:
communication time data: data representing the communication time range of key interest for the event to be analyzed;
communication position data: data representing the regional gathering positions of key interest for the event to be analyzed;
communication main number data: characteristic data of the communication main number of key interest for the event to be analyzed;
communication object number data: characteristic data of the communication object number of key interest for the event to be analyzed;
communication type data: characteristic data of the communication type of key interest for the event to be analyzed;
communication time duration data: data representing the continuation and interruption of the time nodes of key interest for the event to be analyzed;
communication position relationship data: data representing the relationship among the regional gathering positions of the members of the organization of key interest for the event to be analyzed.
In a specific embodiment, when the communication time data is used as a calculation factor, S2 specifically includes:
setting, according to the communication time range of key interest in the analysis of the event to be analyzed, a time range interval that the data must satisfy under the operation rule, and setting the time range interval as the model condition corresponding to the communication time data;
counting the communication time data that fall within the time range interval to obtain the number of times the condition is satisfied;
performing a proportional conversion on the number of times the condition is satisfied, based on the proportional operation rule corresponding to the communication time data, to obtain the corresponding score.
In a specific embodiment, when the communication position data is used as a calculation factor, S2 specifically includes:
delineating on a visual map, according to the data on the regional gathering positions of key interest in the analysis of the event to be analyzed, the position range of those gathering positions, and setting the position range as the model condition corresponding to the communication position data;
counting the communication position data that fall within the position range to obtain the number of times the condition is satisfied;
performing a proportional conversion on the number of times the condition is satisfied, based on the proportional operation rule corresponding to the communication position data, to obtain the corresponding score.
In a specific embodiment, when the communication main number data is used as a calculation factor, S2 specifically includes:
setting, according to the characteristic data of the communication main number of key interest in the analysis of the event to be analyzed, a main number format range that matches those characteristics, and setting the main number format range as the model condition corresponding to the communication main number data;
counting the communication main number data that fall within the main number format range to obtain the number of times the condition is satisfied;
performing a proportional conversion on the number of times the condition is satisfied, based on the proportional operation rule corresponding to the communication main number data, to obtain the corresponding score.
In a specific embodiment, when the communication object number data is used as a calculation factor, S2 specifically includes:
setting, according to the characteristic data of the communication object number of key interest in the analysis of the event to be analyzed, an object number format range that matches those characteristics, and setting the object number format range as the model condition corresponding to the communication object number data;
counting the communication object number data that fall within the object number format range to obtain the number of times the condition is satisfied;
performing a proportional conversion on the number of times the condition is satisfied, based on the proportional operation rule corresponding to the communication object number data, to obtain the corresponding score.
In a specific embodiment, when the communication type data is used as a calculation factor, S2 specifically includes:
setting, according to the characteristic data of the communication type of key interest in the analysis of the event to be analyzed, a communication type range that matches those characteristics, and setting the communication type range as the model condition corresponding to the communication type data;
counting the communication type data that fall within the communication type range to obtain the number of times the condition is satisfied;
performing a proportional conversion on the number of times the condition is satisfied, based on the proportional operation rule corresponding to the communication type data, to obtain the corresponding score.
In a specific embodiment, when the communication time duration data is used as a calculation factor, S2 specifically includes:
judging, according to the continuation and interruption of the time nodes of key interest in the analysis of the event to be analyzed, whether the time nodes of the communication records are continuous, a non-communication state being recorded when the communication time interval exceeds a certain length, and setting the non-communication state as the model condition corresponding to the communication time duration data;
counting the communication time duration data that satisfy the non-communication state to obtain the number of times the condition is satisfied;
performing a proportional conversion on the number of times the condition is satisfied, based on the proportional operation rule corresponding to the communication time duration data, to obtain the corresponding score.
In a specific embodiment, when the communication position relationship data is used as a calculation factor, S2 specifically includes:
setting an effective area range according to the regional gathering positions of key interest in the analysis of the event to be analyzed, a same-frequency position condition being satisfied when two members of the organization appear in the effective area range at the same time, and setting the same-frequency position condition as the model condition corresponding to the communication position relationship data;
counting the communication position relationship data that satisfy the same-frequency position condition to obtain the number of times the condition is satisfied;
performing a proportional conversion on the number of times the condition is satisfied, based on the proportional operation rule corresponding to the communication position relationship data, to obtain the corresponding score.
According to a second aspect of the present invention, a computer-readable storage medium is proposed, on which a computer program is stored, which computer program, when being executed by a computer processor, carries out the above-mentioned method.
According to a third aspect of the present invention, an analysis model construction device based on communication data is provided, the device comprising:
a communication data extraction and classification module, configured to collect and parse communication data related to the clues of an event to be analyzed, extract core attribute data from the communication data, classify the core attribute data into multiple types, and take each type as a calculation factor;
a score calculation model construction module, configured to preset, for each calculation factor and according to the event clues, a corresponding model condition and proportional operation rule, and to convert the data in each calculation factor that satisfy the model condition into a corresponding score through the corresponding proportional operation rule;
a score accumulation module, configured to perform a weighted accumulation of the scores corresponding to the calculation factors to obtain the total score of the analysis;
an automated analysis module, configured to construct an analysis model based on the communication data according to the communication data extraction and classification module through the score accumulation module, and to perform multiple analyses with the analysis model, so that the organization related to the event and its members are analyzed based on the per-factor scores and the total scores obtained from the multiple analyses.
The invention parses communication record data and extracts core attribute data as elements, including time data, position data, main number data, object number data, communication type data, time continuation, and position relationship. Combining these with the patterns of activity in the abnormal entry and exit of carried goods, it establishes an automated analysis model for the abnormal entry and exit of carried goods. The model uses the elements of the communication record data as calculation factors. Each calculation factor is given its own model calculation and its own model condition percentage setting; the results for the model conditions are weighted and matched one by one against the organization role conditions set in the system, and role labels are automatically attached, according to the model conditions, to the clue numbers fed into the system. The analysis model constructed by the invention greatly improves the analysis capability and working efficiency of intelligence analysts.
Drawings
The accompanying drawings are included to provide a further understanding of the embodiments and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments and together with the description serve to explain the principles of the invention. Other embodiments and many of the intended advantages of embodiments will be readily appreciated as they become better understood by reference to the following detailed description. Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow chart of an analysis model construction method based on communication data in accordance with an embodiment of the present invention;
FIG. 3 is a block diagram of the overall data model and computational factors of an embodiment of the present invention;
FIG. 4 is a block diagram of an analysis model construction device based on communication data in accordance with an embodiment of the present invention;
FIG. 5 is a schematic block diagram of a computer system suitable for use in implementing an electronic device according to embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 illustrates an exemplary system architecture 100 to which an analysis model construction method based on communication data according to an embodiment of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. Various applications, such as a data processing application, a data visualization application, a web browser application, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices including, but not limited to, smartphones, tablet computers, laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services) or as a single piece of software or software module. No particular limitation is imposed here.
The server 105 may be a server that provides various services, such as a background information processing server that provides support for the communication data presented on the terminal devices 101, 102, 103. The background information processing server may process the acquired scores and generate a processing result (e.g., a total score).
It should be noted that the method provided in the embodiment of the present application may be executed by the server 105, or may be executed by the terminal devices 101, 102, and 103, and the corresponding apparatus is generally disposed in the server 105, or may be disposed in the terminal devices 101, 102, and 103.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services), or as a single piece of software or software module. No particular limitation is imposed here.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Fig. 2 is a flowchart illustrating a method for constructing an analysis model based on communication data according to an embodiment of the present invention. As shown in fig. 2, the method comprises the following steps:
S1: collecting and parsing communication data related to the clues of an event to be analyzed, extracting core attribute data from the communication data, classifying the core attribute data into multiple types, and taking each type as a calculation factor;
S2: presetting, for each calculation factor and according to the event clues, a corresponding model condition and proportional operation rule, and converting the data in each calculation factor that satisfy the model condition into a corresponding score through the corresponding proportional operation rule;
S3: performing a weighted accumulation of the scores corresponding to the calculation factors to obtain the total score of the analysis;
S4: constructing an analysis model based on the communication data according to S1 to S3, and performing multiple analyses with the analysis model, so that the organization related to the event and its members are analyzed based on the per-factor scores and the total scores obtained from the multiple analyses.
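The weighted accumulation of step S3 can be illustrated with a minimal Python sketch. The factor names, scores, and weights below are hypothetical placeholders and are not values given in the patent; only the operation itself (the sum of weight times score over the calculation factors) follows the text.

```python
from typing import Dict


def total_score(factor_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted accumulation: sum of weight * score over all calculation factors.

    A factor with no configured weight contributes nothing (assumed behavior).
    """
    return sum(weights.get(name, 0.0) * score for name, score in factor_scores.items())


# Hypothetical per-factor scores produced by step S2 and analyst-chosen weights.
scores = {"time": 15, "position": 10, "call_type": 5}
weights = {"time": 1.0, "position": 0.5, "call_type": 2.0}
total = total_score(scores, weights)  # 15*1.0 + 10*0.5 + 5*2.0
```

Keeping the weights in a separate mapping lets the same per-factor scores be re-weighted for different event clues without recomputing them.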
In a specific embodiment, the core attribute data is classified into multiple types, which include:
communication time data: data representing the communication time range of key interest for the event to be analyzed;
communication position data: data representing the regional gathering positions of key interest for the event to be analyzed;
communication main number data: characteristic data of the communication main number of key interest for the event to be analyzed;
communication object number data: characteristic data of the communication object number of key interest for the event to be analyzed;
communication type data: characteristic data of the communication type of key interest for the event to be analyzed;
communication time duration data: data representing the continuation and interruption of the time nodes of key interest for the event to be analyzed;
communication position relationship data: data representing the relationship among the regional gathering positions of the members of the organization of key interest for the event to be analyzed.
In a specific embodiment, when the communication time data is used as a calculation factor, S2 specifically includes:
setting, according to the communication time range of key interest in the analysis of the event to be analyzed, a time range interval that the data must satisfy under the operation rule, and setting the time range interval as the model condition corresponding to the communication time data;
counting the communication time data that fall within the time range interval to obtain the number of times the condition is satisfied;
performing a proportional conversion on the number of times the condition is satisfied, based on the proportional operation rule corresponding to the communication time data, to obtain the corresponding score.
In a specific embodiment, when the communication position data is used as a calculation factor, S2 specifically includes:
delineating on a visual map, according to the data on the regional gathering positions of key interest in the analysis of the event to be analyzed, the position range of those gathering positions, and setting the position range as the model condition corresponding to the communication position data;
counting the communication position data that fall within the position range to obtain the number of times the condition is satisfied;
performing a proportional conversion on the number of times the condition is satisfied, based on the proportional operation rule corresponding to the communication position data, to obtain the corresponding score.
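The counting step for the position factor can be sketched as follows. This is an illustration under stated assumptions: the delineated range is approximated by a circle (a center point plus a radius in meters), distances use the haversine great-circle formula, and the coordinates are made up. The patent itself only says the range is drawn on a visual map, so a real range could equally be a polygon.

```python
import math
from typing import Iterable, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_in_range(points: Iterable[Tuple[float, float]],
                   center: Tuple[float, float],
                   radius_m: float) -> int:
    """Number of (lat, lon) records lying within radius_m of the center."""
    return sum(1 for lat, lon in points
               if haversine_m(lat, lon, center[0], center[1]) <= radius_m)


# Hypothetical gathering place and three record positions:
# the center itself, about 111 m north, and about 1.1 km north.
center = (24.4800, 118.0900)
records = [(24.4800, 118.0900), (24.4810, 118.0900), (24.4900, 118.0900)]
hits = count_in_range(records, center, radius_m=500.0)
```

The resulting count would then be passed to the proportional operation rule of the position factor to obtain its score.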
In a specific embodiment, when the communication main number data is used as a calculation factor, S2 specifically includes:
setting, according to the characteristic data of the communication main number of key interest in the analysis of the event to be analyzed, a main number format range that matches those characteristics, and setting the main number format range as the model condition corresponding to the communication main number data;
counting the communication main number data that fall within the main number format range to obtain the number of times the condition is satisfied;
performing a proportional conversion on the number of times the condition is satisfied, based on the proportional operation rule corresponding to the communication main number data, to obtain the corresponding score.
In a specific embodiment, when the communication object number data is used as a calculation factor, S2 specifically includes:
setting, according to the characteristic data of the communication object number of key interest in the analysis of the event to be analyzed, an object number format range that matches those characteristics, and setting the object number format range as the model condition corresponding to the communication object number data;
counting the communication object number data that fall within the object number format range to obtain the number of times the condition is satisfied;
performing a proportional conversion on the number of times the condition is satisfied, based on the proportional operation rule corresponding to the communication object number data, to obtain the corresponding score.
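For the two number-based factors (main number and object number), one plausible way to express a "number format range" is a regular expression. The sketch below assumes that reading; the pattern and the sample numbers are invented for illustration and do not come from the patent.

```python
import re
from typing import Iterable


def count_matching(numbers: Iterable[str], pattern: str) -> int:
    """Count the numbers whose whole string matches the format pattern."""
    rx = re.compile(pattern)
    return sum(1 for n in numbers if rx.fullmatch(n))


# Hypothetical format range: 11-digit numbers beginning with the prefix 170.
FORMAT_RANGE = r"170\d{8}"
object_numbers = ["17012345678", "13812345678", "1701234567"]
matches = count_matching(object_numbers, FORMAT_RANGE)
```

Using `fullmatch` rather than `search` avoids counting a number that merely contains the prefix somewhere in the middle or has the wrong length.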
In a specific embodiment, when the communication type data is used as a calculation factor, S2 specifically includes:
setting, according to the characteristic data of the communication type of key interest in the analysis of the event to be analyzed, a communication type range that matches those characteristics, and setting the communication type range as the model condition corresponding to the communication type data;
counting the communication type data that fall within the communication type range to obtain the number of times the condition is satisfied;
performing a proportional conversion on the number of times the condition is satisfied, based on the proportional operation rule corresponding to the communication type data, to obtain the corresponding score.
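All of the factors described so far share one shape: a model condition, a count of records meeting it, and a proportional operation rule that turns the count into a score. A generic sketch of that shape, applied here to the communication type factor, might look as follows. The tier table, the type names, and the function name `score_factor` are assumptions made for illustration; the patent does not fix the rule for this factor.

```python
from typing import Callable, Iterable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def score_factor(records: Iterable[T],
                 condition: Callable[[T], bool],
                 tiers: Sequence[Tuple[int, int]]) -> int:
    """Count records meeting the model condition, then map the count to a score.

    tiers is a list of (minimum_count, score) pairs; the highest tier whose
    minimum is reached wins, and 0 is returned when no tier is reached.
    """
    count = sum(1 for r in records if condition(r))
    for min_count, score in sorted(tiers, reverse=True):
        if count >= min_count:
            return score
    return 0


# Hypothetical tier table and communication types of interest.
TIERS = [(10, 15), (5, 10), (1, 5)]
watched_types = {"sms", "voice_out"}
type_records = ["sms", "voice_in", "sms", "voice_out", "data", "sms", "sms"]
type_score = score_factor(type_records, lambda t: t in watched_types, TIERS)
```

Because the condition is passed in as a predicate, the same function could serve the time, position, or number factors by swapping the predicate and the tier table.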
In a specific embodiment, when the communication time duration data is used as a calculation factor, S2 specifically includes:
judging, according to the continuation and interruption of the time nodes of key interest in the analysis of the event to be analyzed, whether the time nodes of the communication records are continuous, a non-communication state being recorded when the communication time interval exceeds a certain length, and setting the non-communication state as the model condition corresponding to the communication time duration data;
counting the communication time duration data that satisfy the non-communication state to obtain the number of times the condition is satisfied;
performing a proportional conversion on the number of times the condition is satisfied, based on the proportional operation rule corresponding to the communication time duration data, to obtain the corresponding score.
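One reading of the non-communication condition is that a number counts as silent whenever the gap between two consecutive communication records exceeds a threshold. The sketch below implements that reading; the 48-hour threshold and the timestamps are hypothetical, and the interpretation itself is an assumption, since the text does not define the interval precisely.

```python
from datetime import datetime, timedelta
from typing import Iterable


def count_silent_gaps(timestamps: Iterable[datetime], max_gap: timedelta) -> int:
    """Count gaps between consecutive records that are longer than max_gap."""
    ordered = sorted(timestamps)
    return sum(1 for prev, cur in zip(ordered, ordered[1:]) if cur - prev > max_gap)


# Hypothetical record times for one number.
calls = [
    datetime(2021, 1, 1, 8, 0),
    datetime(2021, 1, 1, 20, 0),   # 12 h after the previous record
    datetime(2021, 1, 4, 9, 0),    # 61 h after the previous record
    datetime(2021, 1, 4, 10, 0),   # 1 h after the previous record
    datetime(2021, 1, 10, 0, 0),   # more than 5 days after the previous record
]
silent_periods = count_silent_gaps(calls, max_gap=timedelta(hours=48))
```

Sorting first means the records may arrive in any order, which is common when call records from several sources are merged.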
In a specific embodiment, when the communication position relationship data is used as a calculation factor, S2 specifically includes:
setting an effective area range according to the regional gathering positions of key interest in the analysis of the event to be analyzed, a same-frequency position condition being satisfied when two members of the organization appear in the effective area range at the same time, and setting the same-frequency position condition as the model condition corresponding to the communication position relationship data;
counting the communication position relationship data that satisfy the same-frequency position condition to obtain the number of times the condition is satisfied;
performing a proportional conversion on the number of times the condition is satisfied, based on the proportional operation rule corresponding to the communication position relationship data, to obtain the corresponding score.
The construction of the analytical model of the present invention is illustrated below with a specific embodiment, and FIG. 3 is a diagram of the overall data model and computational factor structure of a specific embodiment of the present invention.
The respective calculation factors shown in fig. 3 and the corresponding score calculation methods thereof are explained as follows:
1. Communication time
The time point of each communication record is used as the calculation factor. For example, with the model condition set to the early-morning interval from 0:00 to 5:00, each record is first checked against the model condition, the matching records are then counted, and the count is converted into the proportional score of the model condition by the following formulas:
(1) R = Time(D) ∈ [0, 5], where D is the communication time and R is the result of judging D against the set model condition;
(2) C = Count(R), where C is the number of times the condition is satisfied;
(3) S = (C >= 10) ? 15 : (C >= 5) ? 10 : (C >= 1) ? 5 : 0.
S is the calculated result. The score S obtained from the model condition formula is accumulated into the score of each organization role label and is converted into the proportion for that organization role relative to the full score.
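The three formulas for the communication time factor can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names, the half-open interval [0:00, 5:00), and the sample records are assumptions introduced here.

```python
from datetime import datetime

# Tiers from formula (3): C >= 10 -> 15, C >= 5 -> 10, C >= 1 -> 5, otherwise 0.
TIME_TIERS = [(10, 15), (5, 10), (1, 5)]


def tier_score(count, tiers):
    """Return the score of the first (threshold, score) tier that count reaches."""
    for threshold, score in tiers:
        if count >= threshold:
            return score
    return 0


def communication_time_score(timestamps, start_hour=0, end_hour=5):
    """Return (C, S) for the communication time factor."""
    # Formula (1): R is the per-record condition judgment.
    # Assumption: the interval is half-open, so 05:00 itself is excluded.
    hits = [start_hour <= t.hour < end_hour for t in timestamps]
    # Formula (2): C is the number of records satisfying the condition.
    c = sum(hits)
    # Formula (3): convert the count into the tiered score.
    return c, tier_score(c, TIME_TIERS)


# Hypothetical records: five fall between 0:00 and 5:00, two do not.
records = [datetime(2021, 12, 1, h, 30) for h in (1, 2, 3, 4, 4, 9, 23)]
print(communication_time_score(records))  # (5, 10)
```

Keeping the tiers in a list rather than in nested conditionals makes it easy to adjust the thresholds when the model is applied to a different kind of event.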
2. Position of communication
The position data of each communication record is used as the calculation factor. The position range of a key area gathering place is delineated on a visual map, with the effective range kept within 500 meters. The longitude and latitude of each record are checked against the model condition, the number of records satisfying it is accumulated, and the proportional score of the model condition is obtained by the following formulas:
(1) d = R * arccos[cos(y1) * cos(y2) * cos(x1 - x2) + sin(y1) * sin(y2)],
where d is the distance between the two positions, R is the earth radius (6371 km), x1 and y1 are the longitude and latitude of the communication position, and x2 and y2 are the longitude and latitude of the key area gathering place set by the model condition;
(2) C = Count(d <= 500 m), where C is the number of times the condition is satisfied;
(3) S = (C >= 10) ? 5 : (C >= 1) ? 3 : 0.
S is the calculated result. The score S obtained from the model condition formula is accumulated into the score of each organization role label and is converted into the proportion for that organization role relative to the full score.
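The distance formula and the 500-meter condition for the communication position factor can be sketched as below. The sketch assumes that coordinates are given in degrees (so they are converted to radians first) and that the earth radius is expressed in meters so that d is directly comparable with the 500-meter range; the function names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371_000.0  # 6371 km expressed in meters


def great_circle_m(lon1, lat1, lon2, lat2):
    """Distance in meters by the spherical law of cosines, inputs in degrees."""
    x1, y1, x2, y2 = map(math.radians, (lon1, lat1, lon2, lat2))
    cos_angle = (math.cos(y1) * math.cos(y2) * math.cos(x1 - x2)
                 + math.sin(y1) * math.sin(y2))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_M * math.acos(cos_angle)


def position_score(points, centre, radius_m=500.0):
    """Return (C, S): records within radius_m of the gathering place, and the score."""
    c = sum(great_circle_m(lon, lat, *centre) <= radius_m for lon, lat in points)
    return c, (5 if c >= 10 else 3 if c >= 1 else 0)


# Hypothetical gathering place and three records at roughly 0 m, 111 m and 1112 m.
centre = (118.0894, 24.4798)
points = [(118.0894, 24.4798), (118.0894, 24.4808), (118.0894, 24.4898)]
print(position_score(points, centre))  # (2, 3)
```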
3. Same frequency position (relationship data of communication position)
The communication position data of the organization members is used as the calculation factor to judge whether the members' communication positions fall within the same area. The communication longitude and latitude of each member are compared, and two positions within an effective range of 500 meters of each other are treated as a same-frequency position that meets the model condition. The matching cases are counted, and the proportional score of the model condition is obtained by the following formulas:
(1) d = R * arccos[cos(y1) * cos(y2) * cos(x1 - x2) + sin(y1) * sin(y2)],
where d is the distance between the communication positions of two organization members, R is the earth radius (6371 km), x1 and y1 are the longitude and latitude of the communication position of organization member 1, and x2 and y2 are the longitude and latitude of the communication position of organization member 2; the same calculation is repeated for the other members to obtain the same-frequency position data of the members' communications;
(2) C = Count(d <= 500 m), where C is the number of times the condition is satisfied;
(3) S = (C >= 10) ? 5 : (C >= 1) ? 3 : 0.
S is the calculated result. The score S obtained from the model condition formula is accumulated into the score of each organization role label and is converted into the proportion for that organization role relative to the full score.
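A pairwise comparison of two members' records might look like the following sketch. The text requires the two members to appear in the same area at the same time but does not state how "same time" is measured, so the 30-minute window used here is purely an assumption, as are the record layout and the function names.

```python
import math
from datetime import datetime, timedelta
from itertools import product


def distance_m(a, b):
    """Great-circle distance in meters between two (lon, lat) pairs in degrees."""
    x1, y1, x2, y2 = map(math.radians, (*a, *b))
    cos_angle = (math.cos(y1) * math.cos(y2) * math.cos(x1 - x2)
                 + math.sin(y1) * math.sin(y2))
    return 6371_000.0 * math.acos(max(-1.0, min(1.0, cos_angle)))


def same_frequency_score(member1, member2, radius_m=500.0,
                         window=timedelta(minutes=30)):
    """Return (C, S). Each record is (timestamp, (lon, lat))."""
    # Assumption: two records co-occur when they are at most `window` apart.
    c = sum(
        1
        for (t1, p1), (t2, p2) in product(member1, member2)
        if abs(t1 - t2) <= window and distance_m(p1, p2) <= radius_m
    )
    return c, (5 if c >= 10 else 3 if c >= 1 else 0)


# Hypothetical records: only the first pair is close in both time and space.
m1 = [(datetime(2021, 12, 1, 10, 0), (118.0, 24.0)),
      (datetime(2021, 12, 2, 10, 0), (118.0, 24.0))]
m2 = [(datetime(2021, 12, 1, 10, 10), (118.0, 24.002)),
      (datetime(2021, 12, 2, 18, 0), (118.0, 24.0))]
print(same_frequency_score(m1, m2))  # (1, 3)
```

For an organization with more than two members, the same function could be applied to every pair of members, for example with `itertools.combinations`.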
4. Main number (working machine)
The main number of the communication record is used as the calculation factor. For example, the model condition is set to a number beginning with an overseas prefix (00852, 00853, 00886, and the like) or with a virtual number segment (165, 167, 170, 171, 184, and the like). Whether the main number meets the model condition is judged by the following formula; if it does, the main number can be marked as a working machine. The proportional score of the model condition is obtained at the same time. The formula is as follows:
S = R1 ? 10 : R2 ? 5 : 0.
S is the calculated result, R1 indicates that the overseas number condition is met, and R2 indicates that the virtual number segment condition is met. The score S obtained from the model condition formula is accumulated into the score of each organization role label and is converted into the proportion for that organization role relative to the full score.
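The main number rule reduces to a prefix test, sketched below. The prefix lists are the examples given in the text; the function name and the returned working-machine flag are illustrative assumptions.

```python
# Example prefixes taken from the text; a real model would configure its own lists.
OVERSEAS_PREFIXES = ("00852", "00853", "00886")
VIRTUAL_PREFIXES = ("165", "167", "170", "171", "184")


def main_number_score(number):
    """Return (S, is_working_machine) for one main number given as a string."""
    r1 = number.startswith(OVERSEAS_PREFIXES)  # overseas number condition
    r2 = number.startswith(VIRTUAL_PREFIXES)   # virtual number segment condition
    score = 10 if r1 else 5 if r2 else 0       # S = R1 ? 10 : R2 ? 5 : 0
    # Assumption: meeting either condition marks the number as a working machine.
    return score, (r1 or r2)


print(main_number_score("0085212345678"))  # (10, True)
print(main_number_score("17012345678"))    # (5, True)
print(main_number_score("13912345678"))    # (0, False)
```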
5. Communication time duration (taking time fault state as an example)
The communication time of each communication record is used as the calculation factor to judge whether the time nodes of the records are continuous. The model condition is set to a gap (fault) of more than 24 hours without communication, and the proportional score of the model condition is obtained by the following formulas:
(1) h = d1 - d2,
where h is the time difference in hours, d1 is the communication time of communication record 1, and d2 is the communication time of communication record 2, the differences being calculated in sequence over the records in chronological order;
(2) C = Count(h > 24), where C is the number of times the condition is satisfied;
(3) S = (C >= 3) ? 10 : (C >= 1) ? 5 : 0.
S is the calculated result. The score S obtained from the model condition formula is accumulated into the score of each organization role label and is converted into the proportion for that organization role relative to the full score.
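The gap counting for the communication time duration factor can be sketched as follows. It assumes that a gap counts only when it strictly exceeds 24 hours and that differences are taken between consecutive records after sorting; the function name and sample timestamps are hypothetical.

```python
from datetime import datetime


def continuity_score(timestamps, gap_hours=24.0):
    """Return (C, S): number of gaps longer than gap_hours, and the tiered score."""
    ordered = sorted(timestamps)
    # h for each pair of consecutive records, in hours.
    gaps = [(later - earlier).total_seconds() / 3600.0
            for earlier, later in zip(ordered, ordered[1:])]
    c = sum(h > gap_hours for h in gaps)
    return c, (10 if c >= 3 else 5 if c >= 1 else 0)


# Hypothetical records with two gaps longer than 24 hours (37 h and 50 h).
times = [
    datetime(2021, 12, 1, 8, 0),
    datetime(2021, 12, 1, 20, 0),
    datetime(2021, 12, 3, 9, 0),
    datetime(2021, 12, 3, 10, 0),
    datetime(2021, 12, 5, 12, 0),
]
print(continuity_score(times))  # (2, 5)
```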
6. Communication object (taking satellite number as an example)
The number of the opposite party in each communication record is used as the calculation factor. For example, the model condition is set to an opposite-party number beginning with a satellite number prefix (134925, 134930, 0087, 0580, 1749, and the like). The matching records are counted, and the proportional score of the model condition is obtained by the following formulas:
(1) C = Count(R), where C is the number of times the condition is satisfied and R is the condition judgment on the number of the communication object;
(2) S = (C >= 5) ? 10 : (C >= 1) ? 5 : 0.
S is the calculated result. The score S obtained from the model condition formula is accumulated into the score of each organization role label and is converted into the proportion for that organization role relative to the full score.
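Counting opposite-party numbers that match a prefix list can be sketched in a few lines. The prefixes are the examples from the text; the function name and the sample numbers are assumptions.

```python
# Example satellite prefixes from the text; must be a tuple for str.startswith.
SATELLITE_PREFIXES = ("134925", "134930", "0087", "0580", "1749")


def object_number_score(peer_numbers, prefixes=SATELLITE_PREFIXES):
    """Return (C, S) for the opposite-party numbers of a set of records."""
    c = sum(number.startswith(prefixes) for number in peer_numbers)
    return c, (10 if c >= 5 else 5 if c >= 1 else 0)


# Hypothetical opposite-party numbers: two of the three match a satellite prefix.
peers = ["13492512345", "13800000000", "008712345"]
print(object_number_score(peers))  # (2, 5)
```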
7. Communication type (taking short message transfer as an example)
The communication type of each communication record is used as the calculation factor. The model condition is met when a record is a short message from a bank short-message number (such as 95533) and is accompanied by another communication record within 30 minutes. The records meeting the model condition are counted, and the proportional score of the model condition is obtained by the following formulas:
(1) C = Count(R), where C is the number of times the condition is satisfied and R is the set model condition, namely a transfer short message together with an associated communication record;
(2) S = (C >= 3) ? 10 : (C >= 1) ? 5 : 0.
S is the calculated result. The score S obtained from the model condition formula is accumulated into the score of each organization role label and is converted into the proportion for that organization role relative to the full score.
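One possible reading of the short message transfer condition is sketched below: a bank short message counts when any other record lies within 30 minutes of it. The record layout (dictionaries with `time`, `type` and `peer` keys), the `"sms"` and `"call"` type labels, and the function name are assumptions made for illustration.

```python
from datetime import datetime, timedelta

BANK_SMS_NUMBERS = {"95533"}  # example bank number from the text


def transfer_sms_score(records, window=timedelta(minutes=30)):
    """Return (C, S): bank short messages accompanied by another record in the window."""
    c = 0
    for i, rec in enumerate(records):
        if rec["type"] != "sms" or rec["peer"] not in BANK_SMS_NUMBERS:
            continue
        # Assumption: any other record within the window counts as the accompanying one.
        accompanied = any(
            j != i and abs(other["time"] - rec["time"]) <= window
            for j, other in enumerate(records)
        )
        c += int(accompanied)
    return c, (10 if c >= 3 else 5 if c >= 1 else 0)


# Hypothetical records: the first bank message is followed by a call 20 minutes later.
log = [
    {"time": datetime(2021, 12, 1, 10, 0), "type": "sms", "peer": "95533"},
    {"time": datetime(2021, 12, 1, 10, 20), "type": "call", "peer": "13800000000"},
    {"time": datetime(2021, 12, 1, 15, 0), "type": "sms", "peer": "95533"},
]
print(transfer_sms_score(log))  # (1, 5)
```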
By setting the conditions of the analysis model for the entry and exit of abnormally carried goods, the communication record data is analyzed automatically for analysis and early warning, potential organization members are mined automatically, and effective data support is provided for intelligent study, judgment, and analysis of such entry and exit.
At present, analysis of the entry and exit of abnormally carried goods relies mainly on analysis tools, through which the information and relationships of the objects of concern are found. This scheme studies the data of the entry-and-exit model for abnormally carried goods and provides a data research and processing method based on that model. By constructing an automated analysis model and combining the rules and characteristics of organizations engaged in such entry and exit, the data is analyzed automatically, which achieves a good effect in intelligent study, judgment, and analysis for combating the entry and exit of abnormally carried goods.
The method and the device can be applied to the analysis of clues on the entry and exit of abnormally carried goods that is based mainly on communication data, and can also be applied to the study of other events based on the model, such as economic events based mainly on fund transactions. Only the relevant data parameters of the model need to be adjusted, so the application prospect is wide.
FIG. 4 shows a block diagram of an apparatus for analytical model construction based on communication data, in accordance with an embodiment of the present invention. The device comprises a communication data extraction and classification module 401, a score calculation model construction module 402, a score accumulation module 403 and an automated analysis module 404.
In a specific embodiment, the communication data extraction and classification module 401 is configured to collect and analyze communication data related to clues of an event to be analyzed, extract core attribute data in the communication data, classify the core attribute data into multiple types, and use each type as a calculation factor;
the score calculation model construction module 402 is configured to preset corresponding model conditions and proportional operation rules according to event cues for each calculation factor, and convert data meeting the model conditions in each calculation factor into corresponding scores through the corresponding proportional operation rules;
the score accumulation module 403 is configured to perform weighted accumulation calculation on the scores corresponding to the calculation factors to obtain a total score of the analysis;
the automated analysis module 404 is configured to build an analysis model based on the communication data from the communication data extraction and classification module through the score accumulation module, and perform multiple analyses based on the analysis model, so as to analyze the event-related organization and the members of the organization based on the scores of the calculation factors and the total scores obtained by the multiple analyses.
The device analyzes the communication record data and extracts core attribute data as elements, including time data, position data, main number data, object number data, communication type data, time continuation, position relation and the like, and establishes an automated analysis model for the entry and exit of abnormally carried goods by combining the rules of such entry and exit activities. The model relies mainly on the elements of the communication record data as calculation factors. Each calculation factor is calculated independently with its own model condition and percentage setting; the calculation results of the model conditions are weighted, matched one by one against the organization role conditions set by the device, and a role label is automatically attached to each clue number accessed by the system according to the model conditions. The analysis model constructed by the device greatly improves the analysis capability and the working efficiency of the information research and judgment personnel.
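The weighted accumulation performed by the score accumulation module can be sketched as below. The full-score values are the maxima of the seven example formulas in this embodiment; the factor keys, the default weight of 1.0, and the way the proportion is normalized are assumptions, since the text does not fix the weights or the role-matching rules.

```python
# Maximum score of each example factor in this embodiment (keys are hypothetical).
FULL_SCORES = {
    "time": 15,
    "position": 5,
    "same_frequency": 5,
    "main_number": 10,
    "continuity": 10,
    "object_number": 10,
    "type": 10,
}


def total_score(scores, weights=None):
    """Return (total, proportion) for the factor scores of one clue number."""
    weights = weights or {}
    # Weighted accumulation over the factors that were actually evaluated.
    total = sum(weights.get(name, 1.0) * s for name, s in scores.items())
    # Assumption: the proportion is taken against the weighted full score.
    full = sum(weights.get(name, 1.0) * FULL_SCORES[name] for name in scores)
    return total, (total / full if full else 0.0)


# Hypothetical factor scores for one clue number.
scores = {"time": 10, "position": 3, "main_number": 10}
total, proportion = total_score(scores)
print(total, round(proportion, 3))  # 23.0 0.767
```

The resulting proportion could then be compared against role-specific thresholds to attach a role label, but those thresholds would have to be defined for the event being analyzed.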
Referring now to FIG. 5, shown is a block diagram of a computer system 500 suitable for use in implementing the electronic device of an embodiment of the present application. The electronic device shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 5, the computer system 500 includes a Central Processing Unit (CPU) 501 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage section 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for the operation of the system 500 are also stored. The CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
The following components are connected to the I/O interface 505: an input portion 506 including a keyboard, a mouse, and the like; an output portion 507 including a display such as a Liquid Crystal Display (LCD) and a speaker; a storage portion 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card, a modem, or the like. The communication section 509 performs communication processing via a network such as the internet. A drive 510 is also connected to the I/O interface 505 as necessary. A removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 510 as necessary, so that a computer program read out therefrom is installed into the storage section 508 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. The computer program performs the above-described functions defined in the method of the present application when executed by the Central Processing Unit (CPU) 501. It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present application may be implemented by software or hardware. The units described may also be provided in a processor, and the names of the units do not in some cases constitute a limitation of the unit itself.
Embodiments of the present invention also relate to a computer-readable storage medium having stored thereon a computer program which, when executed by a computer processor, implements the method above. The computer program comprises program code for performing the method illustrated in the flow chart. It should be noted that the computer readable medium of the present application can be a computer readable signal medium or a computer readable medium or any combination of the two.
The method analyzes the communication record data and extracts core attribute data as elements, including time data, position data, main number data, object number data, communication type data, time continuation, position relation and the like, and establishes an automated analysis model for the entry and exit of abnormally carried goods by combining the rules of such entry and exit activities. The model relies mainly on the elements of the communication record data as calculation factors. Each calculation factor is calculated independently with its own model condition and percentage setting; the calculation results of the model conditions are weighted, matched one by one against the organization role conditions set by the system, and a role label is automatically attached to each clue number accessed by the system according to the model conditions. The analysis model constructed by the method greatly improves the analysis capability and the working efficiency of the information research and judgment personnel.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (11)

1. An analytical model construction method based on communication data is characterized by comprising the following steps:
S1: collecting and analyzing the communication data related to clues of events to be analyzed, extracting core attribute data in the communication data, classifying the core attribute data into multiple types, and taking each type as a calculation factor;
S2: presetting corresponding model conditions and proportional operation rules for each calculation factor according to event clues, and converting data meeting the model conditions in each calculation factor into corresponding scores through the corresponding proportional operation rules;
S3: carrying out weighted accumulation calculation on the scores corresponding to the calculation factors to obtain the total score of the analysis;
S4: constructing an analysis model based on the communication data according to S1 to S3, and performing a plurality of analyses based on the analysis model, thereby analyzing the event-related organization and the organization members based on the scores of the calculation factors and the total scores obtained by the plurality of analyses.
2. The method of claim 1, wherein the core attribute data is classified into a plurality of types, the plurality of types comprising:
communication time data: data representing the communication time range that is the focus of attention for the event to be analyzed;
communication position data: data representing the area gathering position that is the focus of attention for the event to be analyzed;
main number data of the communication link: characteristic data of the main number of the communication link that is the focus of attention for the event to be analyzed;
number data of the communication object: characteristic data of the number of the communication object that is the focus of attention for the event to be analyzed;
communication type data: characteristic data of the communication type that is the focus of attention for the event to be analyzed;
communication time duration data: data representing the continuation and disconnection conditions of the time nodes that are the focus of attention for the event to be analyzed;
relation data of the communication positions: data representing the relation among the area gathering positions of the members of the organization that is the focus of attention for the event to be analyzed.
3. The method according to claim 2, wherein when the communication time data is used as a calculation factor, the specific step of S2 includes:
setting a time range interval which needs to be met by data in the operation rule according to a communication time range which is focused in the analysis process of the event to be analyzed, and setting the time range interval as a model condition corresponding to the communication time data;
counting the number of the communication time data meeting the time range interval to obtain the number of times meeting the conditions;
and carrying out proportion conversion based on the proportion operation rule corresponding to the communication time data according to the times meeting the conditions to obtain corresponding scores.
4. The method according to claim 2, wherein when the communication position data is used as a calculation factor, the specific step of S2 includes:
according to the data of the focused area gathering positions in the analysis process of the event to be analyzed, using a visual map to define the position range of the focused area gathering positions on the visual map, and setting the position range as a model condition corresponding to the communication position data;
counting the number of the communication position data meeting the position range to obtain the times meeting the conditions;
and carrying out proportion conversion based on the proportion operation rule corresponding to the communication position data according to the times meeting the conditions to obtain corresponding scores.
5. The method according to claim 2, wherein when the main number data of the communication link is used as a calculation factor, the specific step of S2 includes:
setting a main number format range which accords with the relevant characteristics according to the relevant characteristic data of the main number of the communication link which is focused in the analysis process of the event to be analyzed, and setting the main number format range as a model condition corresponding to the main number data of the communication link;
counting the number of the main number data meeting the format range of the main number to obtain the number of times meeting the conditions;
and carrying out proportion conversion based on the proportion operation rule corresponding to the data of the main number of the communication link according to the times meeting the conditions to obtain corresponding scores.
6. The method according to claim 2, wherein when the number data of the communication object is used as a calculation factor, the specific step of S2 comprises:
setting an object number format range which accords with relevant characteristics according to relevant characteristic data of the number of the communication object which is focused in the analysis process of the event to be analyzed, and setting the object number format range as a model condition corresponding to the number data of the communication object;
counting the number of the number data of the communication object meeting the object number format range to obtain the number of times meeting the conditions;
and carrying out proportion conversion based on the proportion operation rule corresponding to the number data of the communication object according to the times meeting the conditions to obtain corresponding scores.
7. The method according to claim 2, wherein when the communication type data is used as a calculation factor, the specific step of S2 includes:
setting a communication type range which accords with the relevant characteristics according to the relevant characteristic data of the communication type which is focused in the analysis process of the event to be analyzed, and setting the communication type range as a model condition corresponding to the communication type data;
counting the number of the communication type data meeting the range of the communication type to obtain the number of times meeting the condition;
and performing scaling conversion based on the proportional operation rule corresponding to the communication type data according to the times meeting the conditions to obtain corresponding scores.
8. The method according to claim 2, wherein when the communication time duration data is used as a calculation factor, the specific step of S2 includes:
judging whether the time node of the communication record is continued or not according to the continuation and disconnection conditions of the time node which is mainly concerned in the analysis process of the event to be analyzed, so that the communication time is set to be in a non-communication state when exceeding a certain time, and the non-communication state is set to be a model condition corresponding to the communication time continuation data;
counting the number of the communication time duration data meeting the non-communication state to obtain the number of times meeting the condition;
and carrying out proportion conversion based on the proportion operation rule corresponding to the communication time duration data according to the times meeting the conditions to obtain corresponding scores.
9. The method according to claim 2, wherein when the relation data of the communication position is used as a calculation factor, the specific step of S2 includes:
setting an effective area range according to the area gathering position that is the focus of attention in the analysis process of the event to be analyzed, treating the simultaneous appearance of two members of the organization within the effective area range as meeting a same-frequency position condition, and setting the same-frequency position condition as the model condition corresponding to the relation data of the communication position;
counting the number of items of the relation data of the communication position that meet the same-frequency position condition to obtain the number of times the condition is met;
and carrying out proportion conversion based on the proportion operation rule corresponding to the relation data of the communication position according to the number of times the condition is met to obtain the corresponding score.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a computer processor, carries out the method of any one of claims 1 to 9.
11. An analytical model construction device based on communication data, characterized by comprising:
a communication data extraction and classification module, configured to collect and analyze communication data related to clues of an event to be analyzed, extract core attribute data from the communication data, classify the core attribute data into multiple types, and take each type as a calculation factor;
a score calculation model construction module, configured to preset, for each type of calculation factor, corresponding model conditions and proportional operation rules according to the event clues, and to convert the data meeting the model conditions in each type of calculation factor into a corresponding score through the corresponding proportional operation rule;
a score accumulation module, configured to perform weighted accumulation on the scores corresponding to the calculation factors to obtain a total score for the analysis;
and an automated analysis module, configured to construct an analysis model based on the communication data according to the communication data extraction and classification module and the score accumulation module, and to perform multiple analyses with the analysis model, so that the organizations and organization members related to the event are analyzed based on the scores of the calculation factors and the total scores obtained from the multiple analyses.
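The weighted accumulation performed by the score accumulation module can be sketched as follows. This is only an illustration under stated assumptions: the factor names (`duration`, `contact_frequency`), the weight values, and the helper functions `total_score` and `rank_members` are hypothetical and do not come from the claims, which leave the weights and the set of calculation factors open.

```python
def total_score(factor_scores, weights):
    """Weighted sum of per-factor scores; every factor must have a weight."""
    missing = set(factor_scores) - set(weights)
    if missing:
        raise KeyError(f"no weight configured for factors: {sorted(missing)}")
    return sum(weights[name] * score for name, score in factor_scores.items())


def rank_members(per_member_scores, weights):
    """Order members by total score, highest first (an assumed use of the totals)."""
    totals = {
        member: total_score(scores, weights)
        for member, scores in per_member_scores.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical factor names and weights, chosen only for the example.
weights = {"duration": 0.5, "contact_frequency": 0.5}
per_member_scores = {
    "member_a": {"duration": 20.0, "contact_frequency": 60.0},
    "member_b": {"duration": 80.0, "contact_frequency": 40.0},
}
ranking = rank_members(per_member_scores, weights)
```

Keeping the weights in a separate mapping means a repeated analysis can adjust them without touching the per-factor scoring rules, which fits the idea of running the model several times.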
CN202111553872.7A 2021-12-17 2021-12-17 Analytical model construction method and device based on communication data Pending CN114493937A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111553872.7A CN114493937A (en) 2021-12-17 2021-12-17 Analytical model construction method and device based on communication data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111553872.7A CN114493937A (en) 2021-12-17 2021-12-17 Analytical model construction method and device based on communication data

Publications (1)

Publication Number Publication Date
CN114493937A true CN114493937A (en) 2022-05-13

Family

ID=81493142

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111553872.7A Pending CN114493937A (en) 2021-12-17 2021-12-17 Analytical model construction method and device based on communication data

Country Status (1)

Country Link
CN (1) CN114493937A (en)

Similar Documents

Publication Publication Date Title
CN107145586B (en) Label output method and device based on electric power marketing data
CN112256887B (en) Intelligent supply chain management method based on knowledge graph
US20060098647A1 (en) Monitoring and reporting enterprise data using a message-based data exchange
CN105354616A (en) Processing device and on-line processing method for electric power measurement asset data
CN114648393A (en) Data mining method, system and equipment applied to bidding
CN113205402A (en) Account checking method and device, electronic equipment and computer readable medium
CN108154311A (en) Top-tier customer recognition methods and device based on random forest and decision tree
CN111062799A (en) Method and device for managing family client, electronic equipment and storage medium
CN110675078A (en) Marketing company risk diagnosis method, system, computer terminal and storage medium
CN114092056A (en) Project management method, device, electronic equipment, storage medium and product
CN111800292A (en) Early warning method and device based on historical flow, computer equipment and storage medium
CN107977855A (en) A kind of method and device of managing user information
CN113902449A (en) Enterprise online transaction system risk early warning method and device and electronic equipment
CN112950359A (en) User identification method and device
CN115567563B (en) Comprehensive transportation hub monitoring and early warning system based on end edge cloud and control method thereof
JP2022534160A (en) Methods and devices for outputting information, electronic devices, storage media, and computer programs
KR20210155501A (en) Receivable recovery support system for medium-small enterprise account receivable bond decrease and bad debt prevention based on big data
CN112001322A (en) Method and device for determining tag personnel gathering and storage medium
CN114493937A (en) Analytical model construction method and device based on communication data
CN116228429A (en) Method and device for detecting transaction data
CN113344638B (en) Power grid user group portrait construction method and device based on hypergraph
CN114357523A (en) Method, device, equipment, storage medium and program product for identifying risk object
CN112598499A (en) Method and device for determining credit limit
KR101909138B1 (en) Receivable recovery support system for medium-small enterprise account receivable bond decrease and bad debt prevention based on big data
CN110059234A (en) Water utilities anomalous event method for detecting and device, computer installation and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination