CN113535702A - Data processing method, device, equipment and storage medium - Google Patents
- Publication number
- CN113535702A (application CN202110829558.0A)
- Authority
- CN
- China
- Prior art keywords
- user information
- target
- current network
- data
- dimension
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24564—Applying rules; Deductive queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
Abstract
The embodiment of the invention discloses a data processing method, apparatus, device, and storage medium. The method comprises: acquiring current network traffic data generated while a user uses the network; acquiring an information processing rule, where the information processing rule indicates a target dimension for processing the current network traffic data and an associated dimension of the target dimension; when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension, where the knowledge base stores user information of the same user in the target dimension and the associated dimension, and that user information is obtained by learning, in real time, from network traffic data generated while users use the network; and backfilling the target user information into the current network traffic data. By backfilling missing target user information from a knowledge base accumulated through learning, the invention makes the network traffic data more complete.
Description
Technical Field
The present application relates to the field of data processing, and in particular, to a data processing method, apparatus, device, and storage medium.
Background
With the continuous development of network application and communication technology, the need to mine network traffic data is increasingly common. In the prior art, network traffic data is mined mainly by applying standardized data processing to it and filtering out problem data.
In the process of implementing the invention, the inventor found the following defect in the prior art: when the original network traffic data is incomplete, network traffic data obtained through standardized processing alone may be of low quality and therefore cannot provide data support for subsequent service systems.
Disclosure of Invention
Embodiments of the present invention provide a data processing method, apparatus, device, and storage medium. By setting an information processing rule and determining target user information from a knowledge base according to that rule, they address the problem that network traffic data obtained through standardized processing alone may be of low quality when the original network traffic data is incomplete.
In a first aspect, an embodiment of the present invention provides a data processing method, including:
acquiring current network traffic data generated while a user uses the network;
acquiring an information processing rule, where the information processing rule indicates a target dimension for processing the current network traffic data and an associated dimension of the target dimension;
when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension, where the knowledge base stores user information of the same user in the target dimension and the associated dimension, and that user information is obtained by learning, in real time, from network traffic data generated while users use the network; and
backfilling the target user information into the current network traffic data.
In a second aspect, an embodiment of the present invention further provides a data processing apparatus, including:
a first obtaining module, configured to obtain current network traffic data generated while a user uses the network;
a second obtaining module, configured to obtain an information processing rule, where the information processing rule indicates a target dimension for processing the current network traffic data and an associated dimension of the target dimension;
a determining module, configured to determine, when the target user information of the current network traffic data in the target dimension is missing, the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension, where the knowledge base stores user information of the same user in the target dimension and the associated dimension, and that user information is obtained by learning, in real time, from network traffic data generated while users use the network; and
a backfilling module, configured to backfill the target user information into the current network traffic data.
In a third aspect, an embodiment of the present invention further provides a data processing apparatus, including a memory and a processor, where the memory stores a computer program, and the processor implements the steps of the method in any one of the embodiments when executing the computer program.
In a fourth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method in any one of the embodiments.
According to the technical solution of this embodiment, when the target user information of the current network traffic data in the target dimension is missing, it can be backfilled using the existing user information of the current network traffic data in the associated dimension together with the knowledge base accumulated by learning, in real time, from network traffic data generated while users use the network. This makes the current network traffic data more complete and improves its quality.
Drawings
Fig. 1 is a flowchart of a data processing method according to a first embodiment of the present invention;
Fig. 2 is another flowchart of a data processing method according to a second embodiment of the present invention;
Fig. 3 is another flowchart of a data processing method according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a data processing apparatus according to a fourth embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a data processing apparatus according to a fifth embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad invention. It should be further noted that, for convenience of description, only some structures, not all structures, relating to the embodiments of the present invention are shown in the drawings.
Example One
Fig. 1 is a flowchart of a data processing method according to a first embodiment of the present invention. This embodiment is applicable to mining network traffic data in which some data is missing. The method may be executed by a data processing apparatus and specifically includes the following steps:
S110: acquiring current network traffic data generated while a user uses the network.
While using the internet, the user terminal continuously generates raw internet access data through interaction with the server, and network traffic data can be obtained by parsing that raw data. The network traffic data may include various information about the user's internet access, for example, the generation time of a piece of network traffic data, the user information of the user to whom it belongs, and the content of the user's internet operation (for example, sending a video in a WeChat group). Taking a user of WeChat as an example, the user information may include the user terminal ID, the user's WeChat ID, the location where the user accesses the internet, the user's mobile phone number, the user name, and related network operator information. The current network traffic data is the network traffic data currently to be processed.
The invention stores network traffic data in a card system. The card system is a system for recording or storing information; it can process data in batches and offers high access speed. Network traffic data parsed in real time can therefore be fed into the card system, and when network traffic data needs to be processed, the current network traffic data generated while the user uses the network is obtained from the card system.
S120: acquiring an information processing rule.
The information processing rule indicates a target dimension for processing the current network traffic data and an associated dimension of the target dimension. The target dimension can be understood as the dimension containing the user information of interest, and the associated dimension is the dimension containing user information associated with that information. In general, the user information of network traffic data in the target dimension may be missing, while its user information in the associated dimension is unlikely to be missing, so the information missing in the target dimension can be determined from the information in the associated dimension. For example, take the information processing rule "mobile phone number/WeChat ID": suppose that subsequent service analysis is mainly concerned with the user's mobile phone number and that the user's WeChat ID is, in general, unlikely to be missing from the network traffic data. The user's mobile phone number can then be set as the target dimension and the user's WeChat ID as the associated dimension, so that the mobile phone number is determined from the WeChat ID. Note that the embodiments of the present application use the rule "mobile phone number/WeChat ID" only as an example. In practice, the information processing rule may be set according to actual service requirements; for example, it may be "WeChat ID/user terminal ID", where the user terminal ID is, for example, an International Mobile Subscriber Identity (IMSI).
In this way, when the current network traffic data is processed, the information processing rule for it can be obtained from the metadata corresponding to the current network traffic data, which yields the target dimension and the associated dimension.
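As an illustration only, the rule lookup in S120 might be sketched as follows. The patent does not specify how the rule is encoded in the metadata, so the `info_rule` key, the `target/associated` string format, and the names `InfoRule` and `parse_rule` are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfoRule:
    """Hypothetical container for one information processing rule."""
    target_dimension: str      # dimension whose value may be missing
    associated_dimension: str  # dimension used to look the value up


def parse_rule(metadata: dict) -> InfoRule:
    # Assumed encoding: "target/associated", e.g. "phone_number/wechat_id".
    target, associated = metadata["info_rule"].split("/", 1)
    return InfoRule(target.strip(), associated.strip())


# Metadata accompanying a piece of network traffic data (assumed shape).
rule = parse_rule({"info_rule": "phone_number/wechat_id"})
```

Keeping the rule as data rather than code means a different pair of dimensions, such as WeChat ID and IMSI, could be configured without changing the processing logic.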
S130: when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension.
Optionally, the process of determining whether the target user information of the current network traffic data in the target dimension is missing may be: judging whether the content of the current network traffic data in the target dimension is empty or not; if yes, determining that the target user information in the current network traffic data is missing; if not, determining that the target user information in the current network traffic data is not missing.
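The emptiness check described above can be sketched in a few lines. The record is modeled here as a plain dictionary keyed by dimension name, and treating an absent key, `None`, and a blank string all as "empty" is an assumption of this sketch, not something the patent states.

```python
def is_missing(record: dict, target_dimension: str) -> bool:
    """Return True when the record has no usable value in the target dimension."""
    value = record.get(target_dimension)
    if value is None:
        return True  # key absent or explicitly null
    # Assumption: a whitespace-only string also counts as empty.
    return isinstance(value, str) and value.strip() == ""


# Hypothetical records; the phone number is a placeholder value.
complete = {"wechat_id": "123", "phone_number": "13800000000"}
incomplete = {"wechat_id": "123", "phone_number": ""}
```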
The knowledge base stores user information of the same user in the target dimension and the associated dimension, and that user information is obtained by learning, in real time, from network traffic data generated while users use the network.
Across a large amount of network traffic data, some of the data for the same user may be of higher quality and some of lower quality, so the user information missing from the lower-quality data can be filled in by referring to the higher-quality data. By learning from the higher-quality network traffic data, the user information of the same user in the target dimension and the associated dimension can be obtained, and after a period of accumulation the knowledge base holds the user information of many users in both dimensions.
Therefore, when the target user information of the current network traffic data in the target dimension is missing, the knowledge base can be searched using the existing user information of the current network traffic data in the associated dimension, and the user information associated with it is taken as the target user information of the current network traffic data in the target dimension. For example, take the user's mobile phone number as the target dimension and the user's WeChat ID as the associated dimension, and assume that the WeChat ID in the current network traffic data is 123. When it is determined that the target user information in the dimension "mobile phone number" is missing, the user information associated with the WeChat ID "123" is looked up in the knowledge base and taken as the target user information of the current network traffic data in the dimension "mobile phone number".
S140: backfilling the target user information into the current network traffic data.
The target user information obtained from the knowledge base in S130 is backfilled into the current network traffic data. Continuing the example in S130, once the target user information in the dimension "mobile phone number" has been determined, it is written back into the current network traffic data, so that the current network traffic data no longer lacks the target user information.
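Steps S130 and S140 can be sketched together as a lookup followed by a write-back. The knowledge base layout used here, a dictionary keyed by (associated dimension, associated value) that maps to known values per target dimension, is an assumption for illustration, as are the function name `backfill` and the placeholder phone number.

```python
from typing import Dict, Tuple

# Assumed layout: (associated dimension, value) -> {target dimension: value}
KnowledgeBase = Dict[Tuple[str, str], Dict[str, str]]


def backfill(record: dict, target_dim: str, assoc_dim: str,
             kb: KnowledgeBase) -> dict:
    """Return a copy of the record with the target dimension filled if possible."""
    filled = dict(record)
    if filled.get(target_dim):
        return filled                 # nothing is missing, so nothing to do
    assoc_value = filled.get(assoc_dim)
    if not assoc_value:
        return filled                 # no associated info to search with
    known = kb.get((assoc_dim, assoc_value), {}).get(target_dim)  # S130
    if known is not None:
        filled[target_dim] = known    # S140: write the value back
    return filled


kb: KnowledgeBase = {("wechat_id", "123"): {"phone_number": "13800000000"}}
record = {"wechat_id": "123", "phone_number": "", "operation": "send_video"}
result = backfill(record, "phone_number", "wechat_id", kb)
```

Returning a copy instead of mutating the input is a design choice of this sketch; it keeps the original record available for comparison or auditing. When the knowledge base has no entry for the associated value, the record is returned unchanged.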
According to the technical solution of this embodiment, when the target user information of the current network traffic data in the target dimension is missing, it can be backfilled using the existing user information of the current network traffic data in the associated dimension together with the knowledge base accumulated by learning, in real time, from network traffic data generated while users use the network. This makes the current network traffic data more complete and improves its quality.
Example Two
Fig. 2 is another flowchart of a data processing method according to the second embodiment of the present invention; the second embodiment further refines the technical solution of the first embodiment. As shown in Fig. 2, the data processing method includes the following steps:
S210: acquiring current network traffic data generated while a user uses the network.
S220: acquiring an information processing rule.
S230: when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension.
S240: backfilling the target user information into the current network traffic data.
S250: when the target user information of the current network traffic data in the target dimension is not missing, extracting the target user information and the associated user information of the target user information from the existing user information set in the current network traffic data according to the information processing rule.
For example, take the user's mobile phone number as the target dimension and the user's WeChat ID as the associated dimension, and assume that the WeChat ID in the current network traffic data is 123. When it is determined that the target user information in the dimension "mobile phone number" is present, the target user information in the dimension "mobile phone number" and the existing user information in the dimension "WeChat ID" are extracted from the current network traffic data; the existing user information in the dimension "WeChat ID" is the associated user information of the target user information.
S260: associating the target user information with the associated user information, and storing the associated data in the knowledge base according to the information processing rule.
After the target user information and the associated user information are obtained, they are stored in association with each other according to the information processing rule. Continuing the example in S250, the target user information in the dimension "mobile phone number" and the existing user information in the dimension "WeChat ID" are stored in the knowledge base in association, so that when target user information is missing from subsequent network traffic data it can be backfilled from the knowledge base.
According to the technical solution of this embodiment, when the target user information of the current network traffic data in the target dimension is not missing, the information processing rule is used to extract the target user information and its associated user information from the current network traffic data, and the two are stored in the knowledge base in association. By learning in real time from higher-quality network traffic data, the knowledge base accumulates the user information of many users in the target dimension and the associated dimension, in preparation for backfilling subsequent lower-quality network traffic data.
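The learning path of S250 and S260 might look like the following sketch, which stores an association only when both dimensions are present in the record. The dictionary-based knowledge base and the name `learn` are assumptions for illustration; the patent does not prescribe a storage layout.

```python
def learn(record: dict, target_dim: str, assoc_dim: str, kb: dict) -> bool:
    """Store the (associated value -> target value) pair; return True if stored."""
    target_value = record.get(target_dim)
    assoc_value = record.get(assoc_dim)
    if not target_value or not assoc_value:
        return False  # incomplete record: nothing reliable to learn from
    # S260: associate the two values under the rule's dimensions.
    kb.setdefault((assoc_dim, assoc_value), {})[target_dim] = target_value
    return True


knowledge_base: dict = {}
# Hypothetical complete record; the phone number is a placeholder value.
stored = learn({"wechat_id": "123", "phone_number": "13800000000"},
               "phone_number", "wechat_id", knowledge_base)
```

A later record with the same WeChat ID but no phone number could then be completed from `knowledge_base`.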
Optionally, a data aging time may be set for the associated data, and the associated data is deleted from the knowledge base once the data aging time has expired. The aging time is predefined according to the actual situation. Once it has elapsed, the data in the knowledge base is considered to have no reference value, and network traffic data completed with it may even be wrong, so aged data is deleted from the knowledge base promptly.
The advantage of this arrangement is that it safeguards the reliability of the knowledge base and prevents stale, incorrect information from being obtained from it.
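One possible way to realize the data aging time is to timestamp each entry and discard entries older than a configured lifetime. The class below is a minimal sketch under that assumption; the name `AgingKnowledgeBase`, the single fixed lifetime, and the explicit `now` parameter (used so the behavior can be shown without waiting) are all hypothetical.

```python
import time


class AgingKnowledgeBase:
    """Key-value store whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._entries = {}  # key -> (value, time stored)

    def put(self, key, value, now=None):
        self._entries[key] = (value, time.time() if now is None else now)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if now - stored_at > self.ttl:
            del self._entries[key]  # aged out: delete instead of returning it
            return None
        return value

    def purge(self, now=None) -> int:
        """Delete every aged entry and return how many were removed."""
        now = time.time() if now is None else now
        expired = [k for k, (_, t) in self._entries.items() if now - t > self.ttl]
        for k in expired:
            del self._entries[k]
        return len(expired)


aging_kb = AgingKnowledgeBase(ttl_seconds=60)
aging_kb.put(("wechat_id", "123"), "13800000000", now=0)
fresh = aging_kb.get(("wechat_id", "123"), now=30)   # still within the aging time
stale = aging_kb.get(("wechat_id", "123"), now=61)   # aging time exceeded
```

Deleting on read keeps stale values from ever being returned, while `purge` can be run periodically to reclaim space for entries that are never read again.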
Example Three
Fig. 3 is another flowchart of a data processing method according to the third embodiment of the present invention; the third embodiment further refines the technical solution of the first embodiment. As shown in Fig. 3, the data processing method includes the following steps:
S310: acquiring current network traffic data generated while a user uses the network.
S320: acquiring an information processing rule.
S330: when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension.
S340: backfilling the target user information into the current network traffic data.
S350: determining a target extraction field corresponding to the service operation type according to the service operation type in the backfilled current network traffic data and a preset rule mapping relation, where the rule mapping relation comprises mappings between different service operation types and extraction fields.
Backfilling the target user information completes the current network traffic data, so the data processing apparatus can subsequently mine the completed network traffic data further. The data processing apparatus may preset a rule mapping relation that indicates the mapping between service operation types and extraction fields. Different service operation types are concerned with different information, so corresponding extraction fields can be set for each service operation type, and only the information in the extraction fields relevant to the service operation type is extracted from the current network traffic data. For example, when the service operation type is sending a video through WeChat, the corresponding extraction fields may include the time and place of the video transmission, the occupied base station, the sender's mobile phone number, WeChat ID, and IMSI, and the receiver's WeChat ID.
The data processing apparatus can therefore read the user's service operation type from the current network traffic data and determine the corresponding target extraction field from the preset rule mapping relation.
S360: performing structured extraction on the backfilled current network traffic data according to the target extraction field to form new network traffic data.
The data processing apparatus extracts the information in the target extraction field from the current network traffic data to form new network traffic data. Continuing the example in S350, the information in fields such as the time, place, occupied base station, and the sender's mobile phone number, WeChat ID, and IMSI, together with the information in fields such as the receiver's WeChat ID, is extracted from the current network traffic data, and the extraction result is used as new network traffic data for subsequent mining.
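Steps S350 and S360 can be illustrated with a small mapping from service operation type to extraction fields. The field names, the `operation_type` key, and the operation type string below are assumptions chosen to mirror the WeChat video example; the patent does not fix any concrete naming.

```python
# Assumed rule mapping relation: service operation type -> extraction fields.
RULE_MAPPING = {
    "wechat_send_video": [
        "time", "place", "base_station",
        "sender_phone_number", "sender_wechat_id", "sender_imsi",
        "receiver_wechat_id",
    ],
}


def extract_structured(record: dict, rule_mapping: dict) -> dict:
    """Keep only the fields mapped to the record's service operation type."""
    fields = rule_mapping.get(record.get("operation_type"), [])  # S350
    return {f: record[f] for f in fields if f in record}         # S360


# Hypothetical backfilled record; every value is a placeholder.
backfilled_record = {
    "operation_type": "wechat_send_video",
    "time": "2021-07-22T10:00:00",
    "place": "Beijing",
    "base_station": "BS-001",
    "sender_phone_number": "13800000000",
    "sender_wechat_id": "123",
    "sender_imsi": "460001234567890",
    "receiver_wechat_id": "456",
    "raw_payload": "...",  # not needed for this operation type
}
new_record = extract_structured(backfilled_record, RULE_MAPPING)
```

Because `raw_payload` is not among the mapped fields, it is dropped from `new_record`, which is how the extracted data ends up smaller than the original.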
According to the technical solution of this embodiment, the backfilled current network traffic data is structurally extracted using the rule mapping relation, so the new network traffic data is smaller in volume than the original data. This addresses the slow processing and high resource consumption caused by large data volumes and improves the efficiency of subsequent processing of the network traffic data.
On the basis of the above embodiment, after the new network traffic data is formed, the new network traffic data is optionally stored and the backfilled current network traffic data is filtered out. Because storage space in the data processing apparatus is limited, the current network traffic data from before the structured extraction may be deleted after the new network traffic data has been stored.
The advantage of this arrangement is that duplicate or useless network traffic data is filtered out, which saves storage space.
Example Four
Fig. 4 is a schematic structural diagram of a data processing apparatus according to a fourth embodiment of the present invention, which can execute a data processing method according to the foregoing embodiments. Referring to fig. 4, the apparatus includes:
a first obtaining module 410, configured to obtain current network traffic data generated while a user uses the network;
a second obtaining module 420, configured to obtain an information processing rule, where the information processing rule indicates a target dimension for processing the current network traffic data and an associated dimension of the target dimension;
a determining module 430, configured to determine, when the target user information of the current network traffic data in the target dimension is missing, the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension, where the knowledge base stores user information of the same user in the target dimension and the associated dimension, and that user information is obtained by learning, in real time, from network traffic data generated while users use the network; and
a backfilling module 440, configured to backfill the target user information into the current network traffic data.
According to the technical solution of this embodiment, when the target user information of the current network traffic data in the target dimension is missing, it can be backfilled using the existing user information of the current network traffic data in the associated dimension together with the knowledge base accumulated by learning, in real time, from network traffic data generated while users use the network. This makes the current network traffic data more complete and improves its quality.
On the basis of the above embodiment, the data processing apparatus may further include:
an information extraction module, configured to, when target user information of the current network traffic data in the target dimension is not missing, extract, according to the information processing rule, the target user information and associated user information of the target user information from an existing user information set in the current network traffic data;
a data storage module, configured to associate the target user information with the associated user information and store the associated data in the knowledge base according to the information processing rule.
On the basis of the foregoing embodiments, the data processing apparatus may further include:
an aging timing module, configured to set a corresponding data aging time for the associated data;
a data deleting module, configured to delete the associated data from the knowledge base after the data aging time has expired.
On the basis of the foregoing embodiment, the determining module 430 is further configured to determine whether the content of the current network traffic data in the target dimension is empty; if yes, determining that the target user information in the current network traffic data is missing; if not, determining that the target user information in the current network traffic data is not missing.
On the basis of the above embodiment, the data processing apparatus may further include:
the target extraction field determining module is used for determining a target extraction field corresponding to the business operation type according to the business operation type in the backfilled current network traffic data and a preset rule mapping relation; the rule mapping relation comprises mapping relations between different business operation types and extraction fields;
and the structured extraction module is used for performing structured extraction on the backfilled current network traffic data according to the target extraction field to form new network traffic data.
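The rule mapping relation and the structured extraction can be illustrated with the sketch below. The operation types `login` and `file_upload`, the field names, and the `operation_type` key are hypothetical examples, not values taken from the embodiment.

```python
from typing import Any, Dict, List, Optional

# Hypothetical rule mapping relation: business operation type -> extraction fields.
RULE_MAPPING: Dict[str, List[str]] = {
    "login": ["user_name", "ip", "timestamp"],
    "file_upload": ["user_name", "ip", "file_name", "file_size"],
}


def structured_extract(
    record: Dict[str, Any], rule_mapping: Dict[str, List[str]]
) -> Optional[Dict[str, Any]]:
    """Build a new record holding only the fields mapped to the operation type.

    Returns None when the operation type has no entry in the mapping.
    """
    fields = rule_mapping.get(record.get("operation_type"))
    if fields is None:
        return None
    # Fields absent from the source record are kept as None so that every
    # new record of a given operation type has the same shape.
    return {field: record.get(field) for field in fields}
```

Because every new record of one operation type carries the same fields, downstream storage and analysis can rely on a fixed structure per type.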
On the basis of the above embodiment, the data processing apparatus may further include:
the new network traffic data storage module is used for storing the new network traffic data and filtering out the backfilled current network traffic data.
On the basis of the foregoing embodiment, the first obtaining module 410 is specifically configured to obtain, from the card system, current network traffic data generated in a process of using a network by a user.
EXAMPLE five
Fig. 5 is a schematic structural diagram of a data processing device according to a fifth embodiment of the present invention. As shown in fig. 5, the device includes a processor 510, a memory 520, an input device 530, and an output device 540. The device may have one or more processors 510; one processor 510 is taken as an example in fig. 5. The processor 510, the memory 520, the input device 530, and the output device 540 of the device may be connected by a bus or by other means; a bus connection is taken as an example in fig. 5.
The memory 520 is a computer-readable storage medium that can store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the data processing method in the embodiments of the present invention (for example, the first obtaining module 410, the second obtaining module 420, the determining module 430, and the backfilling module 440 in the data processing apparatus). By running the software programs, instructions, and modules stored in the memory 520, the processor 510 executes the various functional applications and data processing of the device, that is, implements the data processing method described above. The method comprises the following steps:
acquiring current network traffic data generated in the network using process of a user;
acquiring an information processing rule; the information processing rule is used for indicating a target dimension for processing the current network traffic data and an associated dimension of the target dimension;
when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension; the knowledge base stores user information of the same user in the target dimension and in the associated dimension, and the user information is obtained by learning, in real time, network traffic data generated in the network using process of the user;
and correspondingly backfilling the target user information into the current network traffic data.
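The steps above can be sketched as a single function. This is a minimal illustration under stated assumptions: records are dictionaries, the knowledge base is a dictionary keyed by (associated dimension, associated value), and the dimension names `user_name` and `ip` are hypothetical.

```python
from typing import Dict, Optional, Tuple

Record = Dict[str, Optional[str]]
# Assumed knowledge base layout: (associated dimension, value) -> target value.
KnowledgeBase = Dict[Tuple[str, str], str]


def backfill(record: Record, target: str, associated: str, kb: KnowledgeBase) -> Record:
    """Return a copy of record with the target dimension filled in when possible."""
    filled = dict(record)
    if filled.get(target):
        # Target user information is not missing; nothing to backfill.
        return filled
    assoc_value = filled.get(associated)
    if not assoc_value:
        # No existing user information in the associated dimension to look up.
        return filled
    known = kb.get((associated, assoc_value))
    if known is not None:
        # Backfill the target user information found in the knowledge base.
        filled[target] = known
    return filled


kb: KnowledgeBase = {("ip", "10.0.0.5"): "alice"}
completed = backfill({"ip": "10.0.0.5", "user_name": None}, "user_name", "ip", kb)
```

The function returns a copy rather than modifying its input, so the original record remains available for comparison; an implementation could equally update the record in place.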
The memory 520 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the terminal, and the like. Further, the memory 520 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 520 may further include memory located remotely from the processor 510, which may be connected to the device/terminal/server via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 530 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the apparatus. The output device 540 may include a display device such as a display screen.
According to the technical scheme of this embodiment, when the target user information of the current network traffic data in the target dimension is missing, it can be backfilled using the existing user information of the current network traffic data in the associated dimension of the target dimension, together with the knowledge base accumulated by learning, in real time, the network traffic data generated while users use the network. The current network traffic data thereby becomes more complete, and the quality of the network traffic data is improved.
EXAMPLE six
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements steps of a data processing method, and the method includes:
acquiring current network traffic data generated in the network using process of a user;
acquiring an information processing rule; the information processing rule is used for indicating a target dimension for processing the current network traffic data and an associated dimension of the target dimension;
when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension; the knowledge base stores user information of the same user in the target dimension and in the associated dimension, and the user information is obtained by learning, in real time, network traffic data generated in the network using process of the user;
and correspondingly backfilling the target user information into the current network traffic data.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in a data processing method provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
According to the technical scheme of this embodiment, when the target user information of the current network traffic data in the target dimension is missing, it can be backfilled using the existing user information of the current network traffic data in the associated dimension of the target dimension, together with the knowledge base accumulated by learning, in real time, the network traffic data generated while users use the network. The current network traffic data thereby becomes more complete, and the quality of the network traffic data is improved.
It should be noted that, in the above embodiment of the data processing apparatus, the included units and modules are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.
Claims (10)
1. A data processing method, comprising:
acquiring current network traffic data generated in the network using process of a user;
acquiring an information processing rule; the information processing rule is used for indicating a target dimension for processing the current network traffic data and an associated dimension of the target dimension;
when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension; the knowledge base stores user information of the same user in the target dimension and in the associated dimension, and the user information is obtained by learning, in real time, network traffic data generated in the network using process of the user;
and correspondingly backfilling the target user information into the current network traffic data.
2. The method of claim 1, further comprising:
when the target user information of the current network traffic data in the target dimension is not missing, extracting the target user information and the associated user information of the target user information from the existing user information set in the current network traffic data according to the information processing rule;
and associating the target user information with the associated user information, and storing the associated data into the knowledge base according to the information processing rule.
3. The method of claim 2, further comprising:
setting corresponding data aging time for the associated data;
and deleting the associated data from the knowledge base after the data aging time expires.
4. The method of any of claims 1 to 3, wherein determining whether target user information for the current network traffic data in the target dimension is missing comprises:
judging whether the content of the current network traffic data in the target dimension is empty or not;
if yes, determining that the target user information in the current network traffic data is missing;
if not, determining that the target user information in the current network traffic data is not missing.
5. The method of any of claims 1 to 3, further comprising:
determining a target extraction field corresponding to the business operation type according to the business operation type in the backfilled current network traffic data and a preset rule mapping relation; the rule mapping relation comprises mapping relations between different business operation types and extraction fields;
and according to the target extraction field, performing structured extraction on the backfilled current network traffic data to form new network traffic data.
6. The method of claim 5, further comprising:
storing the new network traffic data, and filtering out the backfilled current network traffic data.
7. The method according to any one of claims 1 to 3, wherein the obtaining current network traffic data generated during the network usage process of the user comprises:
and acquiring current network flow data generated in the network using process of the user from the card system.
8. A data processing apparatus, comprising:
the first acquisition module is used for acquiring current network traffic data generated in the network using process of a user;
the second acquisition module is used for acquiring an information processing rule; the information processing rule is used for indicating a target dimension for processing the current network traffic data and an associated dimension of the target dimension;
the determining module is used for determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension when the target user information of the current network traffic data in the target dimension is missing; the knowledge base stores user information of the same user in the target dimension and in the associated dimension, and the user information is obtained by learning, in real time, network traffic data generated in the network using process of the user;
and the backfilling module is used for correspondingly backfilling the target user information into the current network traffic data.
9. A data processing device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor realizes the steps of the method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110829558.0A CN113535702B (en) | 2021-07-22 | 2021-07-22 | Data processing method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110829558.0A CN113535702B (en) | 2021-07-22 | 2021-07-22 | Data processing method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113535702A (en) | 2021-10-22 |
CN113535702B (en) | 2024-03-26 |
Family
ID=78120433
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110829558.0A Active CN113535702B (en) | 2021-07-22 | 2021-07-22 | Data processing method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113535702B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7593351B1 (en) * | 2005-06-30 | 2009-09-22 | Opnet Technologies, Inc. | Method and system for collecting and consolidating network traffic information |
CN110134830A (en) * | 2019-04-15 | 2019-08-16 | 深圳壹账通智能科技有限公司 | Video information data processing method, device, computer equipment and storage medium |
CN110287346A (en) * | 2019-06-28 | 2019-09-27 | 深圳云天励飞技术有限公司 | Date storage method, device, server and storage medium |
CN110784498A (en) * | 2018-07-31 | 2020-02-11 | 阿里巴巴集团控股有限公司 | Personalized data disaster tolerance method and device |
CN112532441A (en) * | 2020-11-24 | 2021-03-19 | 成都西加云杉科技有限公司 | Network diagnosis and repair method, device, equipment and medium |
CN112559538A (en) * | 2020-11-11 | 2021-03-26 | 中广核工程有限公司 | Incidence relation generation method and device, computer equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN113535702B (en) | 2024-03-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||