CN113535702A - Data processing method, device, equipment and storage medium - Google Patents

Data processing method, device, equipment and storage medium

Info

Publication number
CN113535702A
Authority
CN
China
Prior art keywords
user information
target
current network
data
dimension
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110829558.0A
Other languages
Chinese (zh)
Other versions
CN113535702B (en)
Inventor
任亮
熊伟
程强
万月亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ruian Technology Co Ltd
Original Assignee
Beijing Ruian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Ruian Technology Co Ltd
Priority to CN202110829558.0A
Publication of CN113535702A
Application granted
Publication of CN113535702B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24564 Applying rules; Deductive queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiments of the invention disclose a data processing method, apparatus, device, and storage medium. The method comprises: acquiring current network traffic data generated while a user uses the network; acquiring an information processing rule, where the information processing rule indicates a target dimension for processing the current network traffic data and an associated dimension of the target dimension; when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension, where the knowledge base stores user information of the same user in the target dimension and the associated dimension, and the user information is obtained by learning, in real time, from network traffic data generated while users use the network; and backfilling the target user information into the current network traffic data. By backfilling missing target user information from a knowledge base accumulated through learning, the invention makes the network traffic data more complete.

Description

Data processing method, device, equipment and storage medium
Technical Field
The present application relates to the field of data processing, and in particular, to a data processing method, apparatus, device, and storage medium.
Background
With the continuous development of network communication technology, the need to mine network traffic data is increasingly common. In the prior art, network traffic data is mined mainly by standardizing the data and filtering out problem data.
In the process of implementing the invention, the inventors found the following defect in the prior art: when the original network traffic data is incomplete, network traffic data obtained through standardization alone may be of low quality and therefore cannot provide data support for subsequent service systems.
Disclosure of Invention
Embodiments of the present invention provide a data processing method, apparatus, device, and storage medium. By setting an information processing rule and determining target user information from a knowledge base according to that rule, they address the problem that, when the original network traffic data is incomplete, network traffic data obtained through standardization alone may be of low quality.
In a first aspect, an embodiment of the present invention provides a data processing method, including:
acquiring current network traffic data generated while a user uses the network;
acquiring an information processing rule; the information processing rule is used for indicating a target dimension for processing the current network traffic data and an associated dimension of the target dimension;
when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension; the knowledge base stores user information of the same user in the target dimension and the associated dimension, and the user information is obtained by learning, in real time, from network traffic data generated while users use the network;
and backfilling the target user information into the current network traffic data.
In a second aspect, an embodiment of the present invention further provides a data processing apparatus, including:
a first obtaining module, configured to acquire current network traffic data generated while a user uses the network;
a second obtaining module, configured to acquire an information processing rule; the information processing rule is used for indicating a target dimension for processing the current network traffic data and an associated dimension of the target dimension;
a determining module, configured to determine, when the target user information of the current network traffic data in the target dimension is missing, the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension; the knowledge base stores user information of the same user in the target dimension and the associated dimension, and the user information is obtained by learning, in real time, from network traffic data generated while users use the network;
and a backfilling module, configured to backfill the target user information into the current network traffic data.
In a third aspect, an embodiment of the present invention further provides a data processing apparatus, including a memory and a processor, where the memory stores a computer program, and the processor implements the steps of the method in any one of the embodiments when executing the computer program.
In a fourth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method in any one of the embodiments.
According to the technical solution of this embodiment, when the target user information of the current network traffic data in the target dimension is missing, it can be backfilled using the existing user information of the current network traffic data in the associated dimension together with the knowledge base, which is accumulated by learning, in real time, from network traffic data generated while users use the network. The current network traffic data thus becomes more complete, and the quality of the network traffic data improves.
Drawings
Fig. 1 is a flowchart of a data processing method according to a first embodiment of the present invention;
Fig. 2 is another flowchart of a data processing method according to a second embodiment of the present invention;
Fig. 3 is another flowchart of a data processing method according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a data processing apparatus according to a fourth embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a data processing apparatus according to a fifth embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad invention. It should be further noted that, for convenience of description, only some structures, not all structures, relating to the embodiments of the present invention are shown in the drawings.
Example one
Fig. 1 is a flowchart of a data processing method according to an embodiment of the present invention, where this embodiment is applicable to a situation where mining processing is performed on network traffic data with data loss, and the method may be executed by a data processing apparatus, and specifically includes the following steps:
S110, acquiring current network traffic data generated while a user uses the network.
While using the internet, the user terminal continuously generates raw internet access data through interaction with servers, and network traffic data is obtained by parsing that raw data. Network traffic data may include various information about the user's internet access, for example the generation time of the piece of network traffic data, the user information of the user to whom it belongs, and the content of the user's online operation (for example, sending a video in a WeChat group). Taking a WeChat user as an example, the user information may include the user terminal ID, the user's WeChat ID, the location from which the user accesses the internet, the user's mobile phone number, the user name, and related network operator information. The current network traffic data is the network traffic data currently to be processed.
The invention stores network traffic data in a card system. The card system is a system for recording or storing information; it can process data in batches and offers fast access. Network traffic data parsed in real time can therefore be fed into the card system, and when network traffic data needs to be processed, the current network traffic data generated while the user uses the network is obtained from the card system.
S120, acquiring an information processing rule.
The information processing rule indicates a target dimension for processing the current network traffic data and an associated dimension of the target dimension. The target dimension can be understood as the dimension holding the user information of interest, and the associated dimension is the dimension holding user information associated with that information. In general, the user information of network traffic data in the target dimension may be missing, whereas the user information in the associated dimension is unlikely to be missing, so the user information in the associated dimension can be used to determine the information missing from the target dimension. For example, take the information processing rule "mobile phone number/WeChat ID": suppose subsequent service analysis is mainly concerned with the user's mobile phone number, and the user's WeChat ID is generally unlikely to be missing from network traffic data. The user's mobile phone number can then be set as the target dimension and the user's WeChat ID as the associated dimension, so that the mobile phone number is determined from the WeChat ID. Note that the embodiments of the present application use the rule "mobile phone number/WeChat ID" only as an example. In practice, the information processing rule can be set according to actual service requirements; for example, the rule may be "WeChat ID/user terminal ID", where the user terminal ID is, for example, an International Mobile Subscriber Identity (IMSI).
In this way, when the current network traffic data is processed, the information processing rule for processing the current network traffic data can be acquired from the metadata corresponding to the current network traffic data, so that the target dimension and the associated dimension are acquired.
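A minimal sketch of how such a rule might be represented and read from metadata. The source does not specify an encoding, so the class name `InfoRule`, the metadata key `info_rule`, the `"target/associated"` string format, and the dimension names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfoRule:
    """Hypothetical representation of an information processing rule."""
    target_dimension: str      # dimension whose value may be missing
    associated_dimension: str  # dimension used to look the value up


def parse_rule(metadata: dict) -> InfoRule:
    # Assumed format: "target/associated", e.g. "phone_number/wechat_id".
    target, associated = metadata["info_rule"].split("/", 1)
    return InfoRule(target.strip(), associated.strip())


rule = parse_rule({"info_rule": "phone_number/wechat_id"})
```

Keeping the rule as data rather than code would let different rules, such as "WeChat ID/user terminal ID", be attached to different traffic sources without changing the processing logic.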
S130, when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension.
Optionally, the process of determining whether the target user information of the current network traffic data in the target dimension is missing may be: judging whether the content of the current network traffic data in the target dimension is empty or not; if yes, determining that the target user information in the current network traffic data is missing; if not, determining that the target user information in the current network traffic data is not missing.
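The emptiness check can be sketched as follows, with a record modeled as a plain dictionary. Treating an absent key, `None`, and a blank string all as "empty" is an assumption of this sketch; the source only says the content is checked for being empty.

```python
def is_missing(record: dict, target_dimension: str) -> bool:
    """Return True when the record has no content in the target dimension."""
    value = record.get(target_dimension)
    # Absent key, None, and whitespace-only strings all count as empty.
    return value is None or (isinstance(value, str) and value.strip() == "")
```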
The knowledge base stores user information of the same user in the target dimension and the associated dimension, and the user information is obtained by learning, in real time, from network traffic data generated while users use the network.
In a large volume of network traffic data, some of the data for a given user may be of higher quality and some of lower quality, so user information missing from the lower-quality data can be filled in by reference to the higher-quality data. By learning from the higher-quality network traffic data, the user information of the same user in the target dimension and the associated dimension can be learned, so that after a period of accumulated learning the knowledge base stores the user information of many users in the target dimension and the associated dimension.
Therefore, when the target user information of the current network traffic data in the target dimension is missing, the knowledge base can be searched using the existing user information of the current network traffic data in the associated dimension, and the user information associated with that existing user information is determined as the target user information of the current network traffic data in the target dimension. For example, take the target dimension as the user's mobile phone number and the associated dimension as the user's WeChat ID, and assume that the WeChat ID in the current network traffic data is 123. When the target user information of the current network traffic data in the dimension "mobile phone number" is determined to be missing, the user information associated with the WeChat ID "123" can be looked up in the knowledge base and determined as the target user information of the current network traffic data in the dimension "mobile phone number".
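The lookup in this example can be sketched with an in-memory dictionary standing in for the knowledge base. The key layout `(associated dimension, value)`, the function name `lookup_target`, and the phone number are made up for illustration; the source does not describe how the knowledge base is stored.

```python
from typing import Optional

# Hypothetical knowledge base:
# (associated dimension, value) -> {target dimension: value}
knowledge_base = {
    ("wechat_id", "123"): {"phone_number": "13800000000"},
}


def lookup_target(kb: dict, record: dict,
                  target_dim: str, assoc_dim: str) -> Optional[str]:
    """Find the target-dimension value via the associated-dimension value."""
    assoc_value = record.get(assoc_dim)
    if not assoc_value:
        return None  # nothing to search with
    return kb.get((assoc_dim, assoc_value), {}).get(target_dim)


found = lookup_target(knowledge_base,
                      {"wechat_id": "123", "phone_number": None},
                      "phone_number", "wechat_id")
```

Returning `None` when no entry exists leaves the decision about unresolved records to the caller.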
S140, backfilling the target user information into the current network traffic data.
The target user information obtained from the knowledge base in S130 is backfilled into the current network traffic data. Continuing the example in S130, once the target user information in the dimension "mobile phone number" has been determined, it can be backfilled into the current network traffic data, so that the current network traffic data no longer lacks the target user information.
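A small sketch of the backfilling step, again with records as dictionaries. Returning a copy instead of modifying the record in place, and leaving the record unchanged when the knowledge base yields no value, are choices made for this illustration and are not prescribed by the source. The field names and sample values are made up.

```python
from typing import Optional


def backfill(record: dict, target_dim: str,
             target_value: Optional[str]) -> dict:
    """Return a copy of the record with the target dimension filled in."""
    filled = dict(record)
    # Only fill when a value was found and the field is actually empty.
    if target_value is not None and not filled.get(target_dim):
        filled[target_dim] = target_value
    return filled


record = {"wechat_id": "123", "phone_number": None, "operation": "send_video"}
completed = backfill(record, "phone_number", "13800000000")
```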
According to the technical solution of this embodiment, when the target user information of the current network traffic data in the target dimension is missing, it can be backfilled using the existing user information of the current network traffic data in the associated dimension together with the knowledge base, which is accumulated by learning, in real time, from network traffic data generated while users use the network. The current network traffic data thus becomes more complete, and the quality of the network traffic data improves.
Example two
Fig. 2 is another flowchart of a data processing method provided in the second embodiment of the present invention, and the second embodiment further improves the technical solution of the first embodiment. As shown in fig. 2, a data processing method includes the steps of:
S210, acquiring current network traffic data generated while a user uses the network.
S220, acquiring an information processing rule.
S230, when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension.
S240, backfilling the target user information into the current network traffic data.
S250, when the target user information of the current network traffic data in the target dimension is not missing, extracting the target user information and the associated user information of the target user information from the existing user information set in the current network traffic data according to the information processing rule.
For example, take the target dimension as the user's mobile phone number and the associated dimension as the user's WeChat ID, and assume that the WeChat ID in the current network traffic data is 123. When the target user information of the current network traffic data in the dimension "mobile phone number" is determined to be present, the target user information in the dimension "mobile phone number" and the existing user information in the dimension "WeChat ID" are extracted from the current network traffic data; the existing user information in the dimension "WeChat ID" is the associated user information of the target user information.
S260, associating the target user information with the associated user information, and storing the associated data into the knowledge base according to the information processing rule.
After the target user information and the associated user information are obtained, they are stored in association with each other according to the information processing rule. Continuing the example in S250, the target user information in the dimension "mobile phone number" and the existing user information in the dimension "WeChat ID" are stored in the knowledge base in association, so that when target user information is missing from subsequent network traffic data, it can be backfilled using the information in the knowledge base.
According to the technical solution of this embodiment, when the target user information of the current network traffic data in the target dimension is not missing, the information processing rule is used to extract the target user information and its associated user information from the current network traffic data, and the two are stored in the knowledge base in association. By learning in real time from higher-quality network traffic data, the knowledge base accumulates the user information of many users in the target dimension and the associated dimension, in preparation for backfilling subsequent lower-quality network traffic data.
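Steps S250 and S260 can be sketched as a single function that learns from a complete record. The dictionary-based knowledge base, its key layout, the function name `learn`, and the sample values are assumptions for illustration only.

```python
def learn(kb: dict, record: dict, target_dim: str, assoc_dim: str) -> bool:
    """Store the (associated -> target) pair when both values are present."""
    target_value = record.get(target_dim)
    assoc_value = record.get(assoc_dim)
    if not target_value or not assoc_value:
        return False  # incomplete record: nothing to learn from it
    # Associate the two values under the associated-dimension key.
    kb.setdefault((assoc_dim, assoc_value), {})[target_dim] = target_value
    return True


kb = {}
learned = learn(kb, {"wechat_id": "123", "phone_number": "13800000000"},
                "phone_number", "wechat_id")
skipped = learn(kb, {"wechat_id": "456", "phone_number": ""},
                "phone_number", "wechat_id")
```

In this sketch a later record for the same WeChat ID overwrites the earlier phone number; whether the patented method overwrites or keeps history is not stated in the source.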
Optionally, a corresponding data aging time may be set for the associated data, and the associated data is deleted from the knowledge base once the data aging time expires. The aging time is predefined according to the actual situation. Once it expires, the data in the knowledge base is considered to have no reference value, and network traffic data completed from it may even be wrong, so aged data is deleted from the knowledge base promptly.
The advantage of this arrangement is that it safeguards the reliability of the knowledge base and prevents stale, incorrect information from being obtained from it.
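One way to picture the aging mechanism is a store that timestamps each entry and purges entries older than the aging time. The class `AgingKnowledgeBase`, its methods, and the purge-on-read strategy are hypothetical; the source does not say how or when expired data is detected. The `now` parameter exists only so the behavior can be demonstrated deterministically.

```python
import time


class AgingKnowledgeBase:
    """Hypothetical knowledge base whose entries expire after an aging time."""

    def __init__(self, aging_seconds: float):
        self.aging_seconds = aging_seconds
        self._entries = {}  # key -> (value, stored_at)

    def store(self, key, value, now=None):
        self._entries[key] = (value, time.time() if now is None else now)

    def purge_expired(self, now=None) -> int:
        """Delete entries older than the aging time; return how many."""
        now = time.time() if now is None else now
        expired = [k for k, (_, stored_at) in self._entries.items()
                   if now - stored_at > self.aging_seconds]
        for key in expired:
            del self._entries[key]
        return len(expired)

    def get(self, key, now=None):
        self.purge_expired(now)  # never return aged data
        entry = self._entries.get(key)
        return entry[0] if entry else None


aging_kb = AgingKnowledgeBase(aging_seconds=60)
aging_kb.store(("wechat_id", "123"), "13800000000", now=0)
```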
Example three
Fig. 3 is another flowchart of the data processing method provided in the third embodiment of the present invention, and the third embodiment further improves the technical solution of the first embodiment. As shown in fig. 3, a data processing method includes the steps of:
S310, acquiring current network traffic data generated while a user uses the network.
S320, acquiring an information processing rule.
S330, when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension.
S340, backfilling the target user information into the current network traffic data.
S350, determining a target extraction field corresponding to the service operation type according to the service operation type in the backfilled current network traffic data and a preset rule mapping relationship; the rule mapping relationship comprises mappings between different service operation types and extraction fields.
Backfilling the target user information completes the current network traffic data, so the data processing apparatus can subsequently mine the completed network traffic data further. The data processing apparatus may preset a rule mapping relationship that indicates the mapping between service operation types and extraction fields. Different service operation types are concerned with different information, so corresponding extraction fields can be set for each service operation type, and only the information in the extraction fields relevant to the service operation type is extracted from the current network traffic data. For example, when the service operation type is sending a video through WeChat, the corresponding extraction fields may include the time, place, occupied base station, mobile phone number, WeChat ID, and IMSI of the sender, as well as the WeChat ID of the receiver, among others.
Therefore, the data processing device can read the service operation type of the user from the current network traffic data, and determine the target extraction field corresponding to the service operation type from the preset rule mapping relation.
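The rule mapping relationship can be pictured as a table from service operation type to a list of extraction fields. The operation type name, the field names, and the record key `operation_type` below are invented for illustration and loosely follow the WeChat video example.

```python
# Hypothetical rule mapping: service operation type -> extraction fields.
RULE_MAPPING = {
    "wechat_send_video": [
        "time", "place", "base_station",
        "sender_phone_number", "sender_wechat_id", "sender_imsi",
        "receiver_wechat_id",
    ],
}


def target_fields(record: dict, mapping: dict) -> list:
    """Look up the extraction fields for the record's operation type."""
    # An unknown or absent operation type yields no fields (an assumption).
    return mapping.get(record.get("operation_type"), [])


fields = target_fields({"operation_type": "wechat_send_video"}, RULE_MAPPING)
```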
S360, performing structured extraction on the backfilled current network traffic data according to the target extraction field to form new network traffic data.
The data processing apparatus extracts the information in the target extraction fields from the current network traffic data to form new network traffic data. Continuing the example in S350, information in fields such as the sender's time, place, occupied base station, mobile phone number, WeChat ID, and IMSI, and in fields such as the receiver's WeChat ID, is extracted from the current network traffic data, and the extraction result is used as new network traffic data for subsequent mining.
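Structured extraction then amounts to projecting the backfilled record onto the target extraction fields. In this sketch, fields absent from the record are kept as `None` so that every extracted record has the same shape; that choice, the field names, and the sample values are assumptions.

```python
def structured_extract(record: dict, fields: list) -> dict:
    """Keep only the target extraction fields, in the order given."""
    return {field: record.get(field) for field in fields}


backfilled = {
    "time": "2021-07-22T10:00:00",
    "sender_wechat_id": "123",
    "sender_phone_number": "13800000000",
    "receiver_wechat_id": "456",
    "raw_payload": "...",  # not needed by later mining
}
wanted = ["time", "sender_phone_number", "sender_wechat_id",
          "receiver_wechat_id", "sender_imsi"]
new_record = structured_extract(backfilled, wanted)
```

The extracted record drops `raw_payload`, which is the kind of reduction in data volume that the structured extraction step is meant to achieve.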
According to the technical solution of this embodiment, the backfilled current network traffic data is structurally extracted using the rule mapping relationship, so the new network traffic data is smaller in volume than the original data. This alleviates the slow processing and high resource consumption caused by large data volumes and improves the efficiency of subsequent processing of the network traffic data.
After new network traffic data is formed, on the basis of the above embodiment, optionally, the new network traffic data is stored, and the backfilled current network traffic data is filtered. Due to the limited storage space in the data processing device, the current network traffic data before structured extraction may be deleted after storing the new network traffic data.
The advantage of this arrangement is that duplicate or useless network traffic data is filtered out, thereby saving storage space.
Example four
Fig. 4 is a schematic structural diagram of a data processing apparatus according to a fourth embodiment of the present invention, which can execute a data processing method according to the foregoing embodiments. Referring to fig. 4, the apparatus includes:
a first obtaining module 410, configured to obtain current network traffic data generated in a network using process of a user;
a second obtaining module 420, configured to acquire an information processing rule; the information processing rule is used for indicating a target dimension for processing the current network traffic data and an associated dimension of the target dimension;
a determining module 430, configured to determine, when target user information of the current network traffic data in the target dimension is missing, the target user information from a knowledge base according to existing user information of the current network traffic data in the associated dimension; the knowledge base stores user information of the same user in the target dimension and the associated dimension, and the user information is obtained by learning, in real time, from network traffic data generated while users use the network;
a backfilling module 440, configured to backfill the target user information into the current network traffic data.
According to the technical solution of this embodiment, when the target user information of the current network traffic data in the target dimension is missing, it can be backfilled using the existing user information of the current network traffic data in the associated dimension together with the knowledge base, which is accumulated by learning, in real time, from network traffic data generated while users use the network. The current network traffic data thus becomes more complete, and the quality of the network traffic data improves.
On the basis of the above embodiment, the data processing apparatus may further include:
an information extraction module, configured to, when target user information of the current network traffic data in the target dimension is not missing, extract, according to the information processing rule, the target user information and associated user information of the target user information from an existing user information set in the current network traffic data;
and the data storage module is used for associating the target user information with the associated user information and storing the associated data into the knowledge base according to the information processing rule.
On the basis of the foregoing embodiments, the data processing apparatus may further include:
an aging timing module, configured to set a corresponding data aging time for the associated data;
and a data deleting module, configured to delete the associated data from the knowledge base after the data aging time expires.
On the basis of the foregoing embodiment, the determining module 430 is further configured to determine whether the content of the current network traffic data in the target dimension is empty; if yes, determining that the target user information in the current network traffic data is missing; if not, determining that the target user information in the current network traffic data is not missing.
On the basis of the above embodiment, the data processing apparatus may further include:
a target extraction field determining module, configured to determine a target extraction field corresponding to the service operation type according to the service operation type in the backfilled current network traffic data and a preset rule mapping relationship; the rule mapping relationship comprises mappings between different service operation types and extraction fields;
and a structured extraction module, configured to perform structured extraction on the backfilled current network traffic data according to the target extraction field to form new network traffic data.
On the basis of the above embodiment, the data processing apparatus may further include:
a new network traffic data storage module, configured to store the new network traffic data and filter out the backfilled current network traffic data.
On the basis of the foregoing embodiment, the first obtaining module 410 is specifically configured to obtain, from the card system, current network traffic data generated in a process of using a network by a user.
Example five
Fig. 5 is a schematic structural diagram of a data processing apparatus according to a fifth embodiment of the present invention. As shown in Fig. 5, the apparatus includes a processor 510, a memory 520, an input device 530, and an output device 540. The apparatus may contain one or more processors 510; one processor 510 is taken as an example in Fig. 5. The processor 510, the memory 520, the input device 530, and the output device 540 may be connected by a bus or by other means; a bus connection is taken as an example in Fig. 5.
The memory 520 is a computer-readable storage medium for storing software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the data processing method in the embodiments of the present invention (for example, the first obtaining module 410, the second obtaining module 420, the determining module 430, and the backfilling module 440 in the data processing apparatus). By running the software programs, instructions, and modules stored in the memory 520, the processor 510 executes the various functional applications and data processing of the apparatus, thereby implementing the data processing method described above. The method comprises the following steps:
acquiring current network flow data generated in the network using process of a user;
acquiring an information processing rule; the information processing rule is used for indicating a target dimension for processing current network traffic data and an associated dimension of the target dimension;
when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension; the knowledge base stores user information of the same user under the target dimension and the associated dimension, and the user information is obtained by learning network flow data generated in the network using process of the user in real time;
and correspondingly backfilling the target user information into the current network flow data.
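The steps above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the claimed implementation: records are plain dictionaries, a dimension counts as missing when its content is empty, and the knowledge base is an in-memory dictionary keyed by associated dimension, associated value, and target dimension. The names `is_missing`, `backfill`, `phone`, and `imsi` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory knowledge base:
# (associated dimension, associated value, target dimension) -> target value
KnowledgeBase = Dict[Tuple[str, str, str], str]


def is_missing(record: Dict[str, Optional[str]], dimension: str) -> bool:
    """Target user information is missing when the dimension's content is empty."""
    value = record.get(dimension)
    return value is None or value == ""


def backfill(record: Dict[str, Optional[str]],
             target_dim: str,
             associated_dims: List[str],
             kb: KnowledgeBase) -> bool:
    """Fill the target dimension from the knowledge base if it is missing.

    Returns True when a value was backfilled into the record.
    """
    if not is_missing(record, target_dim):
        return False  # nothing to backfill
    for assoc_dim in associated_dims:
        assoc_value = record.get(assoc_dim)
        if not assoc_value:
            continue  # this associated dimension carries no user information
        target_value = kb.get((assoc_dim, assoc_value, target_dim))
        if target_value is not None:
            record[target_dim] = target_value  # backfill into the current record
            return True
    return False


kb: KnowledgeBase = {("imsi", "460001234567890", "phone"): "13800000000"}
record = {"phone": "", "imsi": "460001234567890"}
filled = backfill(record, "phone", ["imsi"], kb)
```

After the call, `record["phone"]` holds the value learned earlier for the same `imsi`. A record whose associated dimensions are all empty, or unknown to the knowledge base, is left unchanged.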
The memory 520 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the terminal, and the like. Further, the memory 520 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 520 may further include memory located remotely from the processor 510, which may be connected to the device/terminal/server via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 530 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the apparatus. The output device 540 may include a display device such as a display screen.
According to the technical solution of this embodiment, when the target user information of the current network traffic data in the target dimension is missing, it can be backfilled using the existing user information of the current network traffic data in the associated dimension of the target dimension, together with the knowledge base accumulated by learning, in real time, from the network traffic data generated while users use the network. The current network traffic data thereby becomes more complete, and the quality of the network traffic data is improved.
EXAMPLE six
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements steps of a data processing method, and the method includes:
acquiring current network flow data generated in the network using process of a user;
acquiring an information processing rule; the information processing rule is used for indicating a target dimension for processing current network traffic data and an associated dimension of the target dimension;
when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension; the knowledge base stores user information of the same user under the target dimension and the associated dimension, and the user information is obtained by learning network flow data generated in the network using process of the user in real time;
and correspondingly backfilling the target user information into the current network flow data.
Of course, the computer-executable instructions contained in the storage medium provided by the embodiment of the present invention are not limited to the method operations described above, and may also perform related operations in the data processing method provided by any embodiment of the present invention.
From the above description of the embodiments, it will be clear to those skilled in the art that the present invention can be implemented by software together with the necessary general-purpose hardware, and can also be implemented by hardware alone, although the former is the better implementation in many cases. Based on this understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a flash memory (FLASH), a hard disk, or an optical disk of a computer, and which includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
According to the technical solution of this embodiment, when the target user information of the current network traffic data in the target dimension is missing, it can be backfilled using the existing user information of the current network traffic data in the associated dimension of the target dimension, together with the knowledge base accumulated by learning, in real time, from the network traffic data generated while users use the network. The current network traffic data thereby becomes more complete, and the quality of the network traffic data is improved.
It should be noted that, in the above embodiment of the data processing apparatus, the included units and modules are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A data processing method, comprising:
acquiring current network flow data generated in the network using process of a user;
acquiring an information processing rule; the information processing rule is used for indicating a target dimension for processing current network traffic data and an associated dimension of the target dimension;
when the target user information of the current network traffic data in the target dimension is missing, determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension; the knowledge base stores user information of the same user under the target dimension and the associated dimension, and the user information is obtained by learning network flow data generated in the network using process of the user in real time;
and correspondingly backfilling the target user information into the current network flow data.
2. The method of claim 1, further comprising:
when the target user information of the current network traffic data in the target dimension is not missing, extracting the target user information and the associated user information of the target user information from the existing user information set in the current network traffic data according to the information processing rule;
and associating the target user information with the associated user information, and storing the associated data into the knowledge base according to the information processing rule.
3. The method of claim 2, further comprising:
setting corresponding data aging time for the associated data;
and deleting the associated data from the knowledge base after the data aging time is overtime.
4. The method of any of claims 1 to 3, wherein determining whether target user information for the current network traffic data in the target dimension is missing comprises:
judging whether the content of the current network traffic data in the target dimension is empty or not;
if yes, determining that the target user information in the current network traffic data is missing;
if not, determining that the target user information in the current network traffic data is not missing.
5. The method of any of claims 1 to 3, further comprising:
determining a target extraction field corresponding to the business operation type according to the business operation type in the backfilled current network flow data and a preset rule mapping relation; the rule mapping relation comprises mapping relations between different business operation types and extraction fields;
and according to the target extraction field, performing structured extraction on the backfilled current network flow data to form new network flow data.
6. The method of claim 5, further comprising:
and storing the new network flow data, and filtering the backfilled current network flow data.
7. The method according to any one of claims 1 to 3, wherein the obtaining current network traffic data generated during the network usage process of the user comprises:
and acquiring current network flow data generated in the network using process of the user from the card system.
8. A data processing apparatus, comprising:
the first acquisition module is used for acquiring current network flow data generated in the network using process of a user;
the second acquisition module is used for acquiring an information processing rule; the information processing rule is used for indicating a target dimension for processing current network traffic data and an associated dimension of the target dimension;
the determining module is used for determining the target user information from a knowledge base according to the existing user information of the current network traffic data in the associated dimension when the target user information of the current network traffic data in the target dimension is missing; the knowledge base stores user information of the same user under the target dimension and the associated dimension, and the user information is obtained by learning network flow data generated in the network using process of the user in real time;
and the backfilling module is used for backfilling the target user information into the current network flow data correspondingly.
9. A data processing device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor realizes the steps of the method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202110829558.0A 2021-07-22 2021-07-22 Data processing method, device, equipment and storage medium Active CN113535702B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110829558.0A CN113535702B (en) 2021-07-22 2021-07-22 Data processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113535702A true CN113535702A (en) 2021-10-22
CN113535702B CN113535702B (en) 2024-03-26

Family

ID=78120433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110829558.0A Active CN113535702B (en) 2021-07-22 2021-07-22 Data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113535702B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7593351B1 (en) * 2005-06-30 2009-09-22 Opnet Technologies, Inc. Method and system for collecting and consolidating network traffic information
CN110134830A (en) * 2019-04-15 2019-08-16 深圳壹账通智能科技有限公司 Video information data processing method, device, computer equipment and storage medium
CN110287346A (en) * 2019-06-28 2019-09-27 深圳云天励飞技术有限公司 Date storage method, device, server and storage medium
CN110784498A (en) * 2018-07-31 2020-02-11 阿里巴巴集团控股有限公司 Personalized data disaster tolerance method and device
CN112532441A (en) * 2020-11-24 2021-03-19 成都西加云杉科技有限公司 Network diagnosis and repair method, device, equipment and medium
CN112559538A (en) * 2020-11-11 2021-03-26 中广核工程有限公司 Incidence relation generation method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN113535702B (en) 2024-03-26

Similar Documents

Publication Publication Date Title
CN113242236B (en) Method for constructing network entity threat map
CN112714359B (en) Video recommendation method and device, computer equipment and storage medium
CN111372209B (en) Signaling data processing method, device, equipment and medium
CN109815214B (en) Database access method, system, device and storage medium
CN107085549B (en) Method and device for generating fault information
CN103309998A (en) Message query method, message query device and terminal equipment
CN112434039A (en) Data storage method, device, storage medium and electronic device
CN115048177B (en) Dynamic configuration method for completing business scene based on custom container
CN111859127A (en) Subscription method and device of consumption data and storage medium
CN113051460A (en) Elasticsearch-based data retrieval method and system, electronic device and storage medium
CN110633318A (en) Data extraction processing method, device, equipment and storage medium
CN111641554B (en) Message processing method and device and computer readable storage medium
CN113535702B (en) Data processing method, device, equipment and storage medium
CN113783855B (en) Site evaluation method, apparatus, electronic device, storage medium, and program product
CN103577451A (en) Webpage transcoding method, webpage transcoding device and webpage transcoding system
CN113965408B (en) Method, device, medium and equipment for extracting HTTP (hyper text transport protocol) message
CN111510940B (en) Signaling analysis method and device
CN112764988A (en) Data segmentation acquisition method and device
CN104009970A (en) Network information acquisition method
CN113572854B (en) Data transmission method and system based on Kafka component
CN115757049B (en) Multi-service module log recording method, system, electronic equipment and storage medium
CN114466075B (en) Request processing method and device, electronic equipment and storage medium
CN106998359A (en) The method for network access and device of speech-recognition services based on artificial intelligence
CN108614822A (en) A kind of intelligence event storage, read method and device
CN117435556A (en) Method, device, equipment and medium for inquiring call ticket data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant