CN113569938A - Service big data acquisition method and server based on time-space domain characteristics - Google Patents

Service big data acquisition method and server based on time-space domain characteristics

Info

Publication number
CN113569938A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110833263.0A
Other languages
Chinese (zh)
Inventor
廖彩红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides a service big data acquisition method and a server based on time-space domain characteristics. A plurality of time domain node service data and a plurality of spatial domain node service data are obtained within a service data acquisition range, together with the time domain correlation parameters and time domain feature differences among the time domain node service data and the spatial domain correlation parameters and spatial domain feature differences among the spatial domain node service data. The time domain node service data are combined according to the time domain correlation parameters and time domain feature differences to obtain the time domain data topological distribution within the service data acquisition range; the spatial domain node service data are then combined according to the spatial domain correlation parameters and spatial domain feature differences to obtain the spatial domain data topological distribution within the range. Service data can thus be acquired along two data dimensions, the time domain and the spatial domain, and the correlation between the two dimensions allows the resulting service data streams and related information to reflect the data characteristics of the service data more accurately.

Description

Service big data acquisition method and server based on time-space domain characteristics
This application is a divisional application of the invention patent application No. 202110077131.X, filed on January 20, 2021 and entitled "Method and server for acquiring business big data for artificial intelligence machine learning".
Technical Field
The invention relates to the technical field of artificial intelligence and big data, in particular to a business big data acquisition method and a server for artificial intelligence machine learning.
Background
Artificial Intelligence (AI) is a discipline that simulates human consciousness and thinking. Machine learning and deep learning, the core of artificial intelligence, have made significant breakthroughs and given machines powerful cognitive and predictive capabilities. Both need a large amount of collected sample data for model training before they can be applied. Research has found, however, that in some fields, such as the internet, digital finance, and e-commerce platforms, insufficiently accurate data acquisition or errors in acquiring data features can cause model training to fail or cause the trained model to perform poorly in later use.
Disclosure of Invention
In order to solve the above problem, an embodiment of the present invention provides a method for collecting big business data for artificial intelligence machine learning, including:
respectively acquiring node data of a time domain node and a space domain node aiming at service data generated in a preset service data acquisition range for acquiring big service data to obtain time-space domain node data; the time-space domain node data comprises time domain node data and space domain node data corresponding to the service data generated in the service data acquisition range;
and performing topology fusion analysis on the time-space domain node data to obtain a plurality of service data streams in the service data acquisition range and service characteristic information corresponding to the service data streams, and obtaining a service data stream sample set according to the service data streams and the corresponding service characteristic information to be used as a service data learning sample for artificial intelligence machine learning to perform machine learning.
In the embodiment of the present invention, the performing topology fusion analysis on the time-space domain node data to obtain a plurality of service data streams in the service data acquisition range and service feature information corresponding to the service data streams includes:
respectively forming a plurality of time domain data topological distributions and a plurality of space domain data topological distributions on the time domain node data and the space domain node data in the time-space domain node data;
performing topological fusion on each time domain data topological distribution and each space domain data topological distribution generated in the service data acquisition range according to a service topological relation between the time domain data topological distribution and the space domain data topological distribution to obtain a plurality of topological distribution fusion groups; the spatial domain data topological distribution in each topological distribution fusion group respectively comprises second spatial domain node service data in the service data acquisition range;
determining each spatial domain data topological distribution that has not undergone topological fusion as a to-be-processed spatial domain data topological distribution, and acquiring first topological distribution description information of the to-be-processed spatial domain data topological distribution according to the first spatial domain node service data contained in it; the first spatial domain node service data falls within the service data acquisition range;
respectively acquiring second topological distribution description information of the spatial domain data topological distribution in each topological distribution fusion group according to second spatial domain node service data included in each topological distribution fusion group;
acquiring characteristic differences between the first topological distribution description information and second topological distribution description information corresponding to each topological distribution fusion group;
determining topological correlation parameters between the spatial domain data topological distribution in each topological distribution fusion group and the spatial domain data topological distribution to be processed respectively according to the characteristic difference corresponding to each topological distribution fusion group;
counting a target topological distribution fusion group with topological correlation parameters not less than a preset correlation parameter threshold, and determining service characteristic information contained in time domain data topological distribution in the target topological distribution fusion group as service characteristic information associated with the spatial domain data topological distribution to be processed;
performing topological fusion on the service characteristic information associated with the spatial domain data topological distribution to be processed and the spatial domain data topological distribution to be processed to obtain a characteristic topological fusion group;
and determining the service data stream in the service data acquisition range and the service characteristic information corresponding to the service data stream according to the characteristic topology fusion group and the plurality of topology distribution fusion groups.
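The last steps of the list above (selecting target fusion groups by threshold and attaching their time domain service characteristic information to a to-be-processed spatial domain distribution) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `build_feature_fusion_group`, the dictionary layout, and the use of plain strings as service characteristic information are hypothetical and do not come from the patent text.

```python
def build_feature_fusion_group(pending_name, parameters, group_features, threshold):
    """Attach features of sufficiently related fusion groups to a pending distribution.

    parameters:     {fusion group name: topological correlation parameter}
    group_features: {fusion group name: list of service characteristic labels
                     taken from that group's time domain distribution}
    (both layouts are assumptions made for this sketch)
    """
    # Target groups are those whose parameter is not less than the threshold.
    targets = [name for name, value in parameters.items() if value >= threshold]

    # Collect the characteristic information of the target groups, without duplicates.
    inherited = []
    for name in targets:
        for feature in group_features[name]:
            if feature not in inherited:
                inherited.append(feature)

    # The pending spatial distribution plus the inherited features form the
    # feature topology fusion group.
    return {"spatial": pending_name, "target_groups": targets, "features": inherited}
```

A pending distribution for which no fusion group reaches the threshold ends up with an empty `features` list; how such a case should be handled is not stated in the text above.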
In addition, the embodiment of the present invention further provides a server, which includes a processor and a machine-readable storage medium, where the machine-readable storage medium is connected to the processor, and is used to store a program, an instruction, or a code, and the processor is used to execute the program, the instruction, or the code in the machine-readable storage medium, so as to implement the above-mentioned business big data collection method for artificial intelligence machine learning.
Compared with the prior art, the service big data acquisition method and the server for artificial intelligence machine learning provided by the embodiment of the invention respectively acquire the node data of the time domain node and the space domain node aiming at the service data generated in the preset service data acquisition range for acquiring the service big data to obtain the time-space domain node data, and then perform topology fusion analysis on the time-space domain node data to obtain a plurality of service data streams in the service data acquisition range and service characteristic information corresponding to the service data streams, so as to be used as the service data learning sample for artificial intelligence machine learning to perform machine learning. Therefore, business data can be acquired through two data dimensions of time and space domains, and the business data flow and relevant information acquired through data acquisition can reflect the relevant data characteristics of the business data more accurately through the correlation of the two dimensions, so that the learning effect and the application effect of machine learning or artificial intelligence model training in the later period can be improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention, and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a schematic structural diagram of a network architecture of a big data acquisition system according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a server according to an embodiment of the present invention.
Fig. 3 is a schematic flow chart of a business big data collection method for artificial intelligence machine learning according to an embodiment of the present invention.
Fig. 4 is a flowchart illustrating the sub-steps of step S32 in fig. 3.
Detailed Description
The technical solutions in the present invention will be described clearly and completely with reference to the accompanying drawings, and it is obvious that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, are included in the scope of the present invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a network architecture of a big data acquisition system according to an embodiment of the present invention. As shown in fig. 1, the big data collecting system may include a server 100 and a service data terminal cluster, and the service data terminal cluster may include a plurality of service data terminals 200. The server 100 is in communication connection with the service data terminals 200, and is configured to collect service data generated by each service data terminal 200 from the service data terminal 200, so as to realize collection of big data. The number of the service data terminals 200 is not limited, and each service data terminal 200 may be in communication connection with the server 100 to facilitate data interaction with the server 100.
The server 100 shown in fig. 1 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services such as cloud services, cloud storage, cloud computing, cloud communication, cloud security services, and big data and artificial intelligence platforms. The service data terminal may be a smart phone, a tablet computer, a notebook computer, a personal computer, or the like, which can generate corresponding service data using the service provided by the server 100 or other third party platform.
Referring to fig. 2, fig. 2 is a schematic diagram of the server 100. In this embodiment, the server 100 is used to implement the service big data acquisition method for artificial intelligence machine learning provided by the embodiment of the present invention. In this embodiment, the server 100 may include a big data collecting device 110, a machine-readable storage medium 120, and a processor 130.
Alternatively, the machine-readable storage medium 120 and the processor 130 may be located in the server 100 and separately provided, or the machine-readable storage medium 120 and the processor 130 may be independent of the server 100. The machine-readable storage medium 120 may be accessed by the processor 130 through a bus interface. Alternatively, the machine-readable storage medium 120 may be integrated into the processor 130, e.g., may be a cache and/or general purpose registers.
The processor 130 is a control center of the server 100, connects various parts of the entire server 100 using various interfaces and lines, performs various functions of the server 100 and processes data by running or executing software programs and/or modules stored in the machine-readable storage medium 120 and calling data stored in the machine-readable storage medium 120, thereby performing overall monitoring of the server 100. Optionally, processor 130 may include one or more processing cores. For example, the processor 130 may integrate an application processor, which primarily handles operating systems, user interfaces, applications, etc., and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor.
The processor 130 may be a general-purpose Central Processing Unit (CPU), a microprocessor, an Application-Specific Integrated Circuit (ASIC), or the like.
The machine-readable storage medium 120 may be, but is not limited to, a ROM or another type of static storage device that can store static information and instructions, a RAM or another type of dynamic storage device that can store information and instructions, an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The machine-readable storage medium 120 may be self-contained and coupled to the processor 130 via a communication bus, or may be integrated with the processor 130. The machine-readable storage medium 120 stores the machine-executable instructions for carrying out the solution of the present application, and the processor 130 executes those instructions to implement the big data acquisition method provided by the present invention.
Referring to fig. 3, fig. 3 is a schematic flowchart of a business big data collection method for artificial intelligence machine learning according to an embodiment of the present invention, where the business big data collection method for artificial intelligence machine learning may be executed by the server 100. It should be understood that, in the method described in the embodiment of the present invention, the order of some steps may be interchanged according to actual needs, or some steps may be omitted or deleted, and the following describes implementation steps of the method for collecting business big data for artificial intelligence machine learning.
Step S31, respectively acquiring node data of a time domain node and a space domain node for service data generated in a preset service data acquisition range for acquiring service big data, to obtain time-space domain node data. In this embodiment, the time-space domain node data includes time domain node data and space domain node data generated in the service data acquisition range.
In this embodiment, the service data acquisition range may be a data acquisition range predetermined according to the big data acquisition task. For example, it may be a geographical range defined over the service data terminals, in which case the service data generated by every service data terminal located within the preset geographical range belongs to the data acquisition range. Alternatively, it may be a preset range of specific service types, so that service data of those types belongs to the range. No specific limitation is imposed here.
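As a minimal sketch of how such an acquisition range might be applied, the following filter keeps only records that fall inside a geographical range and/or belong to an allowed set of service types. The rectangular latitude/longitude range and the names `Record`, `GeoRange`, and `select_in_range` are assumptions made for illustration; the text does not prescribe any particular representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One item of service data reported by a terminal (hypothetical layout)."""
    terminal_id: str
    lat: float
    lon: float
    service_type: str


@dataclass(frozen=True)
class GeoRange:
    """Rectangular geographical range (an assumed shape for the sketch)."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, rec: Record) -> bool:
        return (self.min_lat <= rec.lat <= self.max_lat
                and self.min_lon <= rec.lon <= self.max_lon)


def select_in_range(records, geo_range=None, service_types=None):
    """Keep records inside the geographical range and/or the allowed service types.

    A criterion set to None is not applied.
    """
    selected = []
    for rec in records:
        if geo_range is not None and not geo_range.contains(rec):
            continue
        if service_types is not None and rec.service_type not in service_types:
            continue
        selected.append(rec)
    return selected
```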
Each item of service data that carries time domain information can be designated as time domain node data. The time domain information may be, but is not limited to, the time node at which the service data was generated, the timing of the service execution flow, the order among service flows, and the temporal topological relationship among service flows. After the time domain node data are topologically associated through a corresponding time domain relationship association rule, a time domain topology network among the service data is formed, and each node in that network can be defined as a time domain node. For time-sensitive service data, the service data acquired through the time domain nodes carries the corresponding time domain information, so that this information can be taken into account during later machine learning and model training and the results can be better applied. For example, for service data used to analyze user behavior in fields such as internet finance and digital networks, user profiling must account for the fact that the behavioral interest features in user behavior data decay over time, so each item of data needs to be acquired in the time domain dimension.
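To illustrate why the timestamp matters for the user-profiling example, the sketch below weights each behavior event by its age so that older interest signals count less. Exponential decay with a half-life is an assumption chosen for the illustration, as are the names `decayed_weight` and `interest_scores`; the text only says that interest features attenuate over time and does not specify a decay law.

```python
def decayed_weight(event_time: float, now: float, half_life: float) -> float:
    """Weight of an event that halves every `half_life` time units of age."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    age = max(0.0, now - event_time)  # events in the future are treated as age 0
    return 0.5 ** (age / half_life)


def interest_scores(events, now, half_life):
    """Sum decayed weights per interest tag.

    events: iterable of (timestamp, tag) pairs, i.e. service data that carries
    time domain information (a hypothetical layout).
    """
    scores = {}
    for timestamp, tag in events:
        scores[tag] = scores.get(tag, 0.0) + decayed_weight(timestamp, now, half_life)
    return scores
```

Without the time domain information carried by each item of service data, a score of this kind could not be computed, which is the point the paragraph above makes.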
Correspondingly, each item of service data that carries spatial domain information can be designated as spatial domain node data. The spatial domain information may be, but is not limited to, the spatial node at which the service data was generated (which may correspond to location information of the data acquisition terminal, such as an IP address or a device identifier), the geographical range in which the service data was generated, the service range to which the data service belongs, and the like. After the spatial domain node data are topologically associated through a corresponding spatial domain relationship association rule, a spatial domain topology network among the service data is formed, and each node in that network can be defined as a spatial domain node. For space-sensitive service data, the service data acquired through the spatial domain nodes carries the corresponding spatial domain information, so that this information can be taken into account during later machine learning and model training and the results can be better applied. For example, for service data used to analyze user behavior, factors such as the user's usual range of activity, the typical application scenario of each service type, and the user terminal corresponding to each service type need to be considered; user profiling must account for the correlation between the behavioral interest features in user behavior data and space, so each item of data needs to be acquired in the spatial domain dimension.
Therefore, business data can be acquired through two data dimensions of time and space domains, and the business data flow and relevant information acquired through data acquisition can reflect the relevant data characteristics of the business data more accurately through the correlation of the two dimensions, so that the learning effect and the application effect of machine learning or artificial intelligence model training in the later period can be improved.
Step S32, performing topology fusion analysis on the time-space domain node data to obtain a plurality of service data streams in the service data acquisition range and service characteristic information corresponding to the service data streams, and obtaining a service data stream sample set according to the service data streams and the corresponding service characteristic information, the sample set serving as service data learning samples for machine learning.
Please refer to fig. 4, which is a flowchart illustrating the sub-steps of step S32. In the step S32, the topology fusion analysis is performed on the time-space domain node data to obtain a plurality of service data flows in the service data acquisition range and service characteristic information corresponding to the service data flows, and a specific implementation scheme is described with reference to fig. 4.
Substep S321, forming a plurality of time domain data topological distributions and a plurality of spatial domain data topological distributions from the time domain node data and the spatial domain node data in the time-space domain node data, respectively.
In this embodiment, the time domain node data included in the time-space domain node data may be topologically associated according to a preset time domain information association rule, so as to form a plurality of time domain data topological distributions. For example, the time domain information association rule may be set in advance according to the execution flow, generation timing, and the like of each service data. In a large amount of time domain node data carrying time domain information, according to a corresponding time domain information association rule, the time domain node data carrying different time domain information can be topologically associated to different distribution groups, and the different distribution groups have different topological distribution nodes to form a plurality of different time domain data topological distributions.
Correspondingly, the spatial domain node data included in the time-space domain node data can be topologically associated according to a preset spatial domain information association rule to form a plurality of spatial domain data topological distributions. For example, the spatial domain information association rule may be set in advance according to the spatial node at which each service data was generated, the service space range to which it belongs, the positional relationship between ranges, and the like. According to this rule, spatial domain node data carrying different spatial domain information are topologically associated into different distribution groups, and the different distribution groups have different topological distribution nodes, forming a plurality of different spatial domain data topological distributions. The topological distributions may be expressed as topological graphs, without specific limitation.
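One way to picture substep S321 is as a grouping step followed by linking the members of each group into a small graph. The sketch below assumes that an association rule can be reduced to two functions, one that assigns each item to a distribution group and one that orders the items inside a group; the name `build_distributions`, the dictionary-based node data, and the simple chain of edges are all hypothetical simplifications.

```python
from collections import defaultdict


def build_distributions(node_data, group_key, order_key):
    """Group node data by `group_key`, then chain each group's nodes in `order_key` order.

    Returns {group: {"nodes": [ids...], "edges": [(id_a, id_b), ...]}},
    a minimal stand-in for a topological graph per distribution.
    """
    groups = defaultdict(list)
    for item in node_data:
        groups[group_key(item)].append(item)

    distributions = {}
    for name, items in groups.items():
        ordered = sorted(items, key=order_key)
        ids = [item["id"] for item in ordered]
        # Consecutive nodes are linked; a group with one node has no edges.
        distributions[name] = {"nodes": ids, "edges": list(zip(ids, ids[1:]))}
    return distributions
```

Calling the function once with a time-based rule (for example, group by service flow and order by timestamp) and once with a space-based rule (for example, group by region) yields the two families of distributions that the later substeps fuse.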
Substep S322, performing topology fusion on each time domain data topological distribution and each spatial domain data topological distribution generated in the service data acquisition range, based on the service topological relationship between them, to obtain a plurality of topological distribution fusion groups.
In this embodiment, the spatial domain data topological distribution in each topological distribution fusion group includes second spatial domain node service data within the service data acquisition range. The service topological relationship may be a service association between the service data corresponding to the nodes of a time domain data topological distribution and those of a spatial domain data topological distribution; for example, it may be derived from the service type, user information, and user identity information corresponding to the service data of each node in the distributions. Time domain and spatial domain data topological distributions that share a service topological relationship can therefore be fused to obtain a corresponding topological distribution fusion group. Each topological distribution fusion group includes at least one time domain data topological distribution and at least one spatial domain data topological distribution.
In an alternative manner, the substep S322 may be implemented by:
firstly, determining each spatial domain data topological distribution generated in the service data acquisition range as local spatial domain topological distribution, and determining each time domain data topological distribution generated in the service data acquisition range as local time domain topological distribution; the airspace node service data in the local airspace topological distribution are obtained by carrying out data acquisition on a target service node in the service data acquisition range;
then, acquiring time domain node service data in the target service node; calculating a service data association parameter between time domain node service data in the target service node and each time domain node service data in the local time domain topological distribution, and determining a service topological relation between the local space domain topological distribution and the local time domain topological distribution according to the calculated service data association parameter;
and finally, when the service data association parameter is not less than a preset association parameter threshold, performing topological fusion on the local spatial domain topological distribution and the local time domain topological distribution to obtain a plurality of topological distribution fusion groups. In this way, each local time domain topological distribution is matched against the time domain data generated by each service node (the spatial domain feature), and topology fusion is performed when the two match, generating a topological distribution fusion group.
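The three steps above can be sketched as a pairwise matching loop. The text does not say how the service data association parameter is calculated, so the sketch substitutes Jaccard overlap between sets of keys (for instance user identifiers) purely as an assumption; `jaccard`, `fuse`, and the input layout are hypothetical names introduced for the illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap ratio of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fuse(spatial_dists, temporal_dists, threshold):
    """Pair spatial and time domain distributions whose association meets the threshold.

    spatial_dists:  {name: set of keys found in the time domain data of the
                     target service node behind that spatial distribution}
    temporal_dists: {name: set of keys in that time domain distribution}
    Returns (fusion_groups, pending_spatial_names).
    """
    fusion_groups = []
    pending = []
    for s_name, s_keys in spatial_dists.items():
        # "Not less than the threshold" is implemented as >=.
        matches = [t_name for t_name, t_keys in temporal_dists.items()
                   if jaccard(s_keys, t_keys) >= threshold]
        if matches:
            fusion_groups.append({"spatial": s_name, "temporal": matches})
        else:
            pending.append(s_name)
    return fusion_groups, pending
```

The spatial distributions returned in the second list are the ones that found no time domain partner; they correspond to the distributions handled separately in the following substep.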
Substep S323, determining each spatial domain data topological distribution that has not undergone topology fusion as a to-be-processed spatial domain data topological distribution, and acquiring first topological distribution description information of the to-be-processed spatial domain data topological distribution according to the first spatial domain node service data it contains.
In this embodiment, the spatial domain node data generated by some abnormal data nodes may have lost their time domain information or may not carry any, so the spatial domain data topological distributions formed from such data cannot be matched with a corresponding time domain data topological distribution for topology fusion. These spatial domain data topological distributions are therefore listed as to-be-processed spatial domain data topological distributions for subsequent processing. The spatial domain node service data included in a to-be-processed spatial domain data topological distribution is referred to as first spatial domain node service data, and one such distribution may include a plurality of first spatial domain node service data. A service data recognition model obtained through pre-training can then be used to extract the service data feature of each first spatial domain node service data; this feature may be service data description information. Next, the service data description information of all the first spatial domain node service data is combined to obtain the corresponding global service characteristic information, referred to as first global service characteristic information. Because the first global service characteristic information characterizes the to-be-processed spatial domain data topological distribution, it serves as the first topological distribution description information of that distribution.
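A rough sketch of this extract-then-combine step follows. The pre-trained service data recognition model is not described in the text, so `describe` below is only a hand-written stand-in that returns a fixed-length numeric vector, and averaging the vectors is an assumed way of combining them into global service characteristic information; the field names `amount`, `duration`, and `service_type` are hypothetical.

```python
def describe(item: dict) -> list[float]:
    """Stand-in for the pre-trained recognition model: one feature vector per item."""
    return [
        float(item["amount"]),
        float(item["duration"]),
        float(len(item["service_type"])),  # toy feature, for illustration only
    ]


def global_description(items: list[dict]) -> list[float]:
    """Average the per-item feature vectors into one description vector."""
    if not items:
        raise ValueError("a distribution needs at least one data item")
    vectors = [describe(item) for item in items]
    # zip(*vectors) iterates over the vectors column by column.
    return [sum(column) / len(vectors) for column in zip(*vectors)]
```

The same routine could be applied to the data of a fusion group, which matches the statement that the second description information is acquired in the same manner as the first.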
And a substep S324, obtaining second topology distribution description information of the spatial domain data topology distribution in each topology distribution fusion group respectively according to the second spatial domain node service data included in each topology distribution fusion group.
In this embodiment, the manner of acquiring the second topology distribution description information may refer to the manner of acquiring the first topology distribution description information. For example, the plurality of topology distribution fusion groups may include a topology distribution fusion group Ri, where i is a positive integer not greater than the total number of the plurality of topology distribution fusion groups, and the topology distribution fusion group Ri comprises a plurality of pieces of second spatial domain node service data. Based on this, first, the service data description information corresponding to each of the plurality of second spatial domain node service data included in the topology distribution fusion group Ri can be obtained; then, second global service characteristic information corresponding to the plurality of second spatial domain node service data is acquired according to the service data description information corresponding to each second spatial domain node service data; and finally, the second global service characteristic information is determined as the second topological distribution description information of the spatial domain data topological distribution in the topology distribution fusion group Ri.
And a substep S325, obtaining a feature difference between the first topology distribution description information and the second topology distribution description information corresponding to each topology distribution fusion group.
Substep S326, determining topological association parameters between the spatial domain data topological distribution in each topological distribution fusion group and the spatial domain data topological distribution to be processed respectively according to the characteristic difference corresponding to each topological distribution fusion group.
For example, in this embodiment, after obtaining the first topology distribution description information of the spatial domain data topology distribution to be processed and the second topology distribution description information of the spatial domain data topology distribution in each topology distribution fusion group, the feature difference between the first topology distribution description information and each second topology distribution description information may be obtained, and the topology association parameters between the spatial domain data topology distribution to be processed and the spatial domain data topology distribution in each topology distribution fusion group may be obtained from the feature difference corresponding to each topology distribution fusion group. For example, the larger the feature difference, the smaller the topology association parameter, and the smaller the feature difference, the larger the topology association parameter. Therefore, the reciprocal of the feature difference corresponding to each topology distribution fusion group can be used as the topology association parameter between the spatial domain data topology distribution to be processed and the spatial domain data topology distribution in that topology distribution fusion group. The topology association parameter can also be obtained from the feature difference by methods other than taking the reciprocal, which is not limited herein.
And a substep S327, counting a target topological distribution fusion group with topological correlation parameters not less than a preset correlation parameter threshold, and determining service characteristic information contained in time domain data topological distribution in the target topological distribution fusion group as service characteristic information associated with the spatial domain data topological distribution to be processed.
And a substep S328, performing topological fusion on the service characteristic information associated with the spatial domain data topological distribution to be processed and the spatial domain data topological distribution to be processed to obtain a characteristic topological fusion group corresponding to the spatial domain data topological distribution to be processed.
And a substep S329, determining the service data stream in the service data acquisition range and the service feature information corresponding to the service data stream according to the feature topology fusion group and the plurality of topology distribution fusion groups. In this embodiment, the service data included in one feature topology fusion group or one topology distribution fusion group may be used as one corresponding service data stream, and the feature information related to the time-space domain included in each service data in the service data stream may be extracted as the corresponding service feature information. In this way, the related information of the service data in the to-be-processed spatial domain data topological distribution that has not undergone topology fusion can also be extracted, so that the data acquisition is more comprehensive and accurate.
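Substeps S323 to S327 can be sketched in a few lines. The following is only an illustrative sketch under stated assumptions: the description information is assumed to be a numeric vector, the combination into global information is assumed to be an element-wise mean, the feature difference is assumed to be a Euclidean distance, and all function names and the `eps` guard are hypothetical. Only the use of the reciprocal and the threshold comparison come from the text above.

```python
import math

def global_description(node_features):
    """Combine per-node description vectors into one global vector
    (element-wise mean; the combination rule is an assumption)."""
    n = len(node_features)
    return [sum(col) / n for col in zip(*node_features)]

def feature_difference(a, b):
    """Euclidean distance between two description vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def association_parameter(diff, eps=1e-9):
    """Reciprocal of the feature difference; eps (hypothetical) avoids division by zero."""
    return 1.0 / (diff + eps)

def select_target_groups(pending_nodes, fusion_groups, threshold):
    """Return ids of fusion groups whose association parameter is not less than threshold."""
    first = global_description(pending_nodes)
    selected = []
    for group_id, nodes in fusion_groups.items():
        second = global_description(nodes)
        param = association_parameter(feature_difference(first, second))
        if param >= threshold:
            selected.append(group_id)
    return selected

# Toy data: two pending nodes and two fusion groups R1 and R2.
pending = [[1.0, 2.0], [3.0, 4.0]]
groups = {"R1": [[2.0, 3.5]], "R2": [[10.0, 10.0]]}
targets = select_target_groups(pending, groups, threshold=1.0)
```

In this toy input, R1 lies close to the pending distribution and is selected, while R2 is far away and is not.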
Further, in this embodiment, in step S31, node data acquisition for time domain nodes and spatial domain nodes is performed respectively on the service data generated in the preset service data acquisition range for service big data acquisition, so as to obtain the time-space domain node data; one specific implementation manner is described as follows:
firstly, acquiring a plurality of time domain node service data and a plurality of space domain node service data in the service data acquisition range;
then, acquiring time domain correlation parameters and time domain characteristic differences among the plurality of time domain node service data, and acquiring space domain correlation parameters and space domain characteristic differences among the plurality of space domain node service data;
then, combining the plurality of time domain node service data according to the time domain correlation parameters and the time domain feature difference to obtain time domain data topological distribution in the service data acquisition range; a time domain data topology distribution comprising at least one time domain node traffic data;
finally, combining the plurality of spatial domain node service data according to the spatial domain correlation parameters and the spatial domain feature difference to obtain the spatial domain data topological distribution in the service data acquisition range; a spatial domain data topological distribution includes at least one spatial domain node service data.
Based on the above, to obtain the time domain feature difference and the spatial domain feature difference, corresponding time domain feature vectors and spatial domain feature vectors may first be calculated according to the service feature information of each service data; the time domain feature difference is then represented by the vector distance between the time domain feature vectors, and the spatial domain feature difference is represented by the vector distance between the spatial domain feature vectors.
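The combination of node service data into topological distributions can be illustrated as follows. This is a minimal sketch, not the method of the embodiment: it uses only the feature difference (Euclidean distance) and ignores the correlation parameters, and the single-linkage grouping rule, the function names and the `max_difference` threshold are all assumptions introduced for illustration.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def group_nodes(vectors, max_difference):
    """Single-linkage grouping: nodes whose feature difference is at most
    max_difference end up in the same group (assumed combination rule)."""
    parent = list(range(len(vectors)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if distance(vectors[i], vectors[j]) <= max_difference:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Three time domain feature vectors; the first two are close to each other.
time_vectors = [[0.0, 0.0], [0.5, 0.0], [5.0, 5.0]]
time_groups = group_nodes(time_vectors, max_difference=1.0)
```

Each resulting group contains at least one node, which matches the statement that a topological distribution includes at least one node service data. The same function could be applied to spatial domain feature vectors.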
Next, in step S323, when the number of the spatial domain data topology distributions to be processed is multiple, the method of the present invention may further perform the following steps.
(a) And when the number of the target topological distribution fusion groups is not more than the first preset value, respectively determining the topological distribution fusion group where the airspace data topological distribution with the largest topological correlation parameter between the target topological distribution fusion groups and each airspace data topological distribution to be processed is located as a candidate topological fusion group corresponding to each airspace data topological distribution to be processed.
(b) And respectively determining the service characteristic information contained in the time domain data topological distribution in the candidate topological fusion group corresponding to each spatial domain data topological distribution to be processed as the candidate service characteristic information corresponding to each spatial domain data topological distribution to be processed.
(c) Determining a plurality of data characteristic descriptions corresponding to preset target service characteristic information according to the candidate service characteristic information corresponding to each to-be-processed airspace data topological distribution; and acquiring a first statistical result of the service characteristic information contained in the time domain data topological distribution of the plurality of data characteristic descriptions in the plurality of topological distribution fusion groups.
(d) And determining a first target data feature description of each to-be-processed airspace data topological distribution aiming at the preset target service feature information according to the first statistical result.
(e) Determining the preset target service characteristic information respectively having the first target data characteristic description corresponding to each to-be-processed spatial domain data topological distribution as service characteristic information associated with each to-be-processed spatial domain data topological distribution; a second statistical result of the plurality of data characteristic descriptions in the service characteristic information associated with each to-be-processed spatial domain data topological distribution is equal to the first statistical result.
(f) When the number of the target topology distribution fusion groups is larger than the second preset value, counting the number of a plurality of data characteristic descriptions of preset target service characteristic information in service characteristic information contained in time domain node service data of the target topology distribution fusion groups; the data feature descriptions are determined according to service feature information contained in the time domain data topological distribution in the target topological distribution fusion group.
(g) And determining a second target data feature description of the airspace data topological distribution to be processed aiming at the preset target service feature information from the plurality of data feature descriptions according to the topological correlation parameters between the airspace data topological distribution to be processed and the target topological distribution fusion group and the number.
(h) And determining the preset target service characteristic information with the second target data characteristic description as service characteristic information associated with the spatial domain data topological distribution to be processed.
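The selection of the second target data feature description from the counts and the topological association parameters can be pictured as a weighted vote. The sketch below is one possible reading only: the scoring rule (sum of association parameters per description), the function name and the example labels `"peak"` and `"off-peak"` are hypothetical and do not appear in the embodiment.

```python
from collections import defaultdict

def pick_description(target_groups):
    """target_groups: list of (association_parameter, description) pairs,
    one per target topology distribution fusion group.
    Each description is scored by the sum of the association parameters of the
    groups that carry it, i.e. a count weighted by association (assumed rule)."""
    scores = defaultdict(float)
    for param, description in target_groups:
        scores[description] += param
    return max(scores, key=scores.get)

# Two weakly associated groups say "off-peak", one strongly associated group says "peak".
votes = [(2.0, "peak"), (0.5, "off-peak"), (0.5, "off-peak")]
chosen = pick_description(votes)
```

Here the description carried by the strongly associated group wins even though it occurs less often, which shows why the count and the association parameter are considered together.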
On the basis of the above contents, in order to facilitate the implementation of reading and writing of the acquired data information, the embodiments of the present invention may further include the following contents.
(1) Determining service characteristic information contained in time domain data topological distribution in the topological distribution fusion groups as service characteristic information contained in the topological distribution fusion groups, and determining each of the topological distribution fusion groups and the characteristic topological fusion group as a target topological fusion group in the service data acquisition range; and determining the service characteristic information contained in the target topology fusion group as target service characteristic information.
(2) And adding the same sequence number to the target service characteristic information and the spatial domain data topological distribution in the target topological fusion group.
(3) And writing the target service characteristic information with the sequence number into a first data area, a second data area and a third data area respectively. The data reading speed of the first data area is greater than the data reading speed of the second data area, and the data reading speed of the second data area is greater than that of the third data area. The information writing amount of the first data area for the target service characteristic information is smaller than that of the second data area, and the information writing amount of the second data area for the target service characteristic information is smaller than that of the third data area. In this embodiment, the first data area may be, for example, an in-memory queue data structure such as a FIFO queue. The second data area and the third data area may be preset databases, such as a Redis database and a MySQL database, respectively.
Based on the above, in the above sub-step S329, the service data stream in the service data acquisition range and the service feature information corresponding to the service data stream are determined according to the feature topology fusion group and the plurality of topology distribution fusion groups, and an achievable manner is as follows:
firstly, determining the service data flow in the service data acquisition range according to the spatial domain data topological distribution in the target topological fusion group;
then, according to the sequence number corresponding to the spatial domain data topological distribution in the target topological fusion group, the target service characteristic information with the sequence number is acquired from the first data area, the second data area or the third data area, and the acquired target service characteristic information is determined as the service characteristic information corresponding to the service data stream. For example, a first information reading request for acquiring the target service feature information in the first data area may be generated according to the sequence number corresponding to the spatial domain data topological distribution in the target topological fusion group. And when the target service characteristic information is not read from the first data area according to the first information reading request, generating a second information reading request for reading the target service characteristic information in the second data area according to the first information reading request. When the target service characteristic information is not acquired from the second data area according to the second information reading request, generating a third information reading request for acquiring the target service characteristic information from the third data area according to the second information reading request, and reading the target service characteristic information from the third data area according to the third information reading request.
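The cascading read across the three data areas can be sketched as follows. This is a simplified illustration in which plain dictionaries stand in for the queue and the two databases; the class name, the `fastest_area` parameter and the `reads` log are hypothetical, and the areas are indexed 0 to 2 for the first to third data area.

```python
class TieredStore:
    """Three data areas ordered from fastest to slowest (dict stand-ins)."""

    def __init__(self):
        self.areas = [{}, {}, {}]
        self.reads = []  # index of the area that served each successful lookup

    def write(self, seq, info, fastest_area=2):
        # Store the entry in area `fastest_area` and in every slower area,
        # so that slower areas hold at least as many entries as faster ones.
        for idx in range(fastest_area, 3):
            self.areas[idx][seq] = info

    def read(self, seq):
        # Try the first, second and third data areas in turn; a miss in one
        # area leads to a read request against the next one.
        for idx, area in enumerate(self.areas):
            if seq in area:
                self.reads.append(idx)
                return area[seq]
        return None

store = TieredStore()
store.write(1, "feature-info-1", fastest_area=0)  # present in all three areas
store.write(2, "feature-info-2", fastest_area=2)  # present only in the third area
hot = store.read(1)
cold = store.read(2)
```

The first lookup is served by the first data area, while the second misses twice and is served by the third data area.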
In this embodiment, the second data area may be used as an intermediate transition data area, and the various stored service data may respectively set different aging periods according to different service characteristic information of the various service data to perform corresponding data management. For example, the target service characteristic information written in the second data area may include first service characteristic information and second service characteristic information. Then, a first aging period may be set for the first service characteristic information, and a second aging period may be set for the second service characteristic information; the first aging period is different from the second aging period; the first service characteristic information may be deleted from the second data area at the first time when a first aging period expires, and the second service characteristic information may be deleted from the second data area at the second time when a second aging period expires.
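The per-entry aging period can be illustrated with a small in-memory stand-in for the second data area. This is a sketch under assumptions: the class and method names are hypothetical, time is passed in explicitly as `now` to keep the example deterministic, and expired entries are deleted lazily on access rather than at the exact expiry time.

```python
class AgingArea:
    """Second data area in which each entry has its own aging period (seconds)."""

    def __init__(self):
        self._items = {}  # key -> (value, expiry time)

    def put(self, key, value, aging_period, now):
        self._items[key] = (value, now + aging_period)

    def get(self, key, now):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            # The aging period has expired: delete the entry from this area.
            del self._items[key]
            return None
        return value

area = AgingArea()
area.put("first", "first service feature info", aging_period=10, now=0)
area.put("second", "second service feature info", aging_period=60, now=0)
first_at_30 = area.get("first", now=30)
second_at_30 = area.get("second", now=30)
```

If Redis is used as the second data area, its per-key expiry (for example the `EXPIRE` command) could play a similar role.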
Based on the above, the target service feature information may be acquired from the third data area according to the third information reading request as follows. If, after the first service feature information has been deleted from the second data area, the number of second information reading requests received for the first service feature information is greater than a preset request number, a target information reading request is determined from the plurality of second information reading requests; a third information reading request is then generated based on the target information reading request, the first service feature information is acquired from the third data area according to the third information reading request, and the first service feature information is added to the second data area again. In addition, the information reading requests other than the target information reading request among the plurality of second information reading requests are determined as candidate information reading requests, and the first service feature information is acquired from the second data area for each candidate information reading request.
On the basis of the above content, the embodiment of the present invention may further perform machine learning on a machine learning network according to the result of acquiring the service big data, and implement real-time feature detection on the service data stream acquired by the service acquisition terminal through the learned network; the corresponding manner is exemplarily described as follows.
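The handling of many concurrent requests for an aged-out entry can be sketched as below. It is an illustrative simplification: the first request in the batch is assumed to be the target information reading request, dictionaries stand in for the two data areas, and the behaviour when the request count does not exceed the preset number (each request falls through to the third data area on its own) is an assumption, since the text does not specify it.

```python
def serve_requests(key, requests, second_area, third_area, preset_count):
    """Serve pending read requests for `key`; returns (results, third_area_reads)."""
    results, third_reads = {}, 0
    if key not in second_area and len(requests) > preset_count:
        # One target request reads the third area and re-adds the entry
        # to the second area; the choice of requests[0] is an assumption.
        target, candidates = requests[0], requests[1:]
        value = third_area[key]
        third_reads += 1
        second_area[key] = value
        results[target] = value
        # Candidate requests are then served from the second area.
        for req in candidates:
            results[req] = second_area[key]
    else:
        for req in requests:
            if key in second_area:
                results[req] = second_area[key]
            else:
                results[req] = third_area[key]
                third_reads += 1
    return results, third_reads

second = {}
third = {"f1": "first service feature info"}
results, reads = serve_requests("f1", ["r1", "r2", "r3"], second, third, preset_count=2)
```

With three requests and a preset request number of two, only one read reaches the third data area, which is the point of selecting a single target request.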
Step S10, obtaining a plurality of service data stream samples obtained by performing service big data acquisition in advance for the data acquisition range, and obtaining a service data stream sample set.
And step S20, performing machine learning on a preset anti-interference feature detection network according to the service data flow sample, and performing feature detection on the service data flow acquired by each service acquisition terminal through the learned anti-interference feature detection network to obtain service feature information of the corresponding service data flow.
Further, the step S20 may include sub-steps S201-S205, which are described in detail below.
And a substep S201, inputting the service data stream sample set into a preset target service characteristic network for machine learning, and obtaining a learned target service characteristic network.
In this embodiment, the preset target service feature network may be a reduced-size neural network (VGG), and the learning manner of the neural network is not described herein.
And a substep S202, performing target service characteristic detection on the service data flow sample set through the learned target service characteristic network to obtain an initial target service characteristic set of the service data flow sample set.
In this embodiment, substep S202, in which target service feature detection is performed on the service data stream sample set through the learned target service feature network to obtain an initial target service feature set, may be implemented in the following manner.
(1) And aiming at each sample service data flow in the service data flow sample set, acquiring the time-space domain topological distribution of each data segment of the sample service data flow and the time-space domain characteristics of each data segment.
In this embodiment, the time-space domain topology distribution of the data segments may refer to the foregoing corresponding description for step S10, and is not described herein again. The time-space domain features of the data segment may include time-domain features and space-domain features, and the corresponding definitions of the time-domain features and space-domain features may also refer to the above description for step S10.
(2) When an interference data block is determined in the sample service data stream according to the time-space domain topological distribution of the data segments, the feature difference between the time-space domain features of each data segment corresponding to the non-interference data block of the sample service data stream and the time-space domain features of each data segment corresponding to the interference data block of the sample service data stream is determined according to the time-space domain features of the data segments corresponding to the interference data block and their target feature detection weights, and the time-space domain features of the data segments corresponding to the non-interference data block that match the time-space domain features of the data segments corresponding to the interference data block are divided into the interference data block. In this embodiment, when the current non-interference data block of the sample service data stream corresponds to the time-space domain features of a plurality of data segments, the feature difference between the time-space domain features of those data segments is determined according to the time-space domain features of the data segments corresponding to the interference data block and their target feature detection weights, and the time-space domain features of the data segments corresponding to the current non-interference data block are fused according to that feature difference.
And then, configuring a feature identifier for the data fragment fusion feature obtained by the feature fusion according to the time-space domain feature of the data fragment corresponding to the interference data block of the sample service data stream and the target feature detection weight thereof, and dividing the data fragment fusion feature into the interference data block according to the feature identifier.
In this embodiment, the interference data blocks and the non-interference data blocks may include irregular data blocks and/or regular data blocks. The target feature detection weight characterizes the target feature detection degree of the time-space domain features of a data segment: the higher the target feature detection weight, the greater the target feature detection degree of the time-space domain features of the data segment and the greater the degree of distinction of the information it contains. The feature identifier may be used to characterize a block adjustment priority of the data segment fusion feature, and dividing the data segment fusion feature into the interference data blocks according to the feature identifier may consist of dividing part of the data segment fusion features into the interference data blocks in descending order of the block adjustment priority indicated by their feature identifiers. The feature difference may be represented by a vector distance between feature vectors (e.g., cosine distance or Euclidean distance).
In some possible embodiments, the determining, according to the time-space domain characteristics of the data segment corresponding to the interference data block of the sample service data stream and the target characteristic detection weight thereof, the characteristic difference between the time-space domain characteristics of each data segment corresponding to the non-interference data block of the sample service data stream and the time-space domain characteristics of each data segment corresponding to the interference data block of the sample service data stream, and dividing the time-space domain characteristics of the data segment corresponding to the non-interference data block of the sample service data stream, which are matched with the time-space domain characteristics of the data segment corresponding to the interference data block, into the interference data blocks may be implemented by:
firstly, calculating the correlation parameters between the time-space domain characteristics of each data segment corresponding to the non-interference data block of the sample service data stream and the characteristic vectors of the time-space domain characteristics of each data segment corresponding to the interference data block of the sample service data stream;
then, respectively judging whether each correlation parameter reaches a first set parameter threshold value, and dividing the time-space domain characteristics of the data segment corresponding to the non-interference data block of which the correlation parameter reaches the first set parameter threshold value into the interference data block; and the characteristic vector of the time-space domain characteristic of the data segment is a matching result of the time-space domain characteristic and the characteristic identifier of the data segment counted according to the time-space domain characteristic of the data segment corresponding to the interference data block of the sample service data stream and the target characteristic detection weight thereof.
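The two steps above amount to a threshold test on a correlation parameter. The following sketch assumes that the correlation parameter is the cosine similarity between feature vectors and leaves out the target feature detection weight and the feature identifier; the function names and the threshold value are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def reassign(non_interference, interference, threshold):
    """Move each non-interference feature whose best similarity to any
    interference feature reaches the threshold into the interference block."""
    kept, moved = [], []
    for feat in non_interference:
        best = max((cosine_similarity(feat, ref) for ref in interference), default=0.0)
        (moved if best >= threshold else kept).append(feat)
    return kept, interference + moved

interference_block = [[1.0, 0.0]]
non_interference_block = [[2.0, 0.0], [0.0, 1.0]]
kept, new_interference = reassign(non_interference_block, interference_block, threshold=0.9)
```

In this toy input, the feature `[2.0, 0.0]` points in the same direction as the interference feature and is moved, while `[0.0, 1.0]` stays in the non-interference block.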
In some possible embodiments, the feature difference between the time-space domain features of the data segments corresponding to the current non-interference data block of the sample service data stream is determined according to the time-space domain features of the data segments corresponding to the interference data block of the sample service data stream and the target feature detection weight thereof, and the time-space domain features of the data segments corresponding to the current non-interference data block are feature fused according to the feature difference between the time-space domain features of the data segments, and the specific implementation manner is as follows:
firstly, calculating the correlation parameters among the characteristic vectors of the time-space domain characteristics of each data segment corresponding to the current non-interference data block of the sample service data stream;
then, aiming at the time-space domain characteristics of a data segment corresponding to the current non-interference data block of the sample service data stream, performing characteristic fusion on the time-space domain characteristics of the data segment and the time-space domain characteristics of all the data segments of which the associated parameters with the characteristic vectors reach a second set parameter threshold value to obtain a data segment fusion characteristic sequence.
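The fusion step can be sketched in the same style. This is only an illustration: cosine similarity is assumed as the correlation parameter, the fusion operation is assumed to be an element-wise mean, and the function names and the threshold are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def fuse_features(features, threshold):
    """For each feature, average it with every other feature whose similarity
    to it reaches the threshold; returns the sequence of fused features."""
    fused = []
    for i, feat in enumerate(features):
        partners = [feat] + [
            other for j, other in enumerate(features)
            if j != i and cosine_similarity(feat, other) >= threshold
        ]
        fused.append([sum(col) / len(partners) for col in zip(*partners)])
    return fused

segment_features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
fusion_sequence = fuse_features(segment_features, threshold=0.9)
```

The first two features are fused with each other, and the third has no partner above the threshold and is kept unchanged.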
(3) And determining sample business data stream segments based on the time-space domain characteristics of the target data segments in the interference data blocks corresponding to the sample business data streams, and integrating the determined sample business data stream segments to obtain an initial target business characteristic set. In this embodiment, the sample service data stream segment may be a sample service data stream segment corresponding to interference data.
Therefore, based on the contents described in (1) to (3), the time-space domain characteristics of the data segments in the interference data block and the non-interference data block can be re-divided, so that both the interference data block and the non-interference data block are taken into account, and the accuracy of analyzing the service characteristics of service data streams acquired at a later stage can be improved.
And a substep S203, inputting the initial target service characteristic set into a preset first anti-interference characteristic detection network for machine learning, so as to obtain a first target anti-interference characteristic detection network.
In this embodiment, the first anti-interference feature detection network may be understood as a network having a large parameter quantity (a large network). Further, inputting the initial target service feature set into a preset first anti-interference feature detection network for machine learning to obtain a first target anti-interference feature detection network may be implemented as follows:
Machine iteration learning is performed on the preset first anti-interference feature detection network by using the initial target service feature set, and when the target feature detection result obtained by performing target feature detection on test service data with the first anti-interference feature detection network obtained by the Nth learning reaches a set condition, the first anti-interference feature detection network obtained by the Nth learning is determined as the first target anti-interference feature detection network. In this embodiment, the set condition may be preset according to actual requirements; for example, it may be a value of 90% to 99%, preferably 95%, which is not limited herein.
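The stopping rule for the iterative learning can be sketched independently of any particular network. In the sketch below, `train_step` and `evaluate` are hypothetical callables standing in for one learning round and for the detection test on the test service data, and the toy functions only simulate a score that improves each round.

```python
def train_until(train_step, evaluate, target=0.95, max_rounds=100):
    """Run learning rounds until evaluate() reaches `target`.
    Returns (number of rounds used, last score)."""
    score = 0.0
    for n in range(1, max_rounds + 1):
        train_step()
        score = evaluate()
        if score >= target:
            # The network obtained by the Nth learning meets the set condition.
            return n, score
    return max_rounds, score

# Toy stand-ins: each round raises the number of correct detections out of 100.
state = {"correct": 50}

def toy_step():
    state["correct"] = min(100, state["correct"] + 10)

def toy_evaluate():
    return state["correct"] / 100

rounds, final_score = train_until(toy_step, toy_evaluate, target=0.95)
```

The `max_rounds` cap is an added safeguard and is not part of the embodiment.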
And a substep S204, performing machine learning on a preset second anti-interference feature detection network based on a joint model training strategy and the first target anti-interference feature detection network to obtain a second target anti-interference feature detection network, so that the parameter quantity of the trained second target anti-interference feature detection network is smaller than that of the first target anti-interference feature detection network.
In this embodiment, the second anti-interference feature detection network may be understood as a network (small network) having a smaller parameter quantity than the first anti-interference feature detection network. Based on the above, machine learning is performed on a preset second anti-interference feature detection network based on a joint model training strategy and the first target anti-interference feature detection network to obtain a second target anti-interference feature detection network, which can be implemented in the following way:
and performing machine learning on a preset second anti-interference feature detection network based on a preset model training evaluation index and the first target anti-interference feature detection network to obtain a second target anti-interference feature detection network.
In this embodiment, the preset model training evaluation index may be a preset loss function, which is not limited herein.
Further, in the process of performing machine learning on a preset second anti-interference feature detection network based on a preset model training evaluation index and the first target anti-interference feature detection network to obtain a second target anti-interference feature detection network, when the value of the preset model training evaluation index obtained by the ith learning falls within a set numerical interval, the second anti-interference feature detection network obtained by the ith learning is determined as the second target anti-interference feature detection network. It is understood that the set numerical interval may be an interval close to 0, for example, 0.01 to 0.03, which is not limited herein. In some examples, the learning termination condition of the second anti-interference feature detection network may be that the model training evaluation index (e.g., a loss function value) approaches 0.
In this embodiment, the second anti-interference feature detection network is obtained based on a joint model training strategy, which essentially trains a small network (small model) on the basis of a large network (large model), so that the parameter quantity of the small network is prevented from growing while the prediction accuracy of the small network is ensured. The parameter quantity of the resulting second anti-interference feature detection network is therefore reduced relative to that of the first anti-interference feature detection network, so that the second anti-interference feature detection network can run directly on the service acquisition terminal to perform feature detection of the service data stream there. The detection work of the server is thus distributed across the service acquisition terminals, which reduces the burden on the server and makes full use of the computing capability of each service acquisition terminal. Meanwhile, deploying the second anti-interference feature detection network on the service acquisition terminal ensures the real-time performance of target feature detection on the service data stream of the service acquisition terminal.
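Training a small network under the guidance of a large one resembles what is commonly called knowledge distillation. The embodiment does not name a loss function, so the following is only one common choice, shown as an assumption: a weighted sum of the cross-entropy on the true label and the temperature-scaled KL divergence between the large network's and the small network's softened outputs. The function names and the `temperature` and `alpha` parameters are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label, temperature=2.0, alpha=0.5):
    """alpha * CE(student, label) + (1 - alpha) * T^2 * KL(teacher_T || student_T)."""
    hard = -math.log(softmax(student_logits)[label])
    t_soft = softmax(teacher_logits, temperature)
    s_soft = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(t_soft, s_soft) if t > 0)
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft

# The small network agrees with the large network in the first case and not in the second.
matched = distillation_loss([2.0, 0.0], [2.0, 0.0], label=0)
mismatched = distillation_loss([0.0, 2.0], [2.0, 0.0], label=0)
```

A loss of this kind decreases as the small network's outputs approach those of the large network, so a termination rule based on the loss value falling into a small interval near 0 fits naturally.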
Substep S205: sending the second target anti-interference feature detection network to the service acquisition terminal, so that the service acquisition terminal performs target feature detection on the collected service data stream according to the second target anti-interference feature detection network to obtain service feature information of the collected service data stream.
In this embodiment, the service acquisition terminal may be a mobile phone, a tablet computer, a notebook computer, or another portable terminal, which is not limited herein. In an actual implementation, the service characteristic information may be determined by the service acquisition terminal and the server in cooperation. To this end, the step described in substep S205, in which the service acquisition terminal performs target feature detection on the collected service data stream according to the second target anti-interference feature detection network to obtain the service characteristic information of the collected service data stream, may be implemented as follows.
First, the service acquisition terminal uses the second target anti-interference feature detection network to detect the to-be-identified data stream feature corresponding to a target block of the collected service data stream; the target block may be a block of the collected service data stream that contains no interference data.
Then, the server receives the to-be-identified data stream feature sent by the service acquisition terminal, retrieves from a preset storage space the target service characteristic information that matches the to-be-identified data stream feature, and determines the target service characteristic information as the service characteristic information of the collected service data stream.
In some examples, to ensure the accuracy of target feature detection on the service data stream, the to-be-identified data stream feature needs to be mined further. To this end, retrieving from the preset storage space the target service characteristic information that matches the to-be-identified data stream feature may include the following.
(a1) Decomposing the to-be-identified data stream feature to obtain a plurality of sub-data stream features, and obtaining spatial domain feature description information of the plurality of sub-data stream features, together with m undetermined feature description sequences of the plurality of sub-data stream features corresponding to the m consecutive target feature detection moments before the current target feature detection moment, wherein the undetermined feature description sequence of each target feature detection moment comprises undetermined feature descriptions of the sub-data stream feature under a plurality of feature identification categories.
(a2) For each sub-data stream feature, obtaining the feature recognition degree offset sequence corresponding to each of its m undetermined feature description sequences. Each feature recognition degree offset sequence comprises the feature recognition degree offsets of the sub-data stream feature under the plurality of feature identification categories, and each feature recognition degree offset represents the offset between the current feature recognition degree and the offset feature recognition degree under one feature identification category.
(a3) Obtaining the feature recognition degree offset of each sub-data stream feature at the current target feature detection moment by using a learned feature recognition degree adjustment network, according to the spatial domain feature description information of each sub-data stream feature and the m feature recognition degree offset sequences corresponding to the m undetermined feature description sequences. The feature recognition degree adjustment network is obtained by learning from a plurality of network learning samples, and each network learning sample comprises the spatial domain feature description information of a sub-data stream feature and the feature recognition degree offset sequences of m+1 consecutive target feature detection moments. The feature recognition degree offset represents the offset between the current feature recognition degree and the offset feature recognition degree of the sub-data stream feature.
In this embodiment, the feature recognition degree adjustment network may be obtained through learning in the following learning process:
First, a large number of network learning samples are acquired from a network learning sample library.
Then, using the acquired network learning samples, the feature recognition degree adjustment network is trained for multiple rounds according to set learning parameters. Each round of learning comprises: obtaining, through the feature recognition degree adjustment network, the feature recognition degree offset of the sub-data stream feature of each network learning sample at the (m+1)-th target feature detection moment, according to the spatial domain feature description information and the feature recognition degree offset sequences of the first m of the m+1 consecutive target feature detection moments; obtaining a network evaluation index of the feature recognition degree adjustment network according to that predicted feature recognition degree offset and the feature recognition degree offset sequence of the (m+1)-th target feature detection moment recorded in the network learning sample; determining, according to the network evaluation index, whether to continue learning; and, if learning is to continue, modifying the network parameters of the feature recognition degree adjustment network and carrying out the next round of learning with the modified network.
In this embodiment, the feature recognition degree adjustment network may include a feature noise recognition network layer and a feature segment splicing network layer. On this basis, for each sub-data stream feature, obtaining the feature recognition degree offset by using the feature recognition degree adjustment network may include: obtaining a feature noise recognition index of the sub-data stream feature through the feature noise recognition network layer according to the m feature recognition degree offset sequences; obtaining a feature segment splicing index of the sub-data stream feature through the feature segment splicing network layer according to the spatial domain feature description information; and obtaining the feature recognition degree offset at the current target feature detection moment from the feature noise recognition index and the feature segment splicing index, based on the network layer transmission parameters of the feature noise recognition network layer and the feature segment splicing network layer.
(a4) Adjusting the current feature recognition degree of each sub-data stream feature by its feature recognition degree offset at the current target feature detection moment; determining target sub-data stream features from the plurality of sub-data stream features according to the adjusted current feature recognition degree of each sub-data stream feature; and performing feature combination on the to-be-identified data stream feature according to the target sub-data stream features to obtain a to-be-analyzed feature for data feature analysis.
(a5) Obtaining, in the preset storage space, the pre-stored data stream feature that has the smallest feature difference from the to-be-analyzed feature, and determining the service characteristic information associated with that pre-stored data stream feature as the target service characteristic information matching the to-be-identified data stream feature. The preset storage space may be a storage location, set in advance at a specified path, for storing the service characteristic information associated with service data streams.
In this way, the to-be-identified data stream feature can be further mined and combined into a to-be-analyzed feature for data feature analysis, and the target service characteristic information matching the to-be-identified data stream feature is then determined based on the to-be-analyzed feature, which helps ensure the accuracy of target feature detection on the service data stream as far as possible.
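Steps (a4) and (a5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described by the embodiment: the adjustment is taken to be additive, selection uses an assumed threshold of 0.5, combination is plain concatenation, the feature difference is the Euclidean distance, and every name and value (including the labels "payment" and "login") is hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def select_target_features(
    recognition: Dict[str, float],  # current recognition degree per sub-feature
    offsets: Dict[str, float],      # offset obtained for the current moment
    threshold: float = 0.5,         # assumed rule: keep adjusted degrees >= threshold
) -> List[str]:
    """Adjust each recognition degree by its offset and keep the features that pass."""
    adjusted = {name: recognition[name] + offsets[name] for name in recognition}
    return [name for name, degree in adjusted.items() if degree >= threshold]


def combine(sub_features: Dict[str, Vector], targets: List[str]) -> List[float]:
    """Assumed combination rule: concatenate the selected sub-feature vectors."""
    combined: List[float] = []
    for name in targets:
        combined.extend(sub_features[name])
    return combined


def nearest_stored(feature: Vector, store: List[Tuple[Vector, str]]) -> str:
    """Return the information whose stored feature is closest to `feature`."""
    _, info = min(store, key=lambda entry: math.dist(entry[0], feature))
    return info


sub_features = {"a": [1.0, 0.0], "b": [0.2, 0.9], "c": [0.5, 0.5]}
recognition = {"a": 0.60, "b": 0.45, "c": 0.30}
offsets = {"a": -0.05, "b": 0.10, "c": -0.10}

targets = select_target_features(recognition, offsets)
to_analyze = combine(sub_features, targets)

# Hypothetical preset storage space: (pre-stored feature, associated information).
storage = [([1.0, 0.1, 0.2, 1.0], "payment"), ([0.0, 1.0, 0.9, 0.1], "login")]
print(targets, nearest_stored(to_analyze, storage))
```

Choosing the minimum-distance entry mirrors the "smallest feature difference" criterion of step (a5); any other difference measure could be substituted in `nearest_stored`.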
As further shown in fig. 2, the big data collecting apparatus 110 included in the server 100 may include a plurality of software functional modules for implementing the corresponding steps of the service big data acquisition method for artificial intelligence machine learning. In detail, in this embodiment, the big data collecting apparatus 110 may include a time-space domain node acquisition module 111 and a topology fusion analysis module 112.
The time-space domain node acquisition module 111 is configured to collect node data of time domain nodes and of spatial domain nodes, respectively, for service data generated within a preset service data acquisition range used for service big data acquisition, so as to obtain time-space domain node data. In this embodiment, the time-space domain node data includes the time domain node data and the spatial domain node data generated in the service data acquisition range.
The topology fusion analysis module 112 is configured to perform topology fusion analysis on the time-space domain node data to obtain a plurality of service data streams in the service data acquisition range and the service characteristic information corresponding to the service data streams, and to obtain a service data stream sample set according to the service data streams and the corresponding service characteristic information, the sample set serving as service data learning samples for machine learning.
It should be understood that the time-space domain node acquisition module 111 and the topology fusion analysis module 112 may be configured to perform the method steps corresponding to step S31 and step S32 shown in fig. 3, respectively. For details and the specific implementation of the two modules, reference may be made to the corresponding descriptions of step S31 and step S32, which are not repeated here.
In summary, in the service big data acquisition method and the server for artificial intelligence machine learning provided by the embodiments of the present invention, node data of time domain nodes and of spatial domain nodes is collected, respectively, for service data generated within a preset service data acquisition range used for service big data acquisition, so as to obtain time-space domain node data; topology fusion analysis is then performed on the time-space domain node data to obtain a plurality of service data streams in the service data acquisition range and the service characteristic information corresponding to the service data streams, which serve as service data learning samples for artificial intelligence machine learning. In this way, service data is collected along two data dimensions, the time domain and the spatial domain, and because the two dimensions are correlated, the collected service data streams and related information reflect the data characteristics of the service data more accurately, which can improve the learning effect and application effect of subsequent machine learning or artificial intelligence model training.
The embodiments described above are only some, not all, of the embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures can be arranged and designed in a wide variety of different configurations. Therefore, the detailed description of the embodiments of the present invention provided in the drawings is not intended to limit the scope of the present invention, but merely represents selected embodiments of the present invention. The protection scope of the present invention shall therefore be subject to the protection scope of the claims. Moreover, all other embodiments obtained by a person skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.

Claims (8)

1. A service big data acquisition method based on time-space domain features is characterized by comprising the following steps:
acquiring a plurality of time domain node service data and a plurality of spatial domain node service data in a service data acquisition range;
acquiring time domain correlation parameters and time domain feature differences among the plurality of time domain node service data, and acquiring spatial domain correlation parameters and spatial domain feature differences among the plurality of spatial domain node service data;
combining the plurality of time domain node service data according to the time domain correlation parameters and the time domain feature differences to obtain time domain data topological distributions in the service data acquisition range; wherein one time domain data topological distribution comprises at least one time domain node service data;
combining the plurality of spatial domain node service data according to the spatial domain correlation parameters and the spatial domain feature differences to obtain spatial domain data topological distributions in the service data acquisition range; wherein one spatial domain data topological distribution comprises at least one spatial domain node service data;
and performing topological fusion on the time domain data topological distributions and the spatial domain data topological distributions to obtain a service data stream sample set for artificial intelligence machine learning.
2. The method of claim 1, further comprising:
performing topological fusion on each time domain data topological distribution and each space domain data topological distribution generated in the service data acquisition range according to the service topological relation between the time domain data topological distribution and the space domain data topological distribution to obtain a plurality of topological distribution fusion groups; the spatial domain data topological distribution in each topological distribution fusion group respectively comprises second spatial domain node service data in the service data acquisition range;
determining a spatial domain data topological distribution that has not been matched with any time domain data topological distribution and has not undergone topological fusion as a to-be-processed spatial domain data topological distribution, and acquiring first topological distribution description information of the to-be-processed spatial domain data topological distribution according to first spatial domain node service data contained in the to-be-processed spatial domain data topological distribution; the first spatial domain node service data being contained in the service data acquisition range;
respectively acquiring second topological distribution description information of the spatial domain data topological distribution in each topological distribution fusion group according to the second spatial domain node service data included in each topological distribution fusion group;
acquiring characteristic differences between the first topological distribution description information and second topological distribution description information corresponding to each topological distribution fusion group;
determining topological correlation parameters between the spatial domain data topological distribution in each topological distribution fusion group and the spatial domain data topological distribution to be processed respectively according to the characteristic difference corresponding to each topological distribution fusion group;
counting a target topological distribution fusion group with topological correlation parameters not less than a preset correlation parameter threshold, and determining service characteristic information contained in time domain data topological distribution in the target topological distribution fusion group as service characteristic information associated with the spatial domain data topological distribution to be processed;
performing topological fusion on the service characteristic information associated with the spatial domain data topological distribution to be processed and the spatial domain data topological distribution to be processed to obtain a characteristic topological fusion group;
and determining a service data stream in the service data acquisition range and service characteristic information corresponding to the service data stream according to the characteristic topology fusion group and the plurality of topology distribution fusion groups, and obtaining the service data stream sample set according to the service data stream and the corresponding service characteristic information.
3. The method according to claim 2, wherein the topologically fusing each time domain data topological distribution and each space domain data topological distribution generated in the service data acquisition range according to the service topological relation between the time domain data topological distribution and the space domain data topological distribution to obtain a plurality of topological distribution fusion groups comprises:
determining each spatial domain data topological distribution generated in the service data acquisition range as a local spatial domain topological distribution, and determining each time domain data topological distribution generated in the service data acquisition range as a local time domain topological distribution; wherein the spatial domain node service data in the local spatial domain topological distribution is obtained by performing data acquisition on a target service node in the service data acquisition range;
acquiring time domain node service data in the target service node; calculating a service data correlation parameter between the time domain node service data in the target service node and each time domain node service data in the local time domain topological distribution, and determining the service topological relation between the local spatial domain topological distribution and the local time domain topological distribution according to the calculated service data correlation parameter;
and when the service data correlation parameter is not less than a preset correlation parameter threshold, performing topological fusion on the local spatial domain topological distribution and the local time domain topological distribution to obtain the plurality of topological distribution fusion groups.
4. The method of claim 2, wherein there are a plurality of service data segments of the first spatial domain node service data; the acquiring the first topological distribution description information of the to-be-processed spatial domain data topological distribution according to the first spatial domain node service data contained in the to-be-processed spatial domain data topological distribution comprises:
acquiring service data description information corresponding to each first spatial domain node service data of the plurality of first spatial domain node service data;
acquiring first global service characteristic information corresponding to the plurality of first spatial domain node service data according to the service data description information corresponding to each first spatial domain node service data;
and determining the first global service characteristic information as the first topological distribution description information.
5. The method of claim 2, wherein the plurality of topological distribution fusion groups comprises a topological distribution fusion group Ri, i being not greater than the total number of the plurality of topological distribution fusion groups; the topological distribution fusion group Ri comprises a plurality of service data segments of second spatial domain node service data; the respectively acquiring second topological distribution description information of the spatial domain data topological distribution in each topological distribution fusion group according to the second spatial domain node service data included in each topological distribution fusion group comprises:
acquiring service data description information corresponding to each second spatial domain node service data of the plurality of second spatial domain node service data included in the topological distribution fusion group Ri;
acquiring second global service characteristic information corresponding to the plurality of second spatial domain node service data according to the service data description information corresponding to each second spatial domain node service data;
and determining the second global service characteristic information as the second topological distribution description information of the spatial domain data topological distribution in the topological distribution fusion group Ri.
6. The method according to claim 2, wherein there are a plurality of to-be-processed spatial domain data topological distributions; the method further comprises:
when the number of target topological distribution fusion groups is not greater than a first preset value, determining, for each to-be-processed spatial domain data topological distribution, the topological distribution fusion group containing the spatial domain data topological distribution that has the largest topological correlation parameter with that to-be-processed spatial domain data topological distribution as the candidate topological fusion group corresponding to that to-be-processed spatial domain data topological distribution;
determining the service characteristic information contained in the time domain data topological distribution in the candidate topological fusion group corresponding to each to-be-processed spatial domain data topological distribution as the candidate service characteristic information corresponding to that to-be-processed spatial domain data topological distribution;
determining a plurality of data feature descriptions corresponding to preset target service characteristic information according to the candidate service characteristic information corresponding to each to-be-processed spatial domain data topological distribution, and acquiring a first statistical result of the plurality of data feature descriptions in the service characteristic information contained in the time domain data topological distributions of the plurality of topological distribution fusion groups;
determining, according to the first statistical result, a first target data feature description of each to-be-processed spatial domain data topological distribution for the preset target service characteristic information;
determining the preset target service characteristic information having the first target data feature description corresponding to each to-be-processed spatial domain data topological distribution as the service characteristic information associated with that to-be-processed spatial domain data topological distribution; wherein a second statistical result of the plurality of data feature descriptions in the service characteristic information associated with each to-be-processed spatial domain data topological distribution is equal to the first statistical result;
when the number of target topological distribution fusion groups is greater than a second preset value, counting the number of the plurality of data feature descriptions of the preset target service characteristic information in the service characteristic information contained in the time domain node service data of the target topological distribution fusion groups; the plurality of data feature descriptions being determined according to the service characteristic information contained in the time domain data topological distributions in the target topological distribution fusion groups;
determining, from the plurality of data feature descriptions, a second target data feature description of the to-be-processed spatial domain data topological distribution for the preset target service characteristic information, according to the number and the topological correlation parameters between the to-be-processed spatial domain data topological distribution and the target topological distribution fusion groups;
and determining the preset target service characteristic information having the second target data feature description as the service characteristic information associated with the to-be-processed spatial domain data topological distribution.
7. The method of claim 2, further comprising:
determining service characteristic information contained in time domain data topological distribution in the topological distribution fusion groups as service characteristic information contained in the topological distribution fusion groups;
determining each of the plurality of topology distribution fusion groups and the feature topology fusion group as a target topology fusion group in the service data acquisition range; determining the service characteristic information contained in the target topology fusion group as target service characteristic information;
adding the same sequence number to the target service characteristic information and the spatial domain data topological distribution in the target topological fusion group;
writing the target service characteristic information with the sequence number into a first data area, a second data area and a third data area respectively; the data reading speed of the first data area is greater than that of the second data area; the data reading speed of the second data area is higher than that of the third data area; the information writing amount of the first data area aiming at the target service characteristic information is smaller than that of the second data area aiming at the target service characteristic information; the information writing amount of the second data area aiming at the target service characteristic information is smaller than the information writing amount of the third data area aiming at the target service characteristic information;
the determining the service data stream in the service data acquisition range and the service feature information corresponding to the service data stream according to the feature topology fusion group and the plurality of topology distribution fusion groups includes:
determining the service data flow in the service data acquisition range according to the spatial domain data topological distribution in the target topological fusion group;
generating a first information reading request for acquiring the target service characteristic information in the first data area according to the sequence number corresponding to the spatial domain data topological distribution in the target topological fusion group, and generating a second information reading request for reading the target service characteristic information in the second data area according to the first information reading request when the target service characteristic information is not read from the first data area according to the first information reading request; when the target service characteristic information is not acquired from the second data area according to the second information reading request, generating a third information reading request for acquiring the target service characteristic information from the third data area according to the second information reading request; and reading the target service characteristic information from the third data area according to the third information reading request, and determining the obtained target service characteristic information as service characteristic information corresponding to the service data stream.
8. A server comprising a processor and a machine-readable storage medium coupled to the processor, the machine-readable storage medium storing a program, instructions or code, the processor being configured to execute the program, instructions or code in the machine-readable storage medium to implement the method of any one of claims 1-7.
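The three-area lookup recited in claim 7 can be sketched as follows. This is an illustrative sketch only: the class name `TieredFeatureStore`, the capacity-based write policy, and the sample entries are assumptions, chosen so that the first area holds less than the second and the second less than the third, as the claim requires.

```python
from typing import Dict, List, Optional, Tuple


class TieredFeatureStore:
    """Three data areas, ordered from fastest and smallest to slowest and largest."""

    def __init__(self, first_capacity: int, second_capacity: int) -> None:
        # Assumed policy: the first two areas are bounded, the third is not.
        self.capacities: List[Optional[int]] = [first_capacity, second_capacity, None]
        self.areas: List[Dict[int, str]] = [{}, {}, {}]

    def write(self, sequence_number: int, info: str) -> None:
        """Write the information to every area that still has room for it."""
        for area, capacity in zip(self.areas, self.capacities):
            has_room = capacity is None or len(area) < capacity
            if has_room or sequence_number in area:
                area[sequence_number] = info

    def read(self, sequence_number: int) -> Optional[Tuple[int, str]]:
        """Try the first area, then the second, then the third.

        Returns (area_number, info), or None if no area holds the entry.
        """
        for area_number, area in enumerate(self.areas, start=1):
            if sequence_number in area:
                return area_number, area[sequence_number]
        return None


store = TieredFeatureStore(first_capacity=1, second_capacity=2)
for seq in (1, 2, 3):
    store.write(seq, f"info-{seq}")  # hypothetical service characteristic information

for seq in (1, 2, 3):
    print(seq, store.read(seq))
```

Each read falls through to the next area only when the faster one misses, which corresponds to the first, second, and third information reading requests of the claim.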

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110833263.0A CN113569938A (en) 2021-01-20 2021-01-20 Service big data acquisition method and server based on time-space domain characteristics

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110833263.0A CN113569938A (en) 2021-01-20 2021-01-20 Service big data acquisition method and server based on time-space domain characteristics
CN202110077131.XA CN112801156B (en) 2021-01-20 2021-01-20 Business big data acquisition method and server for artificial intelligence machine learning

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202110077131.XA Division CN112801156B (en) 2021-01-20 2021-01-20 Business big data acquisition method and server for artificial intelligence machine learning

Publications (1)

Publication Number Publication Date
CN113569938A true CN113569938A (en) 2021-10-29

Family

ID=75810746

Family Applications (3)

Application Number Title Priority Date Filing Date
CN202110833263.0A Withdrawn CN113569938A (en) 2021-01-20 2021-01-20 Service big data acquisition method and server based on time-space domain characteristics
CN202110833252.2A Withdrawn CN113569937A (en) 2021-01-20 2021-01-20 Artificial intelligence model machine learning method based on big data and server
CN202110077131.XA Active CN112801156B (en) 2021-01-20 2021-01-20 Business big data acquisition method and server for artificial intelligence machine learning

Country Status (1)

Country Link
CN (3) CN113569938A (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116257349A (en) * 2021-12-10 2023-06-13 华为技术有限公司 Cluster system management method and device

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102717817B (en) * 2012-06-27 2015-07-01 李志恒 System and method for publishing information at high speed railway platform
CN105095242B (en) * 2014-04-30 2018-07-27 国际商业机器公司 A kind of method and apparatus of label geographic area
CN105930860B (en) * 2016-04-13 2019-12-10 闽江学院 Simulation analysis method for classification optimization model of temperature sensing big data in intelligent building
CN107402976B (en) * 2017-07-03 2020-10-30 国网山东省电力公司经济技术研究院 Power grid multi-source data fusion method and system based on multi-element heterogeneous model
CN108549901A (en) * 2018-03-12 2018-09-18 佛山市顺德区中山大学研究院 A kind of iteratively faster object detection method based on deep learning
CN108549937A (en) * 2018-04-24 2018-09-18 厦门中控智慧信息技术有限公司 A kind of knowledge migration method and device of detection network
EP3757895A1 (en) * 2019-06-28 2020-12-30 Robert Bosch GmbH Method for estimating a global uncertainty of a neural network
CN111970356A (en) * 2019-12-27 2020-11-20 孟小峰 Environment monitoring method, device and system based on Internet of things and network communication
CN112269975A (en) * 2020-03-31 2021-01-26 周亚琴 Internet of things artificial intelligence face verification method and system and Internet of things cloud server
CN112085102B (en) * 2020-09-10 2023-03-10 西安电子科技大学 No-reference video quality evaluation method based on three-dimensional space-time characteristic decomposition

Also Published As

Publication number Publication date
CN112801156B (en) 2021-09-10
CN113569937A (en) 2021-10-29
CN112801156A (en) 2021-05-14

Similar Documents

Publication Publication Date Title
CN109034660B (en) Method and related device for determining risk control strategy based on prediction model
CN112801155B (en) Business big data analysis method based on artificial intelligence and server
CN111614690A (en) Abnormal behavior detection method and device
CN111796957B (en) Transaction abnormal root cause analysis method and system based on application log
CN112347367A (en) Information service providing method, information service providing device, electronic equipment and storage medium
CN111800289B (en) Communication network fault analysis method and device
CN112329816A (en) Data classification method and device, electronic equipment and readable storage medium
CN109636212B (en) Method for predicting actual running time of job
CN109214178A (en) APP application malicious act detection method and device
CN114422267A (en) Flow detection method, device, equipment and medium
WO2022053163A1 (en) Distributed trace anomaly detection with self-attention based deep learning
CN112801156B (en) Business big data acquisition method and server for artificial intelligence machine learning
CN113132523B (en) Call detection model training method and call detection method
CN110928636A (en) Virtual machine live migration method, device and equipment
CN110910241A (en) Cash flow evaluation method, apparatus, server device and storage medium
CN111209105A (en) Capacity expansion processing method, capacity expansion processing device, capacity expansion processing equipment and readable storage medium
CN115567371A (en) Abnormity detection method, device, equipment and readable storage medium
CN114745335A (en) Network traffic classification, device, storage medium, and electronic apparatus
CN111798237B (en) Abnormal transaction diagnosis method and system based on application log
CN110704614B (en) Information processing method and device for predicting user group type in application
CN111835541B (en) Method, device, equipment and system for detecting aging of flow identification model
CN115757002A (en) Energy consumption determination method, device and equipment and computer readable storage medium
CN113626340A (en) Test requirement identification method and device, electronic equipment and storage medium
CN115858911A (en) Information recommendation method and device, electronic equipment and computer-readable storage medium
CN112582080A (en) Internet of things equipment state monitoring method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20211029
