CN116821502B - Public opinion hotspot-based data management method and system

Info

Publication number: CN116821502B
Application number: CN202310801706.7A
Authority: CN
Inventors: 游士兵; 刘多晨曦; 张力; 张厚力; 张苗
Original assignee: Wuhan University WHU
Current assignee: Wuhan University WHU
Priority date: 2023-06-30
Filing date: 2023-06-30
Publication date: 2024-03-08
Anticipated expiration: 2043-06-30
Also published as: CN116821502A

Abstract

The invention provides a data management method and system based on public opinion hotspots. The method comprises: acquiring hotspot comments on network hotspots, and updating a data classification standard according to the hotspot comments, so as to obtain a recommendation data packet related to the network hotspots. The beneficial effect of the invention is that network data is recommended to the user from the perspective of network hotspots, so that the user can keep up with current hotspots and the user's needs are met.

Description

Public opinion hotspot-based data management method and system

Technical Field

The invention relates to the field of data management, in particular to a data management method and system based on public opinion hotspots.

Background

With the rapid development of the internet industry, short video platforms such as Douyin and Kuaishou have grown quickly, a large amount of network data (short videos, web pages and the like) is generated on social media, and browsing such network data has gradually become the main way people use the internet on their mobile phones.

At present, in order to improve the user access experience, information pushing technologies based on user preference have appeared, which push related information to users according to the information they usually browse. However, such technologies cause users to be shielded from some hotspot information, while a main purpose of going online is to keep up with current hotspots. A data management method based on public opinion hotspots is therefore needed.

Disclosure of Invention

The main purpose of the invention is to provide a data management method and system based on public opinion hotspots, so as to solve the problem that information pushing based on user preference shields the user from some hotspot information.

The invention provides a data management method based on public opinion hotspots, which comprises the following steps:

acquiring a plurality of hotspot comments related to each network hotspot based on the network hotspot;

obtaining public opinion comment words of each preset dimension from each hotspot comment;

converting each public opinion comment word into a dimension value according to a preset conversion relation;

extracting the dimension values corresponding to each hotspot comment according to each preset dimension to obtain a dimension value set corresponding to each preset dimension;

calculating fluctuation index values of the dimension value sets; the fluctuation index value is used for reflecting the fluctuation condition in the dimension value set;

selecting a preset dimension with the fluctuation index value larger than the preset index value as a target dimension;

acquiring a first data classification standard before updating, and updating the first data classification standard based on the target dimension to obtain a second data classification standard;

acquiring a plurality of network data from a preset database, and setting two linear functions $f_t(x)$ for the same target dimension by using a preset linear classifier, wherein $b_t = b_{t-1} + m_t$ and $b_1 = m_1$; $m_t$ represents a constant associated with the classification standard, $b_t$ represents the offset of the t-th linear function, $b_{t-1}$ represents the offset of the (t-1)-th linear function, t is a positive integer, w represents a weight vector whose number of dimensions is the same as the number of dimension value sets, $f_t(x)$ represents the t-th linear function, x represents the network data, and W is a preset parameter;

calculating the Euclidean distance between each linear function and each network data, extracting the maximum Euclidean distance and the minimum Euclidean distance of each linear function, and taking the difference between the maximum Euclidean distance and the minimum Euclidean distance as the information distance of the corresponding linear function;

calculating, by formula, the transformation parameter $A_i$ of the information distances of the two linear functions of the same target dimension, wherein $t_n$ represents the n-th information distance and $T(t_n)$ represents a preset calculation function based on $t_n$;

judging whether the transformation parameter $A_i$ is within a preset range;

and if the transformation parameter is within the preset range, selecting a plurality of network data from a network database based on the two linear functions so as to form a recommendation data packet related to the network hotspot.

Further, the step of calculating a fluctuation index value of each of the dimension value sets includes:

calculating, by formula, a fluctuation index value of each of the dimension value sets, wherein $E_i$ represents the fluctuation index value of the i-th dimension value set; when $p_{ij} = 0$, $\lim_{p_{ij} \to 0} p_{ij} \ln p_{ij} = 0$ is defined; $p_{ij}$ represents the intermediate value corresponding to the j-th hotspot comment in the i-th dimension value set; a standard value is likewise defined for the j-th hotspot comment in the i-th dimension value set; n represents the number of hotspot comments; $x_{ij}$ represents the dimension value; and $\min(x_{ij})$ and $\max(x_{ij})$ represent the minimum and maximum values in the i-th dimension value set, respectively.

Further, the step of calculating the euclidean distance between each linear function and the respective network data includes:

inputting the fluctuation index value and the network hotspots into a preset function acquisition model to obtain a corresponding conversion function; the function acquisition model is trained by taking different network hot spots and index values as inputs and corresponding conversion functions as outputs;

performing space mapping on the network data according to the conversion function to obtain target network data mapped by each network data;

and calculating Euclidean distance between each linear function and corresponding network data based on the target network data.

Further, after the step of judging whether the transformation parameter $A_i$ is within the preset range, the method further comprises:

if the transformation parameters are not in the preset range, adjusting weight vectors in the linear function until the transformation parameters are in the preset range, so as to obtain two target linear functions;

and selecting a plurality of network data from a network database based on the two target linear functions to form the network hotspot-related recommendation data packet.

Further, after the step of selecting a plurality of network data from a network database based on the two linear functions to form the recommendation data packet related to the network hotspot if the transformation parameter is within the preset range, the method further comprises:

acquiring time information of each selected network data;

setting a priority order for each network data based on the time information;

and orderly pushing each network data to the user based on the priority order.

The invention also provides a data management system based on public opinion hotspots, which comprises:

the first acquisition module is used for acquiring a plurality of hotspot comments related to each network hotspot based on the network hotspot;

the second acquisition module is used for acquiring public opinion comment words with each preset dimension from each hot spot comment;

the conversion module is used for converting each public opinion comment word into a dimension value according to a preset conversion relation;

the extraction module is used for extracting the dimension values corresponding to each hotspot comment according to each preset dimension to obtain a dimension value set corresponding to each dimension;

the first calculation module is used for calculating fluctuation index values of the dimension value sets; the fluctuation index value is used for reflecting the fluctuation condition in the dimension value set;

the first selecting module is used for selecting a preset dimension with a fluctuation index value larger than a preset index value as a target dimension;

the third acquisition module is used for acquiring a first data classification standard before updating and updating the first data classification standard based on the target dimension to obtain a second data classification standard;

a fourth obtaining module, configured to obtain a plurality of network data from a preset database, and set two linear functions $f_t(x)$ for the same target dimension by using a preset linear classifier, wherein $b_t = b_{t-1} + m_t$ and $b_1 = m_1$; $m_t$ represents a constant associated with the classification standard, $b_t$ represents the offset of the t-th linear function, $b_{t-1}$ represents the offset of the (t-1)-th linear function, t is a positive integer, w represents a weight vector whose number of dimensions is the same as the number of dimension value sets, $f_t(x)$ represents the t-th linear function, x represents the network data, and W is a preset parameter;

the second calculation module is used for calculating the Euclidean distance between each linear function and each network data, extracting the maximum Euclidean distance and the minimum Euclidean distance of each linear function, and taking the difference between the maximum Euclidean distance and the minimum Euclidean distance as the information distance of the corresponding linear function;

a third calculation module, configured to calculate, by formula, the transformation parameter $A_i$ of the information distances of the two linear functions of the same target dimension, wherein $t_n$ represents the n-th information distance and $T(t_n)$ represents a preset calculation function based on $t_n$;

a judging module for judging whether the transformation parameter $A_i$ is within a preset range;

and the second selecting module is used for selecting a plurality of network data from a network database based on the two linear functions if the transformation parameter is within the preset range, so as to form the recommendation data packet related to the network hotspot.

Further, the second computing module includes:

the input sub-module is used for inputting the fluctuation index value and the network hot spots into a preset function acquisition model to obtain a corresponding conversion function; the function acquisition model is trained by taking different network hot spots and index values as inputs and corresponding conversion functions as outputs;

the mapping sub-module is used for carrying out space mapping on the network data according to the conversion function to obtain target network data after mapping each network data;

and the calculation sub-module is used for calculating the Euclidean distance between each linear function and the corresponding network data based on the target network data.

Further, the first computing module includes:

a fluctuation index value calculation sub-module for calculating, by formula, a fluctuation index value of each of the dimension value sets, wherein $E_i$ represents the fluctuation index value of the i-th dimension value set; when $p_{ij} = 0$, $\lim_{p_{ij} \to 0} p_{ij} \ln p_{ij} = 0$ is defined; $p_{ij}$ represents the intermediate value corresponding to the j-th hotspot comment in the i-th dimension value set; a standard value is likewise defined for the j-th hotspot comment in the i-th dimension value set; n represents the number of hotspot comments; $x_{ij}$ represents the dimension value; and $\min(x_{ij})$ and $\max(x_{ij})$ represent the minimum and maximum values in the i-th dimension value set, respectively.

Further, the public opinion hotspot-based data management system further comprises:

the adjustment module is used for adjusting the weight vector in the linear function if the transformation parameter is not in the preset range, until the transformation parameter is in the preset range, so as to obtain two target linear functions;

and the network data selection module is used for selecting a plurality of network data from a network database based on the two target linear functions so as to form the network hotspot related recommendation data packet.

the time information acquisition module is used for acquiring the time information of each selected network data;

a priority order setting module, configured to set a priority order for each network data based on the time information;

and the pushing module is used for orderly pushing each network data to the user based on the priority order.

The beneficial effects of the invention are as follows: by acquiring the hotspot comments of the network hotspots and updating the data classification standard accordingly, a recommendation data packet related to the network hotspots is obtained, and network data is recommended to the user from the perspective of network hotspots, so that the user can keep up with current hotspots and the user's needs are met.

Drawings

Fig. 1 is a flow chart of a data management method based on public opinion hotspots according to an embodiment of the invention;

Fig. 2 is a schematic block diagram of a data management system based on public opinion hotspots according to an embodiment of the present invention.

The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

It should be noted that, in the embodiments of the present invention, all directional indicators (such as up, down, left, right, front, and back) are merely used to explain the relative positional relationship, movement conditions, and the like between the components in a specific posture (as shown in the drawings), if the specific posture is changed, the directional indicators correspondingly change, and the connection may be a direct connection or an indirect connection.

The term "and/or" herein merely describes an association between associated objects and indicates that three relations are possible; for example, "A and/or B" may represent: A exists alone, A and B exist together, or B exists alone.

Furthermore, descriptions such as "first" and "second" are provided for descriptive purposes only and are not to be construed as indicating or implying relative importance or the number of the technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, but only on the basis that the combination can be implemented by those skilled in the art; when a combination is contradictory or cannot be implemented, it should be considered not to exist and not to fall within the scope of protection claimed by the present invention.

Referring to fig. 1, the present invention provides a data management method based on public opinion hotspots, including:

S1: acquiring a plurality of hotspot comments related to each network hotspot based on the network hotspot;

S2: obtaining public opinion comment words of each preset dimension from each hotspot comment;

S3: converting each public opinion comment word into a dimension value according to a preset conversion relation;

S4: extracting the dimension values corresponding to each hotspot comment according to each preset dimension to obtain a dimension value set corresponding to each dimension;

S5: calculating fluctuation index values of the dimension value sets; the fluctuation index value is used for reflecting the fluctuation condition in the dimension value set;

S6: selecting a preset dimension with the fluctuation index value larger than the preset index value as a target dimension;

S7: acquiring a first data classification standard before updating, and updating the first data classification standard based on the target dimension to obtain a second data classification standard;

S8: acquiring a plurality of network data from a preset database, and setting two linear functions $f_t(x)$ for the same target dimension by using a preset linear classifier, wherein $b_t = b_{t-1} + m_t$ and $b_1 = m_1$; $m_t$ represents a constant associated with the classification standard, $b_t$ represents the offset of the t-th linear function, $b_{t-1}$ represents the offset of the (t-1)-th linear function, t is a positive integer, w represents a weight vector whose number of dimensions is the same as the number of dimension value sets, $f_t(x)$ represents the t-th linear function, x represents the network data, and W is a preset parameter;

S9: calculating the Euclidean distance between each linear function and each network data, extracting the maximum Euclidean distance and the minimum Euclidean distance of each linear function, and taking the difference between the maximum Euclidean distance and the minimum Euclidean distance as the information distance of the corresponding linear function;

S10: calculating, by formula, the transformation parameter $A_i$ of the information distances of the two linear functions of the same target dimension, wherein $t_n$ represents the n-th information distance and $T(t_n)$ represents a preset calculation function based on $t_n$;

S11: judging whether the transformation parameter $A_i$ is within a preset range;

S12: if the transformation parameter is within the preset range, selecting a plurality of network data from a network database based on the two linear functions so as to form a recommendation data packet related to the network hotspot.

As described in step S1, a plurality of hotspot comments related to each network hotspot are acquired based on the network hotspot. A network hotspot is a term associated with hot news, such as the World Cup, hot news related to Mexico, or a short video. Specific hotspots may be obtained from the trending lists of various apps, such as the Weibo (microblog) trending list. Because the hotspot comments under a network hotspot reflect most of what users are paying attention to, they are acquired as the basis for pushing data; for example, some highly liked comments may be taken directly from the comment area.

As described in step S2, public opinion comment words of each preset dimension are obtained from each hotspot comment. The preset dimensions are determined in advance, for example football or basketball. The corresponding public opinion comment words may be positive or negative words, which can be determined from the sentiment polarity of the words.

As described in step S3, each public opinion comment word is converted into a dimension value according to a preset conversion relation. Specifically, a correspondence between public opinion comment words and dimension values may be set in advance, and the dimension value is then looked up from the public opinion comment word.
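Steps S2 and S3 can be illustrated with a minimal sketch. The lexicon `CONVERSION`, its words and its values are hypothetical and only stand in for the preset conversion relation; the whitespace tokenization is likewise an assumption made for illustration.

```python
# Hypothetical preset conversion relation:
# (preset dimension, public opinion comment word) -> dimension value.
CONVERSION = {
    ("football", "brilliant"): 1.0,
    ("football", "boring"): -1.0,
    ("basketball", "exciting"): 0.8,
}


def comment_to_dimension_values(comment, conversion=CONVERSION):
    """Return {dimension: [dimension values]} for the words found in one comment."""
    # Naive whitespace tokenization; a real system would use a proper tokenizer.
    tokens = comment.lower().split()
    values = {}
    for (dimension, word), value in conversion.items():
        if word in tokens:
            values.setdefault(dimension, []).append(value)
    return values


result = comment_to_dimension_values("The football final was brilliant")
```

A plain lookup table is used here because the text only requires a preset correspondence between words and values; any scoring scheme that yields a number per word would fit the same interface.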

As described in steps S4-S6, the dimension values corresponding to each hotspot comment are extracted according to each preset dimension to obtain a dimension value set for each dimension, and a fluctuation index value, which reflects the fluctuation within a dimension value set, is calculated for each set. When a hotspot is commented on many times and attracts many different attitudes, that is, when the fluctuation index value is large, the hotspot is considered controversial and users are very interested in it. The preset dimensions whose fluctuation index value is larger than the preset index value are therefore selected as target dimensions, which makes it convenient to screen out the corresponding network data (short videos, web pages and the like). The fluctuation index value may be the variance or the mean deviation of the dimension value set, or may be calculated in other ways, one of which is described in detail below.
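A minimal sketch of steps S4-S6 follows, using the variance option mentioned above as the fluctuation index value. The function name, the input layout (one dictionary per hotspot comment) and the threshold value are assumptions made for illustration.

```python
from statistics import pvariance


def select_target_dimensions(per_comment_values, preset_index_value):
    """per_comment_values: list of {dimension: dimension value}, one dict per comment."""
    # S4: collect the values of each preset dimension into a dimension value set.
    dimension_sets = {}
    for values in per_comment_values:
        for dimension, value in values.items():
            dimension_sets.setdefault(dimension, []).append(value)
    # S5: population variance is used as the fluctuation index value.
    fluctuation = {d: pvariance(v) for d, v in dimension_sets.items()}
    # S6: keep the dimensions whose index exceeds the preset index value.
    targets = [d for d, e in fluctuation.items() if e > preset_index_value]
    return dimension_sets, fluctuation, targets


comments = [
    {"football": 1.0, "basketball": 0.5},
    {"football": -1.0, "basketball": 0.5},
    {"football": 1.0},
]
sets, index, targets = select_target_dimensions(comments, preset_index_value=0.5)
```

In this example the football values disagree strongly while the basketball values are identical, so only football is selected as a target dimension.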

As described in step S7, the first data classification standard may be a standard set according to the user's browsing behavior, or a standard set based on a previous hotspot. A hotspot generally persists for a period of time, and an update is triggered when the hotspot changes; the first data classification standard is thus the most recent classification standard. The initial classification standard may be set manually, and each subsequent update is triggered by the real-time hotspot. Specifically, after the target dimension is acquired, the elements of the dimension value set corresponding to the target dimension are averaged, and the mean overwrites the corresponding dimension value in the first data classification standard to obtain the second data classification standard. There may be several target dimensions, in which case several dimensions of the classification standard are modified.
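The update in step S7 can be sketched as follows, assuming the classification standard is stored as a mapping from dimension to value. That representation and the function name are assumptions; the mean-overwrite rule is the one described above.

```python
def update_classification_standard(first_standard, dimension_sets, target_dimensions):
    """Return the second standard: target dimensions overwritten by the set mean."""
    # Copy so that the first data classification standard is left untouched.
    second_standard = dict(first_standard)
    for dimension in target_dimensions:
        values = dimension_sets[dimension]
        second_standard[dimension] = sum(values) / len(values)
    return second_standard


first = {"football": 0.2, "basketball": 0.4}
second = update_classification_standard(
    first, {"football": [1.0, 0.0, 0.5]}, ["football"]
)
```

Only the target dimensions change; every other dimension keeps its previous value, which matches the statement that several dimensions are modified only when there are several target dimensions.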

As described in step S8, a plurality of network data are acquired from a preset database, and two linear functions are set for the same target dimension by using a preset linear classifier. The preset database is the database of the corresponding app, for example a short video database, in which case the network data are short videos. Common linear classifiers include Bayesian classification, linear regression, logistic regression (LR), SVM (linear kernel), the single-layer perceptron, and so on. In an N-dimensional space, the resulting normalized linear function is a hyperplane. A linear classifier uses a hyperplane to separate positive and negative samples: on a two-dimensional plane they are separated by a straight line, in three-dimensional space by a plane, and in N-dimensional space by a hyperplane. Two linear functions are set in order to define a range, which makes it convenient to screen the required network data from the database.
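The way two linear functions bound a range can be sketched as below. The offset recurrence $b_1 = m_1$, $b_t = b_{t-1} + m_t$ comes from the text; the functional form `w . x + b_t` is an assumption made for illustration, the preset parameter W is not modeled, and the helper names are hypothetical.

```python
from itertools import accumulate


def offsets(m):
    """b_1 = m_1 and b_t = b_(t-1) + m_t, i.e. the running sum of the constants m_t."""
    return list(accumulate(m))


def make_linear_functions(w, m):
    """Build one function per offset; the form w . x + b_t is an assumed form."""
    def make(b):
        return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b
    return [make(b) for b in offsets(m)]


def in_band(x, f_first, f_second):
    """True when x lies between the two parallel hyperplanes (signs differ or zero)."""
    return f_first(x) * f_second(x) <= 0


f1, f2 = make_linear_functions(w=[1.0, 1.0], m=[-1.0, -2.0])
selected = [x for x in [(1.0, 1.0), (3.0, 3.0)] if in_band(x, f1, f2)]
```

Because both functions share the weight vector and differ only in offset, they describe parallel hyperplanes, and the sign test picks out the data lying between them.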

As described in step S9, the Euclidean distance between each normalized linear function and each network data is calculated, the maximum and minimum Euclidean distances of each linear function are extracted, and the maximum Euclidean distance minus the minimum Euclidean distance is taken as the information distance of the corresponding linear function. Here the Euclidean distance is the distance from each network data to the hyperplane defined by the normalized linear function.
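A minimal sketch of the information distance in step S9, assuming each linear function has the form `w . x + b` so that the point-to-hyperplane distance is `|w . x + b| / ||w||`. The function names are hypothetical.

```python
import math


def hyperplane_distance(w, b, x):
    """Euclidean distance from point x to the hyperplane w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def information_distance(w, b, data):
    """Maximum distance minus minimum distance over all network data."""
    distances = [hyperplane_distance(w, b, x) for x in data]
    return max(distances) - min(distances)


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
spread = information_distance([3.0, 4.0], 0.0, points)
```

The three sample points lie at distances 0, 5 and 10 from the hyperplane, so the information distance is their spread.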

As described in steps S10-S12, the transformation parameter $A_i$ of the information distances of the two linear functions of the same target dimension is calculated by formula; whether $A_i$ is within a preset range is judged; and if it is, a plurality of network data are selected from a network database based on the two linear functions to form the recommendation data packet related to the network hotspot. Here $t_n$ represents the n-th information distance and $T(t_n)$ represents a preset calculation function of $t_n$; specifically, $T(t_n) = a t_n + b$, where a and b are constants. The closer the transformation parameter of the pair of linear functions is to 1, the better the classification effect, and the farther from 1, the worse. A better classification result gives higher screening precision, so that network data based on public opinion hotspots can be obtained and recommended to the user from the perspective of network hotspots, allowing the user to keep up with current hotspots and meeting the user's needs.
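The range check in steps S10-S11 can be sketched as follows. Only $T(t_n) = a t_n + b$ is taken from the text. The ratio used for the transformation parameter is purely an assumption chosen so that equal information distances give a value of 1; it is not the formula of the method. The range bounds are also hypothetical.

```python
def T(t, a=1.0, b=0.0):
    """Preset calculation function T(t_n) = a * t_n + b, with constants a and b."""
    return a * t + b


def transformation_parameter(t1, t2, a=1.0, b=0.0):
    # ASSUMPTION: a ratio of the two transformed information distances,
    # used only because it equals 1 when the two distances agree.
    return T(t1, a, b) / T(t2, a, b)


def within_preset_range(value, low=0.8, high=1.25):
    """Hypothetical preset range centered loosely around 1."""
    return low <= value <= high


close_pair = transformation_parameter(9.0, 10.0)
far_pair = transformation_parameter(1.0, 4.0)
```

With these illustrative bounds, the pair with similar information distances passes the check and the pair with very different distances would trigger the weight adjustment described in the later embodiment.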

In one embodiment, the step S9 of calculating the euclidean distance between each linear function and the respective network data includes:

S901: inputting the fluctuation index value and the network hotspots into a preset function acquisition model to obtain a corresponding conversion function; the function acquisition model is trained by taking different network hotspots and index values as inputs and corresponding conversion functions as outputs;

S902: performing space mapping on the network data according to the conversion function to obtain target network data mapped from each network data;

S903: calculating the Euclidean distance between each linear function and the corresponding network data based on the target network data.

As described in steps S901-S903, the data is preprocessed. In actual calculation the function obtained may not be linear; the data can, however, be mapped into a feature space in which a corresponding linear function can be obtained, so a suitable conversion function is required. If linear classification can be performed directly, the data is mapped directly, that is, the conversion function is the identity mapping. Common conversion functions include linear, polynomial, and Gaussian radial basis conversion functions.

In one embodiment, the step S5 of calculating a fluctuation index value of each of the dimension value sets includes:

S501: calculating, by formula, a fluctuation index value of each of the dimension value sets, wherein $E_i$ represents the fluctuation index value of the i-th dimension value set; when $p_{ij} = 0$, $\lim_{p_{ij} \to 0} p_{ij} \ln p_{ij} = 0$ is defined; $p_{ij}$ represents the intermediate value corresponding to the j-th hotspot comment in the i-th dimension value set; a standard value is likewise defined for the j-th hotspot comment in the i-th dimension value set; n represents the number of hotspot comments; $x_{ij}$ represents the dimension value; and $\min(x_{ij})$ and $\max(x_{ij})$ represent the minimum and maximum values in the i-th dimension value set, respectively.

As described in step S501, the fluctuation index value is calculated as follows. The maximum and minimum values of the dimension value set are obtained first, and the standard value of each dimension value is calculated from them by formula; that is, each dimension value is standardized (normalized) so that excessively large values do not bias the result. The fluctuation index value of the i-th dimension value set is then calculated from the probability $p_{ij}$ with which the standard value of each dimension value occurs. The fluctuation index value calculated in this way fully considers the fluctuation of the dimension values within the same dimension, as well as the influence of extreme individual values on the overall index, so the result is a more useful reference.
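The definitions above (minimum and maximum of the set, a standard value, an intermediate value $p_{ij}$, and the convention that $p \ln p$ is 0 at $p = 0$) are consistent with an entropy-style index. The sketch below assumes min-max standardization, $p_{ij}$ as each standard value's share of the total, and the form $E_i = -\frac{1}{\ln n} \sum_j p_{ij} \ln p_{ij}$. All three are assumptions made for illustration, not the confirmed formula of the method.

```python
import math


def fluctuation_index(values):
    """Entropy-style index of one dimension value set (assumed form, see lead-in)."""
    n = len(values)
    low, high = min(values), max(values)
    if n < 2 or high == low:
        raise ValueError("need at least two values that are not all equal")
    # Assumed min-max standardization of each dimension value.
    standard = [(x - low) / (high - low) for x in values]
    total = sum(standard)
    # Assumed intermediate value: share of each standard value in the total.
    p = [s / total for s in standard]
    # Terms with p == 0 are skipped, matching the convention p * ln(p) -> 0.
    entropy_sum = sum(pj * math.log(pj) for pj in p if pj > 0)
    return -entropy_sum / math.log(n)


example = fluctuation_index([0.0, 1.0, 1.0])
```

The degenerate case in which all values are equal leaves the standardization undefined, so this sketch rejects it rather than guessing a value.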

In one embodiment, after the step S11 of judging whether the transformation parameter $A_i$ is within the preset range, the method further includes:

S1201: if the transformation parameter is not within the preset range, adjusting the weight vector in the linear function until the transformation parameter is within the preset range, so as to obtain two target linear functions;

S1202: selecting a plurality of network data from a network database based on the two target linear functions to form the recommendation data packet related to the network hotspot.

As described in steps S1201-S1202, when the transformation parameter is not within the preset range, the linear functions are considered to be set unreasonably, and the weight vector needs to be adjusted. The adjustment follows the corresponding standard and continues until the transformation parameter is within the preset range, which yields two target linear functions. A plurality of network data are then selected from the network database based on the two target linear functions to form the recommendation data packet related to the network hotspot. The selection is performed in the same way as described above and is not repeated here.

In one embodiment, after the step S12 of selecting a plurality of network data from a network database based on the two linear functions to form the recommendation data packet related to the network hotspot if the transformation parameter is within the preset range, the method further includes:

S1301: acquiring time information of each selected network data;

S1302: setting a priority order for each network data based on the time information;

S1303: pushing each network data to the user in sequence based on the priority order.

As described in steps S1301-S1303, the selected network data is pushed in order, which improves the user experience. Specifically, time information of each selected network data is acquired. The time information may be the upload time or, for a short video, the time at which the video was shot or produced; the application does not limit this. After the time information is acquired, a priority order is set for the network data. For example, the network data closest to the current time may be sent to the user first, so that the user browses the latest information and can keep up with hotspots.
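The ordering in steps S1301-S1303 can be sketched as a sort on the time information, newest first. The item identifiers and timestamps are hypothetical, and the choice of upload time as the time information is one of the options named above.

```python
from datetime import datetime


def order_by_recency(items):
    """items: iterable of (identifier, timestamp); newest items get highest priority."""
    return sorted(items, key=lambda item: item[1], reverse=True)


videos = [
    ("clip_a", datetime(2023, 6, 28, 9, 0)),
    ("clip_b", datetime(2023, 6, 30, 18, 30)),
    ("clip_c", datetime(2023, 6, 29, 12, 15)),
]
push_order = [name for name, _ in order_by_recency(videos)]
```

Sorting in descending time order sends the item closest to the current time first, assuming no timestamp lies in the future.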

Referring to fig. 2, the present invention further provides a data management system based on public opinion hotspots, including:

the first obtaining module 10 is configured to obtain a plurality of hotspot comments related to each network hotspot based on the network hotspot;

the second obtaining module 20 is configured to obtain public opinion comment words in each preset dimension from each hotspot comment;

the conversion module 30 is configured to convert each public opinion comment word into a dimension value according to a preset conversion relationship;

the extracting module 40 is configured to extract the dimension values corresponding to each hotspot comment according to each preset dimension, so as to obtain a dimension value set corresponding to each dimension;

a first calculation module 50, configured to calculate a fluctuation index value of each of the dimension value sets; the fluctuation index value is used for reflecting the fluctuation condition in the dimension value set;

a first selection module 60, configured to select a preset dimension, in which the fluctuation index value is greater than the preset index value, as a target dimension;

a third obtaining module 70, configured to obtain a first data classification standard before updating, and update the first data classification standard based on the target dimension, so as to obtain a second data classification standard;

a fourth obtaining module 80, configured to obtain a plurality of network data from a preset database, and set two linear functions $f_t(x)$ by using a preset linear classifier for the same target dimension; wherein $b_t = b_{t-1} + m_t$ and $b_1 = m_1$, $m_t$ represents a constant associated with the classification standard, $b_t$ represents the offset of the $t$-th linear function, $b_{t-1}$ represents the offset of the $(t-1)$-th linear function, $t$ is a positive integer, $w$ represents the weight vector, whose dimension equals the number of dimension value sets, $f_t(x)$ represents the $t$-th linear function, $x$ represents the network data, and $W$ is a preset parameter;

a second calculation module 90, configured to calculate the euclidean distance between each linear function and each network data, extract the maximum euclidean distance and the minimum euclidean distance of each linear function, and use the difference between the maximum euclidean distance and the minimum euclidean distance as the information distance of the corresponding linear function;

a third calculation module 100, configured to calculate, according to the formula, the transformation parameter $A_i$ of the information distances of the two linear functions of the same target dimension; wherein $t_n$ represents the $n$-th information distance, and $T(t_n)$ represents a preset calculation function based on $t_n$;

a judging module 110, configured to judge whether the transformation parameter $A_i$ is within a preset range;

the second selecting module 120 is configured to select a plurality of network data from a network database based on the two linear functions if the transformation parameter $A_i$ is within the preset range, so as to form the recommended data packet related to the network hotspot.
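The offset recurrence of the fourth obtaining module 80 and the information distance of the second calculation module 90 can be illustrated with a short sketch. The exact form of $f_t(x)$ is not recoverable from this text, so the sketch assumes, purely for illustration, a hyperplane $w \cdot x + b_t = 0$ and measures the Euclidean distance from each data point to it; the role of the preset parameter $W$ is omitted. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def offsets(m: Sequence[float]) -> List[float]:
    """Offsets from the recurrence b_1 = m_1, b_t = b_{t-1} + m_t."""
    b: List[float] = []
    for m_t in m:
        b.append(m_t if not b else b[-1] + m_t)
    return b


def distance_to_function(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Euclidean distance from point x to the hyperplane w.x + b = 0 (assumed form)."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    return abs(dot + b) / math.sqrt(sum(wi * wi for wi in w))


def information_distance(w: Sequence[float], b: float,
                         data: Sequence[Sequence[float]]) -> float:
    """Maximum distance minus minimum distance over all network data."""
    d = [distance_to_function(w, b, x) for x in data]
    return max(d) - min(d)


w = [3.0, 4.0]
b1, b2 = offsets([-5.0, -10.0])  # b1 = -5.0, b2 = -15.0
data = [[1.0, 3.0], [3.0, 4.0], [5.0, 5.0]]

t1 = information_distance(w, b1, data)  # distances 2, 4, 6
t2 = information_distance(w, b2, data)  # distances 0, 2, 4
print(t1, t2)
```

The two information distances `t1` and `t2` are the inputs from which the third calculation module 100 would derive the transformation parameter $A_i$; that step is not sketched here because its formula is not available in this text.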

In one embodiment, the second computing module 90 includes:

In one embodiment, the first computing module 50 includes:

a fluctuation index value calculation sub-module, configured to calculate, according to the formula, the fluctuation index value of each of said dimension value sets; wherein $E_i$ represents the fluctuation index value of the $i$-th dimension value set; when $p_{ij} = 0$, $\lim_{p_{ij} \to 0} p_{ij} \ln p_{ij} = 0$ is defined; $p_{ij}$ represents the intermediate value corresponding to the $j$-th hotspot comment in the $i$-th dimension value set; a standard value is likewise defined for the $j$-th hotspot comment in the $i$-th dimension value set; $n$ represents the number of hotspot comments; $x_{ij}$ represents the dimension value corresponding to the $j$-th hotspot comment in the $i$-th dimension value set; and $\min(x_{ij})$ and $\max(x_{ij})$ represent the minimum and maximum values in the $i$-th dimension value set, respectively.
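The symbols defined above (min-max scaling of $x_{ij}$, an intermediate value $p_{ij}$, the term $p_{ij} \ln p_{ij}$ with its limit at zero, and the count $n$) resemble the conventional entropy method. The sketch below therefore assumes that conventional form: the standard value is $(x_{ij} - \min)/(\max - \min)$, $p_{ij}$ is the standard value divided by the sum of standard values, and $E_i = -\frac{1}{\ln n}\sum_j p_{ij} \ln p_{ij}$. This is an assumption, not the formula of the method itself, which is not reproduced in this text; the function name, sample data, and threshold are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def fluctuation_index(values: Sequence[float]) -> float:
    """Entropy-style index of one dimension value set (assumed conventional form)."""
    n = len(values)
    lo, hi = min(values), max(values)
    if n < 2 or hi == lo:
        # The source does not say how this degenerate case is handled.
        raise ValueError("need at least two values that are not all equal")
    standard = [(x - lo) / (hi - lo) for x in values]  # min-max scaling
    total = sum(standard)
    p = [s / total for s in standard]                  # intermediate values
    # p * ln(p) is taken as 0 when p == 0, matching the stated limit.
    return -sum(pj * math.log(pj) for pj in p if pj > 0) / math.log(n)


# Hypothetical dimension value sets and preset index value.
dimension_sets: Dict[str, List[float]] = {
    "sentiment": [0.0, 1.0, 1.0],
    "intensity": [1.0, 2.0, 3.0],
}
preset_index_value = 0.6

# First selection module 60: keep dimensions whose index exceeds the preset value.
target_dimensions = [
    name for name, vals in dimension_sets.items()
    if fluctuation_index(vals) > preset_index_value
]
print(target_dimensions)
```

Under this assumed form the index always lies between 0 and 1; whether the method uses the entropy itself or a quantity derived from it cannot be determined from the text, so the comparison direction in the selection step simply mirrors the wording of module 60.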

In one embodiment, the public opinion hotspot-based data management system further includes:

Those skilled in the art will appreciate that all or part of the above described methods may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the program, when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium provided herein and used in embodiments may include non-volatile and/or volatile memory. The non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, apparatus, article or method that comprises the element.

The embodiments of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use that knowledge to obtain optimal results.

Artificial intelligence infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.

The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims

1. The data management method based on the public opinion hotspot is characterized by comprising the following steps of:

the step of calculating the fluctuation index value of each dimension value set comprises the following steps:

according to the formula, calculating a fluctuation index value of each of said dimension value sets; wherein $E_i$ represents the fluctuation index value of the $i$-th dimension value set; when $p_{ij} = 0$, $\lim_{p_{ij} \to 0} p_{ij} \ln p_{ij} = 0$ is defined; $p_{ij}$ represents the intermediate value corresponding to the $j$-th hotspot comment in the $i$-th dimension value set; a standard value is likewise defined for the $j$-th hotspot comment in the $i$-th dimension value set; $n$ represents the number of hotspot comments; $x_{ij}$ represents the dimension value corresponding to the $j$-th hotspot comment in the $i$-th dimension value set; and $\min(x_{ij})$ and $\max(x_{ij})$ represent the minimum and maximum values in the $i$-th dimension value set, respectively;

acquiring a plurality of network data from a preset database, and setting two linear functions $f_t(x)$ by using a preset linear classifier for the same target dimension; wherein $b_t = b_{t-1} + m_t$ and $b_1 = m_1$, $m_t$ represents a constant related to the classification standard, $b_t$ represents the offset of the $t$-th linear function, $b_{t-1}$ represents the offset of the $(t-1)$-th linear function, $t$ is a positive integer, $w$ represents the weight vector, whose dimension equals the number of dimension value sets, $f_t(x)$ represents the $t$-th linear function, $x$ represents the network data, and $W$ is a preset parameter;

the Euclidean distance between each linear function and each network data is calculated, the maximum Euclidean distance and the minimum Euclidean distance of each linear function are extracted, and the difference between the maximum Euclidean distance and the minimum Euclidean distance is used as the information distance of the corresponding linear function; the step of calculating the Euclidean distance between each linear function and each network data comprises the following steps:

calculating Euclidean distance between each linear function and corresponding network data based on the target network data;

according to the formula, calculating the transformation parameter $A_i$ of the information distances of the two linear functions of the same target dimension; wherein $t_n$ represents the $n$-th information distance, and $T(t_n)$ represents a preset calculation function based on $t_n$;

determining whether the transformation parameter $A_i$ is within a preset range;

2. The public opinion hotspot-based data management method of claim 1, wherein, after the step of determining whether the transformation parameter $A_i$ is within the preset range, the method further comprises:

3. The public opinion hotspot-based data management method of claim 1, wherein, after the step of selecting a plurality of network data from a network database based on the two linear functions to form the recommended data packet related to the network hotspot if the transformation parameter is within the preset range, the method further comprises:

acquiring time information of each selected network data;

setting a priority order for each network data based on the time information;

and orderly pushing each network data to the user based on the priority order.

4. A public opinion hotspot-based data management system, comprising:

the first computing module includes:

a fluctuation index value calculation sub-module, configured to calculate, according to the formula, the fluctuation index value of each of said dimension value sets; wherein $E_i$ represents the fluctuation index value of the $i$-th dimension value set; when $p_{ij} = 0$, $\lim_{p_{ij} \to 0} p_{ij} \ln p_{ij} = 0$ is defined; $p_{ij}$ represents the intermediate value corresponding to the $j$-th hotspot comment in the $i$-th dimension value set; a standard value is likewise defined for the $j$-th hotspot comment in the $i$-th dimension value set; $n$ represents the number of hotspot comments; $x_{ij}$ represents the dimension value corresponding to the $j$-th hotspot comment in the $i$-th dimension value set; and $\min(x_{ij})$ and $\max(x_{ij})$ represent the minimum and maximum values in the $i$-th dimension value set, respectively;

a fourth obtaining module, configured to obtain a plurality of network data from a preset database, and set two linear functions $f_t(x)$ by using a preset linear classifier for the same target dimension; wherein $b_t = b_{t-1} + m_t$ and $b_1 = m_1$, $m_t$ represents a constant related to the classification standard, $b_t$ represents the offset of the $t$-th linear function, $b_{t-1}$ represents the offset of the $(t-1)$-th linear function, $t$ is a positive integer, $w$ represents the weight vector, whose dimension equals the number of dimension value sets, $f_t(x)$ represents the $t$-th linear function, $x$ represents the network data, and $W$ is a preset parameter;

the second computing module includes:

a calculation sub-module, configured to calculate the Euclidean distance between each linear function and the corresponding network data based on the target network data;

a third calculation module, configured to calculate, according to the formula, the transformation parameter $A_i$ of the information distances of the two linear functions of the same target dimension; wherein $t_n$ represents the $n$-th information distance, and $T(t_n)$ represents a preset calculation function based on $t_n$;

a judging module, configured to judge whether the transformation parameter $A_i$ is within a preset range;

5. The public opinion hotspot-based data management system of claim 4, further comprising:

6. The public opinion hotspot-based data management system of claim 4, further comprising: