CN106033438A - Public sentiment data storage method and server - Google Patents

Publication number
CN106033438A
Authority
CN
China
Prior art keywords
data
topic
public sentiment
stored
pending
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510111930.9A
Other languages
Chinese (zh)
Other versions
CN106033438B (en)
Inventor
荆艳影
张丹
杨建武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New Founder Holdings Development Co ltd
Peking University
Beijing Founder Electronics Co Ltd
Original Assignee
Peking University
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University, Peking University Founder Group Co Ltd, Beijing Founder Electronics Co Ltd filed Critical Peking University
Priority to CN201510111930.9A priority Critical patent/CN106033438B/en
Publication of CN106033438A publication Critical patent/CN106033438A/en
Application granted granted Critical
Publication of CN106033438B publication Critical patent/CN106033438B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention provides a public sentiment data storage method and a server. The method comprises: obtaining to-be-stored public sentiment data; determining a topic identifier, a data identifier, a display field, and a sorting field corresponding to the to-be-stored public sentiment data, and storing them in association in a cache to obtain to-be-processed cache data; when it is determined that no special subject identifier corresponds to the topic identifier of the to-be-processed cache data, storing the topic identifier, the data identifier, and the sorting field of the to-be-processed cache data in association in a recent database; storing the to-be-stored public sentiment data in a historical database; storing the topic identifier, the data identifier, and the creation time of the to-be-processed cache data in a real-time database in a first topic storage format; and storing the topic identifier and the display field of the to-be-processed cache data in the real-time database in a second topic storage format. The method thereby stores different information of the public sentiment data in the recent database, the historical database, and the real-time database in sequence, in a tiered manner.

Description

Public sentiment data storage method and server
Technical field
The invention belongs to the field of information technology, and in particular relates to a public sentiment data storage method and server.
Background
Online public sentiment consists of the opinions and statements, with a discernible leaning, that the public holds about focal and hot-button issues and spreads over the Internet, mainly through forums, blogs, news comments, reposts, and similar forms. Because the Internet is virtual, concealed, diverse, and pervasive, more and more people are willing to use it as a platform to express their personal views on public sentiment events.
Public sentiment data reflects the public's views on various focal events, that is, on public sentiment special subjects. Identifying the different special subjects, and the topics of each special subject in different time periods, helps parties such as government departments and application service providers understand in time what the public is currently paying attention to and what views it holds, so that they can analyze real-time or longer-period public sentiment data and act on the results. However, the precondition for analyzing public sentiment data is that the data is stored reliably and effectively.
With the spread of Internet applications, the massive scale of public sentiment data has become more and more prominent. While people share massive amounts of public sentiment information, they also face problems such as how to store it. In the past, structured data was usually written directly into a database. However, when a sudden burst of massive public sentiment data arrives, writing directly to the database can seriously impair the reliability of data storage. The storage of massive public sentiment data has therefore become a bottleneck in system design.
Summary of the invention
In view of the above problem, the present invention provides a public sentiment data storage method and server to achieve reliable storage of public sentiment data.
The invention provides a public sentiment data storage method, including:
obtaining to-be-stored public sentiment data, assigning a data identifier to the to-be-stored public sentiment data, and determining the topic identifier corresponding to the to-be-stored public sentiment data according to preset topic expressions;
parsing the to-be-stored public sentiment data to obtain its display field and sorting field, and storing the data identifier, the topic identifier, the display field, and the sorting field in association in a cache of a server to obtain to-be-processed cache data; wherein the display field includes the creation time, creator, and data content of the to-be-stored public sentiment data, and the sorting field includes the number of forwards and/or the number of comments of the to-be-stored public sentiment data;
obtaining the to-be-processed cache data from the cache, and determining, according to a preset special-subject-to-topic correspondence, whether a special subject identifier corresponding to the topic identifier of the to-be-processed cache data exists;
if the special subject identifier does not exist, storing the topic identifier, the data identifier, and the sorting field of the to-be-processed cache data in association in a recent database of the server, the recent database being used to store the to-be-processed cache data for a first lifetime;
storing extended to-be-processed cache data in a historical database of the server, the historical database being used to store the extended to-be-processed cache data for a second lifetime, the second lifetime being longer than the first lifetime, and the extended to-be-processed cache data including the to-be-processed cache data and the fields of the to-be-stored public sentiment data other than the display field and the sorting field;
storing the topic identifier, the data identifier, and the creation time of the to-be-processed cache data in a real-time database of the server in a preset first topic storage format; and storing the topic identifier and the display field of the to-be-processed cache data in the real-time database in a preset second topic storage format, the real-time database being used to store the to-be-processed cache data for a third lifetime, the third lifetime being shorter than the first lifetime.
The invention provides a server, including:
an acquisition module, configured to obtain to-be-stored public sentiment data, assign a data identifier to the to-be-stored public sentiment data, and determine the topic identifier corresponding to the to-be-stored public sentiment data according to preset topic expressions;
a cache processing module, configured to parse the to-be-stored public sentiment data to obtain its display field and sorting field, and store the data identifier, the topic identifier, the display field, and the sorting field in association in a cache of the server to obtain to-be-processed cache data; wherein the display field includes the creation time, creator, and data content of the to-be-stored public sentiment data, and the sorting field includes the number of forwards and/or the number of comments of the to-be-stored public sentiment data;
a determining module, configured to obtain the to-be-processed cache data from the cache and determine, according to a preset special-subject-to-topic correspondence, whether a special subject identifier corresponding to the topic identifier of the to-be-processed cache data exists;
a recent storage processing module, configured to, if the determining module determines that the special subject identifier does not exist, store the topic identifier, the data identifier, and the sorting field of the to-be-processed cache data in association in a recent database of the server, the recent database being used to store the to-be-processed cache data for a first lifetime;
a historical storage processing module, configured to store extended to-be-processed cache data in a historical database of the server, the historical database being used to store the extended to-be-processed cache data for a second lifetime, the second lifetime being longer than the first lifetime, and the extended to-be-processed cache data including the to-be-processed cache data and the fields of the to-be-stored public sentiment data other than the display field and the sorting field;
a real-time storage processing module, configured to store the topic identifier, the data identifier, and the creation time of the to-be-processed cache data in a real-time database of the server in a preset first topic storage format, and to store the topic identifier and the display field of the to-be-processed cache data in the real-time database in a preset second topic storage format, the real-time database being used to store the to-be-processed cache data for a third lifetime, the third lifetime being shorter than the first lifetime.
In the public sentiment data storage method and server provided by the invention, the public sentiment data is parsed to obtain the display field needed for presentation to users and the sorting field needed for analyzing the data. After topic detection is performed on the to-be-stored public sentiment data, only the topic identifier, data identifier, display field, and sorting field of the data are first stored in the cache of the server. The topic identifier, data identifier, and sorting field of the cached data are then stored in the recent database; after that, all information of the data is stored in the historical database; and finally the display field, sorting field, and related information of the data are stored in the real-time database. Different information of the public sentiment data is thus stored in the recent database, the historical database, and the real-time database in sequence. Because each database has a different storage lifetime limit, tiered storage of public sentiment data is achieved. Moreover, the massive public sentiment data obtained is cached first and only then written to the recent, historical, and real-time databases. This ensures reliable storage while keeping separate real-time, recent, and historical copies according to different needs, so that the data stored in the different databases can be accessed quickly for analysis and application.
Brief description of the drawings
Fig. 1 is a flowchart of an embodiment of the public sentiment data storage method of the present invention;
Fig. 2 is a schematic structural diagram of an embodiment of the server of the present invention.
Detailed description
Fig. 1 is a flowchart of an embodiment of the public sentiment data storage method of the present invention. The method may be performed by a server used for storing, analyzing, and managing public sentiment data. As shown in Fig. 1, the method specifically includes:
Step 101: obtain to-be-stored public sentiment data, assign a data identifier to the to-be-stored public sentiment data, and determine the topic identifier corresponding to the to-be-stored public sentiment data according to preset topic expressions.
In this embodiment, the to-be-stored public sentiment data is data generated when members of the public use their own terminal devices to comment on, forward, or otherwise interact with public sentiment content on the Internet. The server may obtain the public sentiment data by means such as existing crawling tools. To facilitate storage processing, the server assigns a unique data identifier to each piece of public sentiment data. For example, the data identifier may be obtained by performing word segmentation on the public sentiment data and applying a hash operation of a preset algorithm to the resulting segmented words; no limitation is imposed here.
In this embodiment, multiple topic expressions obtained from experience or statistics are stored in the server in advance, and each topic expression uniquely corresponds to one topic identifier. The server can therefore perform word segmentation on the to-be-stored public sentiment data to obtain its segmented words and match them against the words contained in each stored topic expression. This yields the topic expression corresponding to the to-be-stored public sentiment data, and hence the corresponding topic identifier. The match may be exact, meaning that the data contains all the words of a topic expression, or it may be partial, for example judged by the proportion of overlapping words among all the words of the topic expression.
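Step 101 can be illustrated with a minimal Python sketch. It is not taken from the patent: the topic expressions, the identifier names, the use of SHA-1, and the `threshold` parameter are all assumptions chosen for illustration, and word segmentation is assumed to have been done already.

```python
import hashlib

# Hypothetical topic expressions: topic identifier -> set of words.
TOPIC_EXPRESSIONS = {
    "topic-001": {"flood", "rescue", "city"},
    "topic-002": {"school", "exam", "reform"},
}


def assign_data_id(tokens):
    """Derive a data identifier by hashing the segmented words (one possible scheme)."""
    joined = "\u0001".join(tokens)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


def match_topic(tokens, expressions=TOPIC_EXPRESSIONS, threshold=1.0):
    """Return the best-matching topic identifier, or None if nothing matches.

    threshold=1.0 requires every word of an expression to appear (exact match);
    a lower value accepts a partial match based on the overlap ratio.
    """
    words = set(tokens)
    best_id, best_ratio = None, 0.0
    for topic_id, expr_words in expressions.items():
        ratio = len(words & expr_words) / len(expr_words)
        if ratio >= threshold and ratio > best_ratio:
            best_id, best_ratio = topic_id, ratio
    return best_id
```

With these assumed expressions, a text segmented into `["school", "exam"]` matches no topic under exact matching but matches `topic-002` when `threshold` is lowered to 0.6.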
Step 102: parse the to-be-stored public sentiment data to obtain its display field and sorting field, and store the data identifier, the topic identifier, the display field, and the sorting field in association in the cache of the server to obtain to-be-processed cache data; wherein the display field includes the creation time, creator, and data content of the to-be-stored public sentiment data, and the sorting field includes the number of forwards and/or the number of comments of the to-be-stored public sentiment data.
Step 103: obtain the to-be-processed cache data from the cache, and determine, according to a preset special-subject-to-topic correspondence, whether a special subject identifier corresponding to the topic identifier of the to-be-processed cache data exists. If the special subject identifier exists, perform steps 104 to 107; if it does not exist, perform steps 105 to 107.
In this embodiment, a piece of public sentiment data may contain a great deal of information. Besides the data content, it also includes, for example, the creator, the creation time, the number of comments, the number of forwards, and the publishing method. Such data is usually stored so that real-time or longer-period public sentiment data can be counted and analyzed to identify the focal events the public is currently paying attention to or the trend of its views. Institutions such as government bodies can then provide reasonable guidance and avoid serious social impact, and users such as Internet content providers can use the results for search engines or message recommendation. For these different application scenarios, and so that massive public sentiment data can be stored in a timely, efficient, and reliable manner while also serving different later analysis needs, the server parses the public sentiment data after obtaining it and extracts the display field and the sorting field. The display field mainly includes the creation time, creator, and data content of the to-be-stored public sentiment data, and the sorting field includes its number of forwards and/or number of comments. The display field is mainly used to show users the public's views, that is, the public sentiment data content, on a given topic or special subject in real time or over a period of time; the sorting field is mainly used for hot-spot analysis.
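The split into a display field and a sorting field might look like the following sketch. The raw record is assumed to be a dictionary, and every key name (`created_at`, `creator`, `content`, `forwards`, `comments`) is a hypothetical choice, not something the patent specifies.

```python
def parse_fields(raw):
    """Split a raw record (a dict with hypothetical key names) into display and sorting fields."""
    display = {
        "created_at": raw["created_at"],
        "creator": raw["creator"],
        "content": raw["content"],
    }
    # The sorting field holds the forward count and/or the comment count, whichever is present.
    sorting = {key: raw[key] for key in ("forwards", "comments") if key in raw}
    return display, sorting


def to_cache_record(data_id, topic_id, raw):
    """Associate the identifiers with the parsed fields, as they would be stored in the cache."""
    display, sorting = parse_fields(raw)
    return {"data_id": data_id, "topic_id": topic_id, "display": display, "sorting": sorting}
```

Fields such as the publishing method are deliberately left out of the cache record here; in the described method they only reach the historical database.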
After the display field and the sorting field of the to-be-stored public sentiment data are obtained by parsing, the data identifier, topic identifier, display field, and sorting field are stored in association in the cache of the server to obtain the to-be-processed cache data. A very large amount of public sentiment data may need to be analyzed and stored within the same time period, and the process from obtaining a piece of to-be-stored public sentiment data to finishing its storage is relatively long. To relieve the pressure on the subsequent storage processing, the public sentiment data is therefore first stored in the cache of the server. A further advantage is that a piece of public sentiment data is deleted from the server cache only after its later storage processing succeeds. If that processing fails, no additional operation is needed; the data still present in the cache is simply read and processed again. This greatly simplifies the processing flow and also ensures data integrity.
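The delete-only-after-success behaviour of the cache can be sketched as follows. `PendingCache` and its methods are hypothetical names, and an in-process dictionary stands in for whatever cache the server actually uses.

```python
from collections import OrderedDict


class PendingCache:
    """In-memory stand-in for the server cache of to-be-processed cache data."""

    def __init__(self):
        self._items = OrderedDict()  # data_id -> record, in arrival order

    def put(self, record):
        self._items[record["data_id"]] = record

    def process_all(self, handler):
        """Run handler on each cached record; delete a record only if handler succeeds."""
        for data_id in list(self._items):
            try:
                handler(self._items[data_id])
            except Exception:
                continue  # leave the record in the cache so it can be processed again
            del self._items[data_id]

    def __len__(self):
        return len(self._items)
```

A failed record needs no rollback in this sketch: it simply stays in the cache and is picked up by the next call to `process_all`.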
Afterwards, subsequent storage processing is performed on the to-be-processed cache data in the cache, that is, on the data identifier, topic identifier, display field, and sorting field of the to-be-stored public sentiment data stored in association in the cache of the server.
In the subsequent storage processing, the to-be-processed cache data first undergoes special subject storage processing. Specifically, according to the preset special-subject-to-topic correspondence stored in the server, it is determined whether a special subject identifier corresponding to the topic identifier of the to-be-processed cache data exists. Special subjects, topics, and public sentiment data are related to one another: one topic may include multiple pieces of public sentiment data, and one special subject may correspond to multiple different topics. In this embodiment, the correspondence between special subjects and topics, obtained in advance by statistics, can be used to determine whether a special subject identifier corresponding to the topic identifier of the current to-be-processed cache data exists.
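The lookup in step 103 amounts to searching a preset mapping. The sketch below assumes the correspondence is held as a dictionary from special subject identifier to a set of topic identifiers; the identifiers and function names are illustrative only.

```python
# Hypothetical preset correspondence: special subject identifier -> topic identifiers.
SUBJECT_TOPICS = {
    "subject-A": {"topic-001", "topic-003"},
    "subject-B": {"topic-002"},
}


def find_subject_ids(topic_id, subject_topics=SUBJECT_TOPICS):
    """Return the special subject identifiers whose topic set contains topic_id."""
    return sorted(s for s, topics in subject_topics.items() if topic_id in topics)


def next_steps(topic_id, subject_topics=SUBJECT_TOPICS):
    """Steps 104 to 107 if a special subject identifier exists, otherwise steps 105 to 107."""
    if find_subject_ids(topic_id, subject_topics):
        return [104, 105, 106, 107]
    return [105, 106, 107]
```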
Step 104: store the special subject identifier, the data identifier, and the creation time of the to-be-processed cache data in the real-time database in a preset third special subject storage format; and store the special subject identifier and the display field of the to-be-processed cache data in the real-time database in a preset fourth special subject storage format.
If a special subject identifier corresponding to the topic identifier of the to-be-processed cache data exists, the special subject information of the to-be-processed cache data is written to the real-time database. It should be noted that, in this embodiment, three kinds of databases are provided in the server: a real-time database, a recent database, and a historical database. The real-time database resides in the memory of the server; the recent database is a relational database; and the historical database is a non-relational NoSQL database. The real-time database stores to-be-processed cache data for a certain lifetime. For example, a piece of data is kept for one week, counted from the time it is written to the real-time database, and is deleted automatically when the week is up.
Specifically, when the special subject information of the to-be-processed cache data is written to the real-time database, this embodiment provides two storage formats: the third special subject storage format and the fourth special subject storage format. In the third special subject storage format, the special subject identifier, data identifier, and creation time of the to-be-processed cache data are stored in the real-time database; the format can be represented as (special subject identifier-data identifier, creation time). In the fourth special subject storage format, the special subject identifier and the display field of the to-be-processed cache data are stored in the real-time database; the format can be represented as (special subject identifier, list(display field)). Here list denotes a list: the display fields of the pieces of to-be-processed cache data that belong to one special subject identifier are written into the list one after another. The two formats serve different purposes. The third special subject storage format is used for duplicate detection and elimination: to avoid processing the same to-be-processed cache data repeatedly, data that has already been processed is deleted from the cache. The fourth special subject storage format is used to display the status of a special subject in real time, where "real time" means real time within a certain period. In addition, the special subject information of the data is stored only in the real-time database, so that the data related to a given special subject can be retrieved quickly and shown to users.
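The two special subject storage formats map naturally onto two key-value structures. The following sketch uses plain dictionaries as a stand-in for the in-memory real-time database; the class name and the record layout (a dict with `data_id` and `display` keys) are assumptions.

```python
class RealTimeSubjectStore:
    """Dictionary-based stand-in for the special subject part of the real-time database."""

    def __init__(self):
        # Third format: "<subject_id>-<data_id>" -> creation time.
        self.created = {}
        # Fourth format: subject_id -> list of display fields.
        self.display = {}

    def store_subject(self, subject_id, record):
        """Store one record in both formats; return False if it was already stored."""
        key = f"{subject_id}-{record['data_id']}"
        if key in self.created:
            return False  # the third format doubles as a duplicate check
        self.created[key] = record["display"]["created_at"]
        self.display.setdefault(subject_id, []).append(record["display"])
        return True
```

Reading `store.display["subject-A"]` then returns, in insertion order, the display fields to show for that special subject.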
After the special subject information of the to-be-processed cache data has been written to the real-time database successfully, or after it is determined that no special subject identifier corresponds to its topic identifier, the following steps are performed, that is, topic storage processing of the to-be-processed cache data.
Step 105: store the topic identifier, the data identifier, and the sorting field of the to-be-processed cache data in association in the recent database of the server, the recent database being used to store the to-be-processed cache data for a first lifetime.
In this embodiment, the topic information of the to-be-processed cache data is stored in the following order: first the recent database, then the historical database, then the real-time database.
First, the topic identifier, data identifier, and sorting field of the current to-be-processed cache data are stored in association in the recent database of the server. The storage format can be represented as (topic identifier-data identifier, sorting field). The recent database stores the to-be-processed cache data for the first lifetime, for example one month. The topic information stored in the recent database is mainly used for analysis. The recent database stores only the analysis fields of the to-be-processed cache data, that is, the sorting field, and does not store the details of the data.
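Since the recent database is described as relational, step 105 can be sketched with Python's built-in `sqlite3` module. The table name, column names, the ranking by forwards plus comments, and the 30-day lifetime are all assumptions made for this example.

```python
import sqlite3

THIRTY_DAYS = 30 * 24 * 3600  # example first lifetime, in seconds


def open_recent_db(path=":memory:"):
    """Open the recent database and create the (hypothetical) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS recent_topic_data (
               topic_id  TEXT NOT NULL,
               data_id   TEXT NOT NULL,
               forwards  INTEGER NOT NULL DEFAULT 0,
               comments  INTEGER NOT NULL DEFAULT 0,
               stored_at INTEGER NOT NULL,
               PRIMARY KEY (topic_id, data_id)
           )"""
    )
    return conn


def store_recent(conn, record, now):
    """Store (topic identifier-data identifier, sorting field); no display details."""
    sorting = record["sorting"]
    conn.execute(
        "INSERT OR REPLACE INTO recent_topic_data VALUES (?, ?, ?, ?, ?)",
        (record["topic_id"], record["data_id"],
         sorting.get("forwards", 0), sorting.get("comments", 0), now),
    )
    conn.commit()


def hottest(conn, topic_id, limit=10):
    """Rank the data of one topic by forwards plus comments (one possible hot-spot measure)."""
    rows = conn.execute(
        "SELECT data_id FROM recent_topic_data WHERE topic_id = ? "
        "ORDER BY forwards + comments DESC LIMIT ?",
        (topic_id, limit),
    )
    return [row[0] for row in rows]


def purge_recent(conn, now, lifetime=THIRTY_DAYS):
    """Delete rows whose first lifetime has elapsed."""
    conn.execute("DELETE FROM recent_topic_data WHERE stored_at <= ?", (now - lifetime,))
    conn.commit()
```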
Step 106: store extended to-be-processed cache data in the historical database of the server, the historical database being used to store the extended to-be-processed cache data for a second lifetime, the second lifetime being longer than the first lifetime, and the extended to-be-processed cache data including the to-be-processed cache data and the fields of the to-be-stored public sentiment data other than the display field and the sorting field.
Second, the sorting field and display field of the to-be-stored public sentiment data, together with all or some of its other fields, are stored in the historical database of the server. The historical database stores this data for the second lifetime, which is longer than the first lifetime, for example the whole life cycle of the data.
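As a rough illustration of the extended record, the sketch below merges the cache record with the remaining raw fields and writes the result to a dictionary that stands in for a NoSQL document store. The key names and the `other` sub-document are hypothetical.

```python
DISPLAY_KEYS = {"created_at", "creator", "content"}
SORTING_KEYS = {"forwards", "comments"}


def build_extended_record(cache_record, raw):
    """Combine the cache record with the raw fields that are neither display nor sorting fields."""
    other = {k: v for k, v in raw.items() if k not in DISPLAY_KEYS | SORTING_KEYS}
    return {**cache_record, "other": other}


def store_history(history_db, cache_record, raw):
    """history_db is a plain dict standing in for a document store keyed by data identifier."""
    history_db[cache_record["data_id"]] = build_extended_record(cache_record, raw)
```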
The data in the recent database and the historical database is used only for analysis, and the analysis is centered on topics. Which special subject a topic belongs to is irrelevant to the analysis. When analysis results are shown to users, the topics belonging to a special subject can be obtained directly from the special-subject-to-topic correspondence.
Step 107: store the topic identifier, the data identifier, and the creation time of the to-be-processed cache data in the real-time database of the server in a preset first topic storage format; and store the topic identifier and the display field of the to-be-processed cache data in the real-time database in a preset second topic storage format, the real-time database being used to store the to-be-processed cache data for a third lifetime, the third lifetime being shorter than the first lifetime.
Finally, the topic information of the data is written to the real-time database. Specifically, two storage formats are provided for topic processing: the first topic storage format and the second topic storage format. In the first topic storage format, the topic identifier, data identifier, and creation time of the to-be-processed cache data are stored in the real-time database of the server; the format can be represented as (topic identifier-data identifier, creation time). In the second topic storage format, the topic identifier and the display field of the to-be-processed cache data are stored in the real-time database; the format can be represented as (topic identifier, list(display field)). Here list denotes a list: the display fields of the pieces of to-be-processed cache data that belong to one topic identifier are written into the list one after another. The real-time database stores the to-be-processed cache data for the third lifetime, which is shorter than the first lifetime, for example one week.
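The third lifetime can be sketched as an explicit expiry pass over the two topic storage formats. This is only one way to realize automatic deletion; the class name, the `now` parameter (seconds, supplied by the caller to keep the example deterministic), and the one-week default are assumptions.

```python
ONE_WEEK = 7 * 24 * 3600  # example third lifetime, in seconds


class ExpiringTopicStore:
    """Stand-in for the topic part of the real-time database, with a storage lifetime."""

    def __init__(self, lifetime=ONE_WEEK):
        self.lifetime = lifetime
        self.created = {}  # first format: "<topic_id>-<data_id>" -> (creation time, stored_at)
        self.display = {}  # second format: topic_id -> list of (display field, stored_at)

    def store(self, topic_id, data_id, display, now):
        self.created[f"{topic_id}-{data_id}"] = (display["created_at"], now)
        self.display.setdefault(topic_id, []).append((display, now))

    def purge(self, now):
        """Delete every entry stored at least `lifetime` seconds before `now`."""
        def alive(stored_at):
            return now - stored_at < self.lifetime

        self.created = {k: v for k, v in self.created.items() if alive(v[1])}
        kept_display = {}
        for topic_id, items in self.display.items():
            kept = [item for item in items if alive(item[1])]
            if kept:
                kept_display[topic_id] = kept
        self.display = kept_display

    def latest(self, topic_id):
        """Return the display fields currently stored for a topic."""
        return [display for display, _ in self.display.get(topic_id, [])]
```

A production system would more likely rely on the expiry feature of an in-memory store than on a manual `purge`, but the patent does not name a specific product.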
The two topic storage formats serve different purposes. The first topic storage format is used for duplicate detection and elimination: to avoid processing the topic information of the same to-be-processed cache data repeatedly, data that has already been processed is deleted from the cache. The second topic storage format is used to display the status of a topic in real time, where "real time" means real time within a certain period.
In this embodiment, the public sentiment data is parsed to obtain the display field needed for presentation to users and the sorting field needed for analyzing the data. After topic detection is performed on the to-be-stored public sentiment data, only the topic identifier, data identifier, display field, and sorting field of the data are first stored in the cache of the server. The topic identifier, data identifier, and sorting field of the cached data are then stored in the recent database; after that, all information of the data is stored in the historical database; and finally the display field, sorting field, and related information of the data are stored in the real-time database. Different information of the public sentiment data is thus stored in the recent database, the historical database, and the real-time database in sequence. Because each database has a different storage lifetime limit, tiered storage of public sentiment data is achieved. Moreover, the massive public sentiment data obtained is cached first and only then written to the recent, historical, and real-time databases. This ensures reliable storage while keeping separate real-time, recent, and historical copies according to different needs, so that the data stored in the different databases can be accessed quickly for analysis and application.
Optionally, after the to-be-processed cache data is obtained from the cache in step 103, the method further includes the following processing step:
Determine whether the real-time database contains an entry corresponding to the data identifier and the topic identifier of the to-be-processed cache data; if it does, delete the to-be-processed cache data. This is where the topic information stored in the first topic storage format is used: if a given topic identifier and data identifier already exist in the real-time database, the data has already been processed and does not need to be processed again.
In addition, after the topic information has been written to the real-time database in step 107, the corresponding to-be-processed cache data is deleted from the cache, and processing moves on to the next piece of cache data.
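Putting the steps together, the overall flow for one cached record might be sketched as below. Every store is a plain dictionary, `subject_of` simplifies the correspondence to at most one special subject per topic, and the historical record omits the extra raw fields; all of these are simplifying assumptions rather than the patented implementation.

```python
def process_record(record, cache, subject_of, recent, history, realtime):
    """Run the storage flow for one cached record (illustrative only)."""
    data_id, topic_id = record["data_id"], record["topic_id"]
    created_at = record["display"]["created_at"]

    # Optional duplicate check against the first topic storage format.
    if f"{topic_id}-{data_id}" in realtime["topic_created"]:
        cache.pop(data_id, None)
        return "duplicate"

    # Steps 103 and 104: special subject information goes to the real-time database only.
    subject_id = subject_of.get(topic_id)
    if subject_id is not None:
        realtime["subject_created"][f"{subject_id}-{data_id}"] = created_at
        realtime["subject_display"].setdefault(subject_id, []).append(record["display"])

    # Step 105: the recent database keeps only the sorting field.
    recent[f"{topic_id}-{data_id}"] = record["sorting"]
    # Step 106: the historical database keeps the (here unextended) full record.
    history[data_id] = dict(record)
    # Step 107: topic information goes to the real-time database in both topic formats.
    realtime["topic_created"][f"{topic_id}-{data_id}"] = created_at
    realtime["topic_display"].setdefault(topic_id, []).append(record["display"])

    # Only after all stores succeed is the record removed from the cache.
    cache.pop(data_id, None)
    return "stored"
```

If any store raises an exception, the function exits before the final `cache.pop`, so the record remains cached for a later retry.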
Fig. 2 is a schematic structural diagram of an embodiment of the server of the present invention. As shown in Fig. 2, the server includes:
Acquisition module 11, configured to acquire public sentiment data to be stored, assign a data identifier to the public sentiment data to be stored, and determine, according to preset topic expressions, the topic identifier corresponding to the public sentiment data to be stored;
Cache processing module 12, configured to parse the public sentiment data to be stored to obtain its display fields and sort fields, and to store the data identifier, the topic identifier, the display fields, and the sort fields in association in the cache of the server to obtain pending cached data; wherein the display fields include the creation time, the creator, and the data content of the public sentiment data to be stored, and the sort fields include the number of forwards and/or the number of comments of the public sentiment data to be stored;
Determining module 13, configured to obtain the pending cached data from the cache and to determine, according to a preset special-topic-to-topic correspondence, whether a special-topic identifier corresponding to the topic identifier of the pending cached data exists;
Recent storage processing module 14, configured to, if the determining module 13 determines that the special-topic identifier does not exist, store the topic identifier, the data identifier, and the sort fields of the pending cached data in association in the recent database of the server, the recent database being used to store the pending cached data until a first lifetime elapses;
Historical storage processing module 15, configured to store extended pending cached data in the historical database of the server, the historical database being used to store the extended pending cached data until a second lifetime elapses, the second lifetime being longer than the first lifetime, and the extended pending cached data including the pending cached data and the fields of the public sentiment data to be stored other than the display fields and the sort fields;
Real-time storage processing module 16, configured to store, in a preset first topic storage format, the topic identifier, the data identifier, and the creation time of the pending cached data in the real-time database of the server, and to store, in a preset second topic storage format, the topic identifier and the display fields of the pending cached data in the real-time database, the real-time database being used to store the pending cached data until a third lifetime elapses, the third lifetime being shorter than the first lifetime.
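The two topic storage formats used by the real-time storage processing module can be pictured with the sketch below. The patent text does not specify the concrete layout of either format, so the two dictionary layouts, the function names, and the item keys are assumptions made purely for illustration.

```python
from collections import defaultdict

# In-memory stand-in for the real-time database.
# First topic storage format (assumed layout): topic_id -> {data_id: created_at}
topic_index = defaultdict(dict)
# Second topic storage format (assumed layout): topic_id -> list of display-field dicts
topic_display = defaultdict(list)


def store_realtime(item):
    """Write one pending cached item to the real-time store in both formats."""
    topic_id, data_id = item["topic_id"], item["data_id"]
    display = item["display"]  # creation time, creator, data content
    topic_index[topic_id][data_id] = display["created_at"]
    topic_display[topic_id].append(display)


def already_stored(topic_id, data_id):
    """Look up the first-format index for an existing (topic, data) entry."""
    return data_id in topic_index.get(topic_id, {})
```

Under this assumed layout the first format serves as a compact index of identifiers and creation times, while the second holds what is needed to render a topic's items for display.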
Optionally, the determining module 13 is further configured to:
Determine whether the real-time database contains an entry corresponding to the data identifier and the topic identifier of the pending cached data;
The server further includes:
Deletion module 17, configured to delete the pending cached data if the determining module 13 determines that the entry exists.
Further, the real-time storage processing module 16 is also configured to:
If the determining module determines that the special-topic identifier exists, store, in a preset third special-topic storage format, the special-topic identifier, the data identifier, and the creation time of the pending cached data in the real-time database, and store, in a preset fourth special-topic storage format, the special-topic identifier and the display fields of the pending cached data in the real-time database.
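The choice between the topic formats and the special-topic formats can be sketched as a key-selection step. The mapping `SPECIAL_TOPIC_OF_TOPIC`, the function `realtime_keys`, and the key naming scheme are hypothetical; the patent only states that a preset correspondence decides whether a special-topic identifier exists for a topic identifier.

```python
# Hypothetical preset correspondence: topic identifier -> special-topic identifier.
SPECIAL_TOPIC_OF_TOPIC = {"t-flood": "s-disaster"}


def realtime_keys(topic_id, data_id):
    """Return the (index_key, display_key) under which an item is written.

    Items whose topic belongs to a special topic are keyed by the special-topic
    identifier (third and fourth formats); all others are keyed by the topic
    identifier (first and second formats). Key naming is illustrative only.
    """
    special_id = SPECIAL_TOPIC_OF_TOPIC.get(topic_id)
    if special_id is not None:
        return (f"special:{special_id}:index:{data_id}",
                f"special:{special_id}:display")
    return (f"topic:{topic_id}:index:{data_id}",
            f"topic:{topic_id}:display")
```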
Further, the deletion module 17 is also configured to:
Delete the pending cached data from the cache.
The real-time database is located in the memory of the server; the recent database is a relational database; and the historical database is a non-relational (NoSQL) database.
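As an illustration of the relational recent database, the sketch below uses Python's built-in `sqlite3` module. The table name, column names, and the 30-day value for the first lifetime are assumptions; the patent does not name a database product or a schema.

```python
import sqlite3

FIRST_LIFETIME_SECONDS = 30 * 86400  # hypothetical first lifetime

conn = sqlite3.connect(":memory:")  # a file-backed database would be used in practice
conn.execute(
    """CREATE TABLE recent (
           topic_id   TEXT NOT NULL,
           data_id    TEXT NOT NULL,
           forwards   INTEGER,
           comments   INTEGER,
           stored_at  REAL NOT NULL,
           PRIMARY KEY (topic_id, data_id)
       )"""
)


def put_recent(topic_id, data_id, forwards, comments, now):
    """Store the identifiers and sort fields of one item."""
    conn.execute(
        "INSERT OR REPLACE INTO recent VALUES (?, ?, ?, ?, ?)",
        (topic_id, data_id, forwards, comments, now),
    )


def purge_recent(now):
    """Delete rows older than the first lifetime; returns the number removed."""
    cur = conn.execute(
        "DELETE FROM recent WHERE stored_at <= ?",
        (now - FIRST_LIFETIME_SECONDS,),
    )
    return cur.rowcount


def top_by_forwards(topic_id, limit=10):
    """Return data identifiers of a topic ordered by number of forwards."""
    rows = conn.execute(
        "SELECT data_id FROM recent WHERE topic_id = ? "
        "ORDER BY forwards DESC LIMIT ?",
        (topic_id, limit),
    )
    return [r[0] for r in rows]
```

A relational store suits this tier because the sort fields are queried with ordering and filtering, which SQL expresses directly.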
The apparatus of this embodiment may be used to carry out the technical solution of the method embodiment shown in Fig. 1. Its implementation principle and technical effect are similar and are not repeated here.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium capable of storing program code, such as ROM, RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in those embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A public sentiment data storage method, characterized by comprising:
acquiring public sentiment data to be stored, assigning a data identifier to the public sentiment data to be stored, and determining, according to preset topic expressions, a topic identifier corresponding to the public sentiment data to be stored;
parsing the public sentiment data to be stored to obtain its display fields and sort fields, and storing the data identifier, the topic identifier, the display fields, and the sort fields in association in a cache of a server to obtain pending cached data; wherein the display fields include the creation time, the creator, and the data content of the public sentiment data to be stored, and the sort fields include the number of forwards and/or the number of comments of the public sentiment data to be stored;
obtaining the pending cached data from the cache, and determining, according to a preset special-topic-to-topic correspondence, whether a special-topic identifier corresponding to the topic identifier of the pending cached data exists;
if the special-topic identifier does not exist, storing the topic identifier, the data identifier, and the sort fields of the pending cached data in association in a recent database of the server, the recent database being used to store the pending cached data until a first lifetime elapses;
storing extended pending cached data in a historical database of the server, the historical database being used to store the extended pending cached data until a second lifetime elapses, the second lifetime being longer than the first lifetime, and the extended pending cached data including the pending cached data and the fields of the public sentiment data to be stored other than the display fields and the sort fields;
storing, in a preset first topic storage format, the topic identifier, the data identifier, and the creation time of the pending cached data in a real-time database of the server; and storing, in a preset second topic storage format, the topic identifier and the display fields of the pending cached data in the real-time database, the real-time database being used to store the pending cached data until a third lifetime elapses, the third lifetime being shorter than the first lifetime.
2. The method according to claim 1, characterized in that, after the obtaining of the pending cached data from the cache, the method further comprises:
determining whether the real-time database contains an entry corresponding to the data identifier and the topic identifier of the pending cached data;
if the entry exists, deleting the pending cached data.
3. The method according to claim 1, characterized in that, after the determining of whether a special-topic identifier corresponding to the topic identifier of the pending cached data exists, the method further comprises:
if the special-topic identifier exists, storing, in a preset third special-topic storage format, the special-topic identifier, the data identifier, and the creation time of the pending cached data in the real-time database; and storing, in a preset fourth special-topic storage format, the special-topic identifier and the display fields of the pending cached data in the real-time database.
4. The method according to claim 1, characterized in that, after the storing, in the preset first topic storage format, of the topic identifier, the data identifier, and the creation time of the pending cached data in the real-time database of the server, and the storing, in the preset second topic storage format, of the topic identifier and the display fields of the pending cached data in the real-time database, the method further comprises:
deleting the pending cached data from the cache.
5. The method according to any one of claims 1 to 4, characterized in that the real-time database is located in the memory of the server; the recent database is a relational database; and the historical database is a non-relational (NoSQL) database.
6. A server, characterized by comprising:
an acquisition module, configured to acquire public sentiment data to be stored, assign a data identifier to the public sentiment data to be stored, and determine, according to preset topic expressions, a topic identifier corresponding to the public sentiment data to be stored;
a cache processing module, configured to parse the public sentiment data to be stored to obtain its display fields and sort fields, and to store the data identifier, the topic identifier, the display fields, and the sort fields in association in a cache of the server to obtain pending cached data; wherein the display fields include the creation time, the creator, and the data content of the public sentiment data to be stored, and the sort fields include the number of forwards and/or the number of comments of the public sentiment data to be stored;
a determining module, configured to obtain the pending cached data from the cache and to determine, according to a preset special-topic-to-topic correspondence, whether a special-topic identifier corresponding to the topic identifier of the pending cached data exists;
a recent storage processing module, configured to, if the determining module determines that the special-topic identifier does not exist, store the topic identifier, the data identifier, and the sort fields of the pending cached data in association in a recent database of the server, the recent database being used to store the pending cached data until a first lifetime elapses;
a historical storage processing module, configured to store extended pending cached data in a historical database of the server, the historical database being used to store the extended pending cached data until a second lifetime elapses, the second lifetime being longer than the first lifetime, and the extended pending cached data including the pending cached data and the fields of the public sentiment data to be stored other than the display fields and the sort fields;
a real-time storage processing module, configured to store, in a preset first topic storage format, the topic identifier, the data identifier, and the creation time of the pending cached data in a real-time database of the server, and to store, in a preset second topic storage format, the topic identifier and the display fields of the pending cached data in the real-time database, the real-time database being used to store the pending cached data until a third lifetime elapses, the third lifetime being shorter than the first lifetime.
7. The server according to claim 6, characterized in that the determining module is further configured to:
determine whether the real-time database contains an entry corresponding to the data identifier and the topic identifier of the pending cached data;
the server further comprising:
a deletion module, configured to delete the pending cached data if the determining module determines that the entry exists.
8. The server according to claim 6, characterized in that the real-time storage processing module is further configured to:
if the determining module determines that the special-topic identifier exists, store, in a preset third special-topic storage format, the special-topic identifier, the data identifier, and the creation time of the pending cached data in the real-time database; and store, in a preset fourth special-topic storage format, the special-topic identifier and the display fields of the pending cached data in the real-time database.
9. The server according to claim 6, characterized in that the deletion module is further configured to:
delete the pending cached data from the cache.
10. The server according to any one of claims 6 to 9, characterized in that the real-time database is located in the memory of the server; the recent database is a relational database; and the historical database is a non-relational (NoSQL) database.
CN201510111930.9A 2015-03-13 2015-03-13 Public sentiment data storage method and server Expired - Fee Related CN106033438B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510111930.9A CN106033438B (en) 2015-03-13 2015-03-13 Public sentiment data storage method and server

Publications (2)

Publication Number Publication Date
CN106033438A true CN106033438A (en) 2016-10-19
CN106033438B CN106033438B (en) 2019-06-04

Family

ID=57150686

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510111930.9A Expired - Fee Related CN106033438B (en) 2015-03-13 2015-03-13 Public sentiment data storage method and server

Country Status (1)

Country Link
CN (1) CN106033438B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100205176A1 (en) * 2009-02-12 2010-08-12 Microsoft Corporation Discovering City Landmarks from Online Journals
CN102110102A (en) * 2009-12-29 2011-06-29 北大方正集团有限公司 Data processing method and device, and file identifying method and tool
CN103092950A (en) * 2013-01-15 2013-05-08 重庆邮电大学 Online public opinion geographical location real time monitoring system and method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
代平等: "一种嵌入式雨水情数据存储解决方案", 《水电能源科学》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108205543A (en) * 2016-12-16 2018-06-26 北京酷我科技有限公司 A kind of song information storage method and system
CN110019556A (en) * 2017-12-27 2019-07-16 阿里巴巴集团控股有限公司 A kind of topic news acquisition methods, device and its equipment
CN110019556B (en) * 2017-12-27 2023-08-15 阿里巴巴集团控股有限公司 Topic news acquisition method, device and equipment thereof
CN110110250A (en) * 2018-01-18 2019-08-09 北京京东尚科信息技术有限公司 Information output method and device
CN110110250B (en) * 2018-01-18 2024-09-24 北京京东尚科信息技术有限公司 Information output method and device
CN110515895A (en) * 2019-08-30 2019-11-29 弭迺彬 The method and system of storage are associated in big data storage system to data file
CN110515895B (en) * 2019-08-30 2023-06-23 北京燕山电子设备厂 Method and system for carrying out associated storage on data files in big data storage system
CN111897819A (en) * 2020-07-31 2020-11-06 平安普惠企业管理有限公司 Data storage method and device, electronic equipment and storage medium
CN112860750A (en) * 2021-03-11 2021-05-28 广州市网星信息技术有限公司 Data processing method and device, electronic equipment and storage medium
CN112860750B (en) * 2021-03-11 2023-11-17 广州市网星信息技术有限公司 Data processing method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN106033438B (en) 2019-06-04


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220624

Address after: 3007, Hengqin international financial center building, No. 58, Huajin street, Hengqin new area, Zhuhai, Guangdong 519031

Patentee after: New founder holdings development Co.,Ltd.

Patentee after: Peking University

Patentee after: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

Address before: 100871, Beijing, Haidian District, Cheng Fu Road, No. 298, Zhongguancun Fangzheng building, 9 floor

Patentee before: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.

Patentee before: Peking University

Patentee before: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190604