CN111079010B - Data processing method, device and system - Google Patents

Data processing method, device and system

Publication number
CN111079010B
Authority
CN
China
Prior art keywords
keyword
preset
target
data
keywords
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911274881.5A
Other languages
Chinese (zh)
Other versions
CN111079010A (en
Inventor
冯泽亮
祝捷
王雯雯
李薇
赵坤
杨龙
张晓丽
王滋怡
秦刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Sichuan Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Sichuan Electric Power Co Ltd
Application filed by State Grid Corp of China SGCC and State Grid Sichuan Electric Power Co Ltd
Priority to CN201911274881.5A
Publication of CN111079010A
Application granted
Publication of CN111079010B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a data processing method, device and system. The method comprises the following steps: sending text topic data to a plurality of clients, and receiving the text reply data corresponding to the text topic data that is fed back by each client; determining a plurality of target keywords corresponding to the text reply data and a one-dimensional weight value corresponding to each target keyword; acquiring a plurality of preset reference text data, and determining the reference keywords corresponding to each preset reference text data and a one-dimensional weight value corresponding to each reference keyword; and determining, according to the target keywords and their one-dimensional weight values and the reference keywords and their one-dimensional weight values, the target preset reference text data having the highest degree of correlation with the text reply data.

Description

Data processing method, device and system
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data processing method, apparatus, and system.
Background
In the related art, content recommendation for a user suffers from a trade-off between recommendation accuracy and calculation speed. This drawback is more pronounced in specialized technical and application fields, because such fields have distinctive characteristics and the computing resources that can be allocated to them are relatively scarce.
Disclosure of Invention
In view of the foregoing problems, the present invention provides a data processing method, apparatus and corresponding system.
According to a first aspect of the embodiments of the present invention, there is provided a data processing method for a server, including:
sending text topic data to a plurality of clients, and receiving text reply data corresponding to the text topic data fed back by each client;
determining a plurality of target keywords corresponding to the text reply data and a one-dimensional weight value corresponding to each target keyword;
acquiring a plurality of preset reference text data, and determining a reference keyword corresponding to each preset reference text data and a one-dimensional weight value corresponding to each reference keyword;
and determining target preset reference text data with the highest correlation degree with the text reply data according to the target keywords and the one-dimensional weight values corresponding to the target keywords as well as the reference keywords and the one-dimensional weight values corresponding to the reference keywords.
In one embodiment, preferably, determining a plurality of target keywords corresponding to the text reply data and a one-dimensional weight value corresponding to each target keyword includes:
performing word segmentation processing on each text reply data to obtain a plurality of target keywords;
and determining a one-dimensional weight value corresponding to each target keyword according to the occurrence frequency of each target keyword in the text reply data.
In one embodiment, preferably, after obtaining the plurality of target keywords, the method further includes:
acquiring a keyword storage word bank, wherein a plurality of preset keywords and coupling degrees among different preset keywords are stored in the keyword storage word bank;
determining whether a first target keyword and a second target keyword which can be combined exist in the plurality of target keywords according to the coupling degree between the different preset keywords;
when a first target keyword and a second target keyword which can be combined exist, combining the first target keyword and the second target keyword.
In one embodiment, preferably, determining whether there are a first target keyword and a second target keyword that can be merged in the plurality of target keywords according to the coupling degree between the different preset keywords includes:
acquiring a target preset keyword pair with the coupling degree within a preset range from the keyword storage word library;
judging whether the target preset keyword pair exists in the target keywords or not;
and when the target preset keyword pairs exist in the target keywords, determining that a first keyword and a second keyword which can be combined exist in the target keywords.
In one embodiment, preferably, the method further comprises:
displaying preset keywords in the keyword storage word library in a preset display mode;
receiving a merging processing operation which is input by a user and is used for merging a first preset keyword and a second preset keyword, merging and displaying the first preset keyword and the second preset keyword according to the merging processing operation, and adding 1 to the coupling degree of the first preset keyword and the second preset keyword in the keyword storage word library; or
receiving a separation processing operation which is input by a user and separates a first preset keyword and a second preset keyword which are combined and displayed, separating and displaying the first preset keyword and the second preset keyword according to the separation processing operation, and reducing the coupling degree of the first preset keyword and the second preset keyword in the keyword storage word library by 1.
In one embodiment, preferably, the method further comprises:
when the coupling degree of the first preset keyword and the second preset keyword in the keyword storage word library is greater than a first preset threshold value, the first preset keyword and the second preset keyword are combined and displayed;
and when the coupling degree of the first preset keyword and the second preset keyword in the keyword storage word library is smaller than a second preset threshold value, the first preset keyword and the second preset keyword are displayed in a separated mode.
In one embodiment, preferably, determining target preset reference text data with the highest correlation degree with the text reply data according to the target keyword and the one-dimensional weight value corresponding to the target keyword, and the reference keyword and the one-dimensional weight value corresponding to the reference keyword comprises:
calculating the correlation degree between the text reply data and the preset reference text data according to the target keyword and the one-dimensional weight value corresponding to the target keyword as well as the reference keyword and the one-dimensional weight value corresponding to the reference keyword;
and determining the preset reference text data with the highest correlation degree as the target preset reference text data.
In one embodiment, preferably, the acquiring a plurality of preset reference text data includes:
storing preset multimedia data with different formats;
and converting the preset multimedia data with different formats into text data, and taking the text data as the preset reference text data.
According to a second aspect of an embodiment of the present invention, there is provided a data processing apparatus for a server, including:
a memory and a processor;
the memory is used for storing data used when the processor executes a computer program;
the processor is configured to execute a computer program to implement the method as described in the first aspect or any embodiment of the first aspect.
According to a third aspect of embodiments of the present invention, there is provided a data processing system including:
a server;
a plurality of clients coupled with the server;
the server sends text topic data to a plurality of clients, receives text reply data corresponding to the text topic data fed back by each client, and determines a plurality of target keywords corresponding to the text reply data and a one-dimensional weight value corresponding to each target keyword; acquiring a plurality of preset reference text data, and determining a reference keyword corresponding to each preset reference text data and a one-dimensional weight value corresponding to each reference keyword; and determining target preset reference text data with the highest correlation degree with the text reply data according to the target keywords and the one-dimensional weight values corresponding to the target keywords as well as the reference keywords and the one-dimensional weight values corresponding to the reference keywords.
In the embodiments of the invention, the server sends the text topic data to a plurality of clients, and each client returns the corresponding text reply data to the server after processing. The server determines the keywords of the text reply data and their weights, and compares them with the keywords and weights of the preset reference text data to determine the target preset reference text data with the highest relevance to the text reply data. In this way, the reference text data most relevant to the users' feedback is selected from a plurality of candidate reference text data according to the feedback of most users, which improves the accuracy of content recommendation while consuming few computing resources.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on them without creative effort.
Fig. 1 shows a flow diagram of a data processing method according to one embodiment of the invention.
Fig. 2A shows a flow diagram of a data processing method according to another embodiment of the invention.
Fig. 2B shows a flow diagram of a data processing method according to yet another embodiment of the invention.
Fig. 3 shows a flowchart of step S202 in a data processing method according to another embodiment of the present invention.
Fig. 4 shows a flow diagram of a data processing method according to a further embodiment of the invention.
Fig. 5 shows a flow diagram of a data processing method according to yet another embodiment of the invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention.
Some of the flows described in this specification, the claims and the above figures include a number of operations that appear in a particular order. It should be clearly understood that these operations may be performed out of the order in which they appear herein, or in parallel. Operation numbers such as 101 and 102 merely distinguish the operations from one another and do not by themselves represent any order of execution. Additionally, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that descriptions such as "first" and "second" in this document are used to distinguish different messages, devices, modules, etc.; they do not represent a sequential order, nor do they require that the "first" and "second" items be of different types.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 shows a flow diagram of a data processing method according to an embodiment of the invention.
As shown in fig. 1, a data processing method according to an embodiment of the present invention is for a server, the data processing method including steps S101-S104:
Step S101, sending text topic data to a plurality of clients, and receiving text reply data corresponding to the text topic data fed back by each client. Those skilled in the art will understand that the text topic data can be any text-type data that can be processed or analyzed by a text processing program, such as data in txt, bat, csv, xml and similar formats. Those skilled in the art will also understand that text-type data is widely used in various Internet scenarios, including but not limited to social networks, topic forums, comment areas of app stores, and electronic questionnaires. Any such specific type of text data and any such specific application scenario falls within the scope of the present invention. Unless otherwise specified, "text", "text data" and "text topic data" in the present invention have the meanings explained above. After the server sends the data to the plurality of clients, each client can present the data to its user; the user's feedback yields the text reply data, which the client returns to the server.
Step S102, a plurality of target keywords corresponding to the text reply data and a one-dimensional weight value corresponding to each target keyword are determined.
In one embodiment, preferably, determining a plurality of target keywords corresponding to the text reply data and a one-dimensional weight value corresponding to each target keyword includes:
performing word segmentation processing on each text reply data to obtain a plurality of target keywords;
and determining a one-dimensional weight value corresponding to each target keyword according to the occurrence frequency of each target keyword in the text reply data.
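The two sub-steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `segment` helper is a hypothetical stand-in for a real word segmenter (it only splits on non-alphanumeric characters), and normalizing each count by the total number of tokens is an assumption, since the text only says the weight is determined according to occurrence frequency.

```python
import re
from collections import Counter
from typing import Dict, List


def segment(text: str) -> List[str]:
    """Hypothetical word segmentation: lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_weights(replies: List[str]) -> Dict[str, float]:
    """Map each target keyword to its relative frequency across all replies."""
    counts = Counter(token for reply in replies for token in segment(reply))
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumption: the one-dimensional weight is the normalized frequency.
    return {word: n / total for word, n in counts.items()}


replies = ["power outage again", "outage lasted long", "power restored"]
weights = keyword_weights(replies)
```

A production system would likely replace `segment` with a language-appropriate segmenter and filter out stop words before counting.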
Step S103, a plurality of preset reference text data are obtained, and a reference keyword corresponding to each preset reference text data and a one-dimensional weight value corresponding to each reference keyword are determined.
In one embodiment, preferably, the obtaining a plurality of preset reference text data includes:
storing preset multimedia data with different formats, where the multimedia data may be text data, video data, audio data, etc.;
and converting the preset multimedia data with different formats into text data, and taking the text data as the preset reference text data. The conversion of multimedia data to text data can be accomplished by any means known in the art. For example, audio data can be converted into text data by a speech-to-text converter from a company such as iFLYTEK, and the audio track or subtitles can be extracted from video data and converted into text data.
Step S104, determining the target preset reference text data with the highest relevance to the text reply data according to the target keywords and the one-dimensional weight values corresponding to the target keywords, and the reference keywords and the one-dimensional weight values corresponding to the reference keywords.
In one embodiment, preferably, determining target preset reference text data with the highest relevance to the text reply data according to the target keyword and the one-dimensional weight value corresponding to the target keyword, and the reference keyword and the one-dimensional weight value corresponding to the reference keyword, includes:
calculating the correlation between the text reply data and preset reference text data according to the target keyword and the one-dimensional weight value corresponding to the target keyword as well as the reference keyword and the one-dimensional weight value corresponding to the reference keyword; and determining the preset reference text data with the highest correlation degree as target preset reference text data.
The correlation between the text reply data and the preset reference text data can be calculated, for example, from the cosine distance between the weight vector of the target keywords and the weight vector of the reference keywords; other correlation calculation methods known in the related art can of course also be used.
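As a hedged illustration of this step, the sketch below treats each set of keywords and weights as a sparse vector and scores candidates by cosine similarity (one minus the cosine distance, so the highest similarity corresponds to the highest correlation). The function names, the dictionary representation and the sample data are assumptions made for illustration only.

```python
import math
from typing import Dict


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse keyword-weight vectors."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def most_relevant(target: Dict[str, float],
                  references: Dict[str, Dict[str, float]]) -> str:
    """Return the name of the reference text most similar to the target."""
    return max(references,
               key=lambda name: cosine_similarity(target, references[name]))


# Hypothetical weights for the reply data and two candidate reference texts.
target = {"outage": 0.5, "power": 0.3, "billing": 0.2}
references = {
    "outage_guide": {"outage": 0.6, "power": 0.4},
    "billing_faq": {"billing": 0.7, "invoice": 0.3},
}
best = most_relevant(target, references)
```

Only keywords shared by both vectors contribute to the dot product, so the cost of each comparison grows with the number of keywords rather than with the length of the underlying texts.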
In the embodiment, the server sends the text topic data to the plurality of clients, the clients return the corresponding text reply data to the server after processing, the server determines keywords and corresponding weights of the text reply data, and then compares the keywords and the weights with those of the preset reference text data to determine the target preset reference text data with the highest relevance to the text reply data.
Fig. 2A shows a flow diagram of a data processing method according to another embodiment of the invention.
As shown in fig. 2A, in one embodiment, preferably, after obtaining the plurality of target keywords, the method further includes steps S201-S203:
Step S201, a keyword storage lexicon is obtained, wherein the keyword storage lexicon stores a plurality of preset keywords and coupling degrees between different preset keywords.
Step S202, determining whether a first target keyword and a second target keyword which can be combined exist in the plurality of target keywords according to the coupling degree between different preset keywords.
Step S203, when the first target keyword and the second target keyword which can be combined exist, combining the first target keyword and the second target keyword.
In this embodiment, a keyword storage lexicon may be set, and some replaceable keywords, such as words with the same meaning, similar meaning, or opposite meaning, may be stored in the lexicon, so that when the target keyword and the weight are calculated, the words with the same meaning, similar meaning, or opposite meaning may be used as one keyword for calculation, thereby improving accuracy and efficiency.
Fig. 3 shows a flowchart of step S202 in a data processing method according to another embodiment of the present invention.
As shown in fig. 3, in one embodiment, the step S202 preferably includes steps S301 to S303:
Step S301, acquiring, from the keyword storage lexicon, a target preset keyword pair whose coupling degree is within a preset range; a preset keyword pair consists of two preset keywords.
Step S302, judging whether the target preset keyword pair exists in the plurality of target keywords;
Step S303, when the target preset keyword pair exists in the target keywords, determining that a first keyword and a second keyword which can be combined exist in the target keywords.
For example, suppose the target preset keyword pair consists of two synonymous keywords that both mean "clean". If both keywords appear among the target keywords, they can be merged into one target keyword, and their weights are then counted together. According to the present invention, steps S301-S303 may be executed in a loop by a computer program until no mergeable first and second keywords remain.
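The loop over steps S301-S303 can be sketched as follows. This is an illustrative sketch under stated assumptions: the lexicon is modeled as a dictionary from unordered keyword pairs to coupling degrees, weights are raw occurrence counts, the preset range is the closed interval `[low, high]`, and the alphabetically first keyword is kept as the merged representative. None of these choices is specified by the source.

```python
from typing import Dict, FrozenSet, Optional, Tuple

Coupling = Dict[FrozenSet[str], int]


def find_mergeable_pair(weights: Dict[str, int], coupling: Coupling,
                        low: int, high: int) -> Optional[Tuple[str, str]]:
    """S301-S303: find a preset pair in range whose two words are both present."""
    for pair, degree in coupling.items():
        if low <= degree <= high:
            first, second = sorted(pair)
            if first in weights and second in weights:
                return first, second
    return None


def merge_keywords(weights: Dict[str, int], coupling: Coupling,
                   low: int, high: int) -> Dict[str, int]:
    """Repeat the check until no mergeable pair remains, summing weights."""
    merged = dict(weights)
    while True:
        pair = find_mergeable_pair(merged, coupling, low, high)
        if pair is None:
            return merged
        first, second = pair
        # The second keyword disappears, so the loop is guaranteed to end.
        merged[first] += merged.pop(second)


# Hypothetical lexicon and target keyword counts.
coupling = {
    frozenset({"outage", "blackout"}): 5,
    frozenset({"power", "billing"}): 0,
}
weights = {"outage": 4, "blackout": 2, "power": 3, "billing": 1}
merged = merge_keywords(weights, coupling, low=3, high=10)
```

Because a merged keyword is removed from the working set, a pair that involved the removed word is no longer matched; handling such chains (for example by tracking aliases) is left out of this sketch.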
Fig. 4 shows a flow diagram of a data processing method according to a further embodiment of the invention.
As shown in fig. 4, in one embodiment, preferably, the method further includes steps S401 to S403:
Step S401, displaying preset keywords in the keyword storage lexicon in a preset display mode. The keywords in the keyword storage lexicon can be stored in the form of a graph, in which each keyword corresponds to one node and the coupling degree between any two keywords is stored on the edge between the corresponding two nodes.
Step S402, receiving a merging processing operation which is input by a user and merges a first preset keyword and a second preset keyword, merging and displaying the first preset keyword and the second preset keyword according to the merging processing operation, and adding 1 to the coupling degree of the first preset keyword and the second preset keyword in the keyword storage lexicon; or
Step S403, receiving a separation processing operation input by the user to separate the merged and displayed first preset keyword from the second preset keyword, separately displaying the first preset keyword and the second preset keyword according to the separation processing operation, and subtracting 1 from the coupling degree of the first preset keyword and the second preset keyword in the keyword storage lexicon.
The preset keywords can be displayed in the form of a bar chart, in which one preset keyword corresponds to one bar, and the user can view and edit the chart. For example, when the user judges that the words shown in two bars of the view are replaceable words, the user can drag the two words together so that they are presented as merged in the view; at the same time, the keyword storage lexicon is updated and the coupling degree of the two words is increased by 1. When the user judges that two words presented in the same bar are not replaceable words, the user can drag them apart so that they are presented separately in the view; at the same time, the keyword storage lexicon is updated and the coupling degree of the two words is decreased by 1. In a further embodiment, the height of a bar represents the number of preset keywords merged into it, and the text of the preset keywords is displayed above the bar.
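The graph-form lexicon and the two user operations of steps S402 and S403 might be modeled as in the sketch below. The class and method names are hypothetical, and two points are assumptions not stated in the source: an edge that has never been touched has coupling degree 0, and the degree is allowed to become negative.

```python
from typing import Dict, FrozenSet, Set


class KeywordLexicon:
    """Keywords are graph nodes; coupling degrees live on undirected edges."""

    def __init__(self) -> None:
        self.keywords: Set[str] = set()
        self.coupling: Dict[FrozenSet[str], int] = {}

    def add_keyword(self, word: str) -> None:
        self.keywords.add(word)

    def _edge(self, a: str, b: str) -> FrozenSet[str]:
        if a == b:
            raise ValueError("a keyword cannot be coupled with itself")
        if a not in self.keywords or b not in self.keywords:
            raise KeyError("both keywords must be stored in the lexicon")
        return frozenset((a, b))

    def degree(self, a: str, b: str) -> int:
        # Assumption: keywords that were never merged have degree 0.
        return self.coupling.get(self._edge(a, b), 0)

    def record_merge(self, a: str, b: str) -> int:
        """S402: the user merged the two keywords, so add 1."""
        edge = self._edge(a, b)
        self.coupling[edge] = self.coupling.get(edge, 0) + 1
        return self.coupling[edge]

    def record_separation(self, a: str, b: str) -> int:
        """S403: the user separated the two keywords, so subtract 1."""
        edge = self._edge(a, b)
        self.coupling[edge] = self.coupling.get(edge, 0) - 1
        return self.coupling[edge]


lexicon = KeywordLexicon()
lexicon.add_keyword("outage")
lexicon.add_keyword("blackout")
lexicon.record_merge("outage", "blackout")
lexicon.record_merge("blackout", "outage")
lexicon.record_separation("outage", "blackout")
```

Keying edges by `frozenset` makes the coupling degree symmetric, so the order in which the user drags the two words does not matter.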
As shown in fig. 2B, in another preferred embodiment, after the step S203, the method further includes:
Step S204, displaying the target keywords in a preset display mode, and processing the target keywords according to the input of the user.
The target keywords may be displayed in the form of a bar chart, in which one target keyword or one merged target keyword corresponds to one bar. The height of each bar is the weight of its target keyword, and the weight value is optionally displayed above or on the bar. If a bar corresponds to an independent target keyword, that target keyword is displayed above the bar; if a bar corresponds to a merged target keyword, the first target keyword and the second target keyword from before the merge are displayed above the bar. The user can view the bar chart of the target keywords or edit it. If the user judges that the words shown in two bars of the view are replaceable words, the user can drag the two words together so that they are presented as merged in the view; the height of the merged bar is the sum of the weights of the two words, the keyword storage lexicon is updated at the same time, and the coupling degree of the two words is increased by 1. When the user judges that two words presented in the same bar are not replaceable words, the user can drag them apart so that they are presented separately in the view; the heights of the two separated bars are the respective weights of the two words, the keyword storage lexicon is updated at the same time, and the coupling degree of the two words is decreased by 1. Through step S204, the target keywords can be made more accurate, and so can the calculation of the relevance.
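One possible data model for this display is sketched below; it covers only the bar heights and labels, not the drag interaction or the lexicon update. The `Bar` type, its fields and the " / " label separator are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Bar:
    """One bar of the chart: one keyword, or several merged keywords."""
    keywords: Tuple[str, ...]
    weights: Tuple[float, ...]

    @property
    def height(self) -> float:
        # A merged bar is as tall as the sum of its keywords' weights.
        return sum(self.weights)

    @property
    def label(self) -> str:
        # The keywords from before the merge are all shown above the bar.
        return " / ".join(self.keywords)


def merge_bars(left: Bar, right: Bar) -> Bar:
    """User dragged two bars together."""
    return Bar(left.keywords + right.keywords, left.weights + right.weights)


def split_bar(bar: Bar) -> List[Bar]:
    """User dragged a merged bar apart; each word regains its own weight."""
    return [Bar((k,), (w,)) for k, w in zip(bar.keywords, bar.weights)]


outage = Bar(("outage",), (4,))
blackout = Bar(("blackout",), (2,))
combined = merge_bars(outage, blackout)
```

Keeping the individual weights inside the merged bar is what allows a later separation to restore the original heights without recomputing them from the reply data.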
Fig. 5 shows a flow diagram of a data processing method according to yet another embodiment of the invention.
As shown in fig. 5, in one embodiment, preferably, the method further includes steps S501 to S502:
Step S501, when the coupling degree of a first preset keyword and a second preset keyword in the keyword storage lexicon is larger than a first preset threshold value, the first preset keyword and the second preset keyword are merged and displayed;
Step S502, when the coupling degree of the first preset keyword and the second preset keyword in the keyword storage lexicon is smaller than a second preset threshold value, the first preset keyword and the second preset keyword are displayed in a separated mode.
In this embodiment, the preset keywords may be automatically combined and displayed or separately displayed according to the value of the degree of coupling between the preset keywords, so that the user can view and edit the preset keywords conveniently.
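The two-threshold rule of steps S501 and S502 can be expressed as a small function. In this sketch the parameter names are hypothetical, and the behavior between the two thresholds (keeping the current display state) is an assumption, since the source does not say what happens in that range.

```python
def display_state(degree: int, merge_above: int, separate_below: int,
                  current: str = "separate") -> str:
    """Decide whether two preset keywords are shown merged or separate."""
    if separate_below > merge_above:
        raise ValueError("the second threshold must not exceed the first")
    if degree > merge_above:
        return "merged"      # S501: coupling above the first threshold
    if degree < separate_below:
        return "separate"    # S502: coupling below the second threshold
    # Assumption: between the thresholds the display is left unchanged.
    return current


state_high = display_state(5, merge_above=3, separate_below=1)
state_low = display_state(0, merge_above=3, separate_below=1, current="merged")
state_mid = display_state(2, merge_above=3, separate_below=1, current="merged")
```

Using two thresholds rather than one gives the display some hysteresis, so a single user edit near the boundary does not flip a pair back and forth between merged and separate.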
According to a second aspect of the embodiments of the present invention, there is provided a data processing apparatus for a server, including:
a memory and a processor;
the memory is used for storing data used when the processor executes a computer program;
the processor is configured to execute a computer program to implement the method as described in the first aspect or any embodiment of the first aspect.
The processor is configured to:
sending text topic data to a plurality of clients, and receiving text reply data corresponding to the text topic data fed back by each client;
determining a plurality of target keywords corresponding to the text reply data and a one-dimensional weight value corresponding to each target keyword;
acquiring a plurality of preset reference text data, and determining a reference keyword corresponding to each preset reference text data and a one-dimensional weight value corresponding to each reference keyword;
and determining target preset reference text data with the highest correlation degree with the text reply data according to the target keywords and the one-dimensional weight values corresponding to the target keywords as well as the reference keywords and the one-dimensional weight values corresponding to the reference keywords.
In one embodiment, preferably, determining a plurality of target keywords corresponding to the text reply data and a one-dimensional weight value corresponding to each target keyword includes:
performing word segmentation processing on each text reply data to obtain a plurality of target keywords;
and determining a one-dimensional weight value corresponding to each target keyword according to the occurrence frequency of each target keyword in the text reply data.
In one embodiment, preferably, after obtaining the plurality of target keywords, the method further includes:
acquiring a keyword storage word bank, wherein a plurality of preset keywords and coupling degrees among different preset keywords are stored in the keyword storage word bank;
determining whether a first target keyword and a second target keyword which can be combined exist in the plurality of target keywords according to the coupling degree between the different preset keywords;
when a first target keyword and a second target keyword which can be combined exist, combining the first target keyword and the second target keyword.
In one embodiment, preferably, determining whether there are a first target keyword and a second target keyword that can be merged in the plurality of target keywords according to the coupling degree between the different preset keywords includes:
acquiring a target preset keyword pair with the coupling degree within a preset range from the keyword storage word library;
judging whether the target preset keyword pair exists in the target keywords or not;
and when the target preset keyword pairs exist in the target keywords, determining that a first keyword and a second keyword which can be combined exist in the target keywords.
In one embodiment, preferably, the method further comprises:
displaying preset keywords in the keyword storage word library in a preset display mode;
receiving a merging processing operation which is input by a user and is used for merging a first preset keyword and a second preset keyword, merging and displaying the first preset keyword and the second preset keyword according to the merging processing operation, and adding 1 to the coupling degree of the first preset keyword and the second preset keyword in the keyword storage word library; or
receiving a separation processing operation which is input by a user and separates a first preset keyword and a second preset keyword which are combined and displayed, separating and displaying the first preset keyword and the second preset keyword according to the separation processing operation, and reducing the coupling degree of the first preset keyword and the second preset keyword in the keyword storage word library by 1.
In one embodiment, preferably, the method further comprises:
when the coupling degree of the first preset keyword and the second preset keyword in the keyword storage word bank is greater than a first preset threshold value, displaying the first preset keyword and the second preset keyword in a merged manner;
and when the coupling degree of the first preset keyword and the second preset keyword in the keyword storage word bank is smaller than a second preset threshold value, displaying the first preset keyword and the second preset keyword separately.
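The two-threshold display rule can be illustrated with a short sketch. The function name and parameters are hypothetical, and two points are assumptions not stated in the embodiment: that the second threshold is not greater than the first, and that a pair whose coupling degree lies between the two thresholds keeps its current display state.

```python
def update_display_state(currently_merged, coupling_degree,
                         merge_threshold, separate_threshold):
    """Return True if the keyword pair should be displayed merged.

    merge_threshold corresponds to the first preset threshold value and
    separate_threshold to the second (assumed separate_threshold <= merge_threshold).
    """
    if coupling_degree > merge_threshold:
        return True   # strongly coupled: display merged
    if coupling_degree < separate_threshold:
        return False  # weakly coupled: display separately
    # Assumption: between the thresholds the display state is left unchanged.
    return currently_merged


state_high = update_display_state(False, 8, merge_threshold=5, separate_threshold=2)
state_low = update_display_state(True, 1, merge_threshold=5, separate_threshold=2)
state_mid = update_display_state(True, 3, merge_threshold=5, separate_threshold=2)
```

Using two distinct thresholds in this way would keep a pair from flipping between merged and separated display when its coupling degree changes by 1 near a single cut-off.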
In one embodiment, preferably, determining the target preset reference text data with the highest correlation degree with the text reply data according to the target keywords and the one-dimensional weight values corresponding to the target keywords, and the reference keywords and the one-dimensional weight values corresponding to the reference keywords, includes:
calculating the correlation degree between the text reply data and the preset reference text data according to the target keyword and the one-dimensional weight value corresponding to the target keyword as well as the reference keyword and the one-dimensional weight value corresponding to the reference keyword;
and determining the preset reference text data with the highest correlation degree as the target preset reference text data.
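The selection step above can be sketched as follows. The text here does not state how the correlation degree is calculated, so this example substitutes cosine similarity over the keyword weights purely as an assumed measure; the function names and the sample keywords and weights are likewise hypothetical.

```python
import math


def correlation(target_weights, reference_weights):
    """Cosine similarity between two keyword -> weight mappings (assumed measure)."""
    shared = target_weights.keys() & reference_weights.keys()
    dot = sum(target_weights[k] * reference_weights[k] for k in shared)
    norm_target = math.sqrt(sum(w * w for w in target_weights.values()))
    norm_reference = math.sqrt(sum(w * w for w in reference_weights.values()))
    if norm_target == 0 or norm_reference == 0:
        return 0.0  # no keywords on one side: treat as uncorrelated
    return dot / (norm_target * norm_reference)


def best_reference(target_weights, references):
    """Return the name of the preset reference text with the highest correlation degree."""
    return max(references, key=lambda name: correlation(target_weights, references[name]))


# Hypothetical weights for one text reply and two preset reference texts.
reply = {"power": 3, "outage": 2}
references = {
    "doc_a": {"power": 1, "outage": 1},
    "doc_b": {"billing": 4, "power": 1},
}
target_reference = best_reference(reply, references)
```

With these sample weights, "doc_a" shares both keywords with the reply and is therefore selected as the target preset reference text data.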
In one embodiment, preferably, acquiring the plurality of preset reference text data includes:
storing preset multimedia data with different formats;
and converting the preset multimedia data with different formats into text data, and taking the text data as the preset reference text data.
According to a third aspect of embodiments of the present invention, there is provided a data processing system including:
a server;
a plurality of clients coupled with the server;
the server sends text topic data to the plurality of clients, receives text reply data corresponding to the text topic data fed back by each client, and determines a plurality of target keywords corresponding to the text reply data and a one-dimensional weight value corresponding to each target keyword; acquires a plurality of preset reference text data, and determines a reference keyword corresponding to each preset reference text data and a one-dimensional weight value corresponding to each reference keyword; and determines target preset reference text data with the highest correlation degree with the text reply data according to the target keywords and the one-dimensional weight values corresponding to the target keywords as well as the reference keywords and the one-dimensional weight values corresponding to the reference keywords.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. A data processing method for a server, comprising:
sending text topic data to a plurality of clients, and receiving text reply data corresponding to the text topic data fed back by each client;
determining a plurality of target keywords corresponding to the text reply data and a one-dimensional weight value corresponding to each target keyword;
acquiring a plurality of preset reference text data, and determining a reference keyword corresponding to each preset reference text data and a one-dimensional weight value corresponding to each reference keyword;
determining target preset reference text data with the highest correlation degree with the text reply data according to the target keywords and the one-dimensional weight values corresponding to the target keywords as well as the reference keywords and the one-dimensional weight values corresponding to the reference keywords;
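wherein determining the plurality of target keywords corresponding to the text reply data and the one-dimensional weight value corresponding to each target keyword comprises: performing word segmentation processing on each text reply data to obtain the plurality of target keywords;
and determining the one-dimensional weight value corresponding to each target keyword according to the occurrence frequency of each target keyword in the text reply data.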
2. The data processing method of claim 1, wherein after obtaining the plurality of target keywords, the method further comprises:
acquiring a keyword storage word bank, wherein a plurality of preset keywords and coupling degrees among different preset keywords are stored in the keyword storage word bank;
determining whether a first target keyword and a second target keyword which can be combined exist in the plurality of target keywords according to the coupling degree between the different preset keywords;
when a first target keyword and a second target keyword which can be combined exist, combining the first target keyword and the second target keyword.
3. The data processing method of claim 2, wherein determining whether there are a first target keyword and a second target keyword that can be merged in the plurality of target keywords according to the coupling degree between the different preset keywords comprises:
acquiring, from the keyword storage word bank, a target preset keyword pair whose coupling degree is within a preset range;
judging whether the target preset keyword pair exists in the plurality of target keywords;
and when the target preset keyword pair exists in the plurality of target keywords, determining that a first target keyword and a second target keyword which can be combined exist in the plurality of target keywords.
4. The data processing method of claim 2, wherein the method further comprises:
displaying preset keywords in the keyword storage word library in a preset display mode;
receiving a merging processing operation, input by a user, for merging a first preset keyword and a second preset keyword, displaying the first preset keyword and the second preset keyword in a merged manner according to the merging processing operation, and increasing the coupling degree of the first preset keyword and the second preset keyword in the keyword storage word bank by 1; or
receiving a separation processing operation, input by a user, for separating a first preset keyword and a second preset keyword which are displayed in a merged manner, displaying the first preset keyword and the second preset keyword separately according to the separation processing operation, and reducing the coupling degree of the first preset keyword and the second preset keyword in the keyword storage word bank by 1.
5. The data processing method of claim 4, wherein the method further comprises:
when the coupling degree of the first preset keyword and the second preset keyword in the keyword storage word bank is greater than a first preset threshold value, displaying the first preset keyword and the second preset keyword in a merged manner;
and when the coupling degree of the first preset keyword and the second preset keyword in the keyword storage word bank is smaller than a second preset threshold value, displaying the first preset keyword and the second preset keyword separately.
6. The data processing method of claim 1, wherein determining the target preset reference text data with the highest correlation degree with the text reply data according to the target keywords and the one-dimensional weight values corresponding thereto, and the reference keywords and the one-dimensional weight values corresponding thereto, comprises:
calculating the correlation degree between the text reply data and the preset reference text data according to the target keyword and the one-dimensional weight value corresponding to the target keyword as well as the reference keyword and the one-dimensional weight value corresponding to the reference keyword;
and determining the preset reference text data with the highest correlation degree as the target preset reference text data.
7. The data processing method of claim 1, wherein acquiring the plurality of preset reference text data comprises:
storing preset multimedia data with different formats;
and converting the preset multimedia data with different formats into text data, and taking the text data as the preset reference text data.
8. A data processing apparatus for a server, comprising:
a memory and a processor;
the memory is used for storing data used when the processor executes a computer program;
the processor is configured to execute a computer program to implement the method of any one of claims 1 to 7.
9. A data processing system, comprising:
a server;
a plurality of clients coupled with the server;
the server sends text topic data to the plurality of clients, receives text reply data corresponding to the text topic data fed back by each client, and determines a plurality of target keywords corresponding to the text reply data and a one-dimensional weight value corresponding to each target keyword; acquires a plurality of preset reference text data, and determines a reference keyword corresponding to each preset reference text data and a one-dimensional weight value corresponding to each reference keyword; and determines target preset reference text data with the highest correlation degree with the text reply data according to the target keywords and the one-dimensional weight values corresponding to the target keywords as well as the reference keywords and the one-dimensional weight values corresponding to the reference keywords; wherein the server performs word segmentation processing on each text reply data to obtain the plurality of target keywords, and determines the one-dimensional weight value corresponding to each target keyword according to the occurrence frequency of each target keyword in the text reply data.
CN201911274881.5A 2019-12-12 2019-12-12 Data processing method, device and system Active CN111079010B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911274881.5A CN111079010B (en) 2019-12-12 2019-12-12 Data processing method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911274881.5A CN111079010B (en) 2019-12-12 2019-12-12 Data processing method, device and system

Publications (2)

Publication Number Publication Date
CN111079010A CN111079010A (en) 2020-04-28
CN111079010B true CN111079010B (en) 2023-03-31

Family

ID=70314161

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911274881.5A Active CN111079010B (en) 2019-12-12 2019-12-12 Data processing method, device and system

Country Status (1)

Country Link
CN (1) CN111079010B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105468668A (en) * 2015-10-13 2016-04-06 清华大学 Push method and apparatus for topic in official media news
CN107220386A (en) * 2017-06-29 2017-09-29 北京百度网讯科技有限公司 Information-pushing method and device
CN108804641A (en) * 2018-06-05 2018-11-13 鼎易创展咨询(北京)有限公司 A kind of computational methods of text similarity, device, equipment and storage medium
CN108829822A (en) * 2018-06-12 2018-11-16 腾讯科技(深圳)有限公司 The recommended method and device of media content, storage medium, electronic device
CN108897734A (en) * 2018-06-13 2018-11-27 康键信息技术(深圳)有限公司 User's portrait generation method, device, computer equipment and storage medium
CN110162769A (en) * 2018-07-05 2019-08-23 腾讯科技(深圳)有限公司 Text subject output method and device, storage medium and electronic device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10885042B2 (en) * 2015-08-27 2021-01-05 International Business Machines Corporation Associating contextual structured data with unstructured documents on map-reduce

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
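News Keyword Extraction for Topic Tracking; Sungjike Lee, Han-Joon Kim; 2008 Fourth International Conference on Networked Computing and Advanced Information Management; 20081212; full text *
Text filtering model based on lexical chains; You Wenjian et al.; Application Research of Computers; 20030928 (No. 09); full text *
Topic information collection method based on domain ontology; Zheng Guoliang et al.; Journal of Computer Applications; 20081201 (No. 12); full text *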

Also Published As

Publication number Publication date
CN111079010A (en) 2020-04-28

Similar Documents

Publication Publication Date Title
JP6511487B2 (en) Method and apparatus for information push
Nguyen et al. Real-time event detection for online behavioral analysis of big social data
US9785888B2 (en) Information processing apparatus, information processing method, and program for prediction model generated based on evaluation information
CN108170692B (en) Hotspot event information processing method and device
US9146915B2 (en) Method, apparatus, and computer storage medium for automatically adding tags to document
US8949242B1 (en) Semantic document analysis
KR101735312B1 (en) Apparatus and system for detecting complex issues based on social media analysis and method thereof
JP2019519019A (en) Method, apparatus and device for identifying text type
US11036818B2 (en) Method and system for detecting graph based event in social networks
CN110210038B (en) Core entity determining method, system, server and computer readable medium thereof
CN113688310A (en) Content recommendation method, device, equipment and storage medium
CN110297967B (en) Method, device and equipment for determining interest points and computer readable storage medium
JP6042790B2 (en) Trend analysis apparatus, trend analysis method, and trend analysis program
JP2018088051A (en) Information processing device, information processing method and program
CN111930949B (en) Search string processing method and device, computer readable medium and electronic equipment
CN106484773B (en) Method and device for determining weight of keyword of multimedia resource
CN111882224A (en) Method and device for classifying consumption scenes
CN111079010B (en) Data processing method, device and system
CN111090741B (en) Data processing method, device and system
CN116308704A (en) Product recommendation method, device, electronic equipment, medium and computer program product
CN114445043B (en) Open ecological cloud ERP-based heterogeneous graph user demand accurate discovery method and system
CN110750708A (en) Keyword recommendation method and device and electronic equipment
KR102078541B1 (en) Issue interest based news value evaluation apparatus and method, storage media storing the same
CN110535749A (en) Talk with method for pushing, device, electronic equipment and storage medium
CN117131197B (en) Method, device, equipment and storage medium for processing demand category of bidding document

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant