CN111209277A - Data processing method, device, equipment and medium - Google Patents

Data processing method, device, equipment and medium

Info

Publication number
CN111209277A
Authority
CN
China
Prior art keywords
data
determining
target
category
exploration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010010403.XA
Other languages
Chinese (zh)
Other versions
CN111209277B (en)
Inventor
杨溥
罗西琳
张聪聪
侯成龙
叶晓波
吕金龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mind Creation Information Technology Co ltd
Original Assignee
Beijing Mind Creation Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mind Creation Information Technology Co ltd
Priority to CN202010010403.XA
Publication of CN111209277A
Application granted
Publication of CN111209277B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/24 Querying
    • G06F16/245 Query processing

Abstract

The embodiments of this specification disclose a data processing method, apparatus, device, and medium. The data processing method comprises: receiving source data sent by a terminal, and determining target category data associated with the source data; sending the target category data to the terminal, so that the terminal determines and displays exploration data according to the target category data; receiving feedback data for the exploration data sent by the terminal, and determining a selected category in the target category data according to the feedback data; and determining target data under the selected category and sending it to the terminal.

Description

Data processing method, device, equipment and medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method, apparatus, device, and medium.
Background
In the prior art, a user can obtain corresponding information by inputting a keyword, but the obtained information is limited to a certain field or subject and has a narrow range.
In view of the above, there is a need for a more efficient and effective data processing scheme.
Disclosure of Invention
Embodiments of the present specification provide a data processing method, apparatus, device, and medium, so as to solve a technical problem of how to perform data processing more efficiently and effectively.
In order to solve the above technical problem, the embodiments of the present specification are implemented as follows:
an embodiment of the present specification provides a first data processing method, including:
receiving source data sent by a terminal, and determining target category data associated with the source data;
sending the target category data to the terminal so that the terminal determines and displays exploration data according to the target category data;
receiving feedback data aiming at the exploration data and sent by a terminal, and determining a selected category in the target category data according to the feedback data;
and determining and sending the target data under the selected category to the terminal.
An embodiment of the present specification provides a second data processing method, including:
sending source data to a server, so that the server determines target category data associated with the source data;
receiving the target category data sent by a server, and determining and displaying exploration data according to the target category data;
determining and sending feedback data aiming at the exploration data to the server, so that the server determines a selected category in the target category data according to the feedback data and determines target data under the selected category;
and receiving the target data sent by the server and displaying the target data.
An embodiment of the present specification provides a third data processing method, including:
receiving source data, and determining target category data associated with the source data;
determining and displaying exploration data according to the target category data;
receiving feedback data aiming at the exploration data, and determining a selected category in the target category data according to the feedback data;
and determining and displaying the target data under the selected category.
An embodiment of the present specification provides a data processing apparatus, including:
the target category determining module is used for receiving source data sent by a terminal and determining target category data associated with the source data;
the exploration module is used for sending the target category data to the terminal so that the terminal can determine and display exploration data according to the target category data;
the feedback module is used for receiving feedback data aiming at the exploration data and sent by a terminal, and determining a selected category in the target category data according to the feedback data;
and the target data determining module is used for determining and sending the target data under the selected category to the terminal.
An embodiment of the present specification provides a data processing apparatus, including:
the source data sending module is used for sending source data to the server so that the server determines target category data associated with the source data;
the exploration data module is used for receiving the target category data sent by the server, and determining and displaying exploration data according to the target category data;
the feedback module is used for determining and sending feedback data aiming at the exploration data to the server, so that the server determines a selected category in the target category data according to the feedback data and determines target data under the selected category;
and the display module is used for receiving the target data sent by the server and displaying the target data.
An embodiment of the present specification provides a data processing apparatus, including:
the target category determining module is used for receiving source data and determining target category data associated with the source data;
the exploration module is used for determining and displaying exploration data according to the target category data;
a feedback module, configured to receive feedback data for the exploration data, and determine a selected category in the target category data according to the feedback data;
and the target data determining module is used for determining and displaying the target data under the selected category.
An embodiment of the present specification provides a data processing apparatus, including:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the first data processing method described above.
An embodiment of the present specification provides a data processing apparatus, including:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the second data processing method described above.
An embodiment of the present specification provides a data processing apparatus, including:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the third data processing method described above.
Embodiments of the present specification provide a computer-readable storage medium, which stores computer-executable instructions, and when executed by a processor, the computer-executable instructions implement the first data processing method described above.
Embodiments of the present specification provide a computer-readable storage medium, which stores computer-executable instructions, and when executed by a processor, the computer-executable instructions implement the second data processing method described above.
Embodiments of the present specification provide a computer-readable storage medium, which stores computer-executable instructions, and when the computer-executable instructions are executed by a processor, the third data processing method is implemented.
The embodiment of the specification adopts at least one technical scheme which can achieve the following beneficial effects:
the target category associated with the source data is determined, and the exploration data is determined according to the target category, so that the exploration data is not limited to a single category, exploration of various categories associated with the source data is realized, and comprehensiveness and richness of data processing are improved; the interested category of the user can be determined through the feedback data, and then the target data is determined, so that the determination of the target data is more comprehensive and accurate.
Drawings
In order to more clearly illustrate the embodiments of the present specification or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments of the present specification or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only some embodiments described in the present specification, and for those skilled in the art, other drawings can be obtained according to the drawings without inventive labor.
Fig. 1 is a flowchart illustrating a data processing method in the first embodiment of the present specification.
Fig. 2 is a schematic diagram showing the execution of the data processing method in the first embodiment of the present specification.
Fig. 3 is a schematic view of a page in the first embodiment of the present specification.
Fig. 4 is another schematic view of a page in the first embodiment of the present specification.
Fig. 5 is another schematic view of a page in the first embodiment of the present specification.
Fig. 6 is a flowchart illustrating a data processing method in a second embodiment of the present specification.
Fig. 7 is a flowchart illustrating a data processing method in the third embodiment of the present specification.
Fig. 8 is a schematic structural diagram of a data processing apparatus in a fourth embodiment of this specification.
Fig. 9 is a schematic configuration diagram of a data processing apparatus in a fifth embodiment of the present specification.
Fig. 10 is a schematic configuration diagram of a data processing apparatus in a sixth embodiment of the present specification.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present specification, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any inventive step based on the embodiments of the present disclosure, shall fall within the scope of protection of the present application.
As shown in fig. 1, a first embodiment of the present specification provides a data processing method. The execution subject of this embodiment may be a computer, a server, or a corresponding data processing system; that is, the execution subject may vary and may be set or changed according to the actual situation. In addition, a third-party application may assist the execution subject in executing this embodiment. For example, the data processing method in this embodiment may be executed by a server, and a corresponding application may be installed on a terminal held by the user (including but not limited to a mobile phone or a computer). The server corresponds to the application, data may be transmitted between the server and the terminal, and pages or information may be presented to the user, or data input and output performed, through the terminal or the application, as shown in fig. 2.
The data processing method provided by the embodiment comprises the following steps:
S101: receiving source data sent by a terminal, and determining target category data associated with the source data.
In this embodiment, an application client may be installed on the terminal for the user to operate. The client corresponds to the server and is connected to it, so that the client can transmit data to or exchange data with the server. Sending data from the client to the server (messages, information, applications, requests, instructions, and the like can all be regarded as data) may be initiated by a user operation on the client and can be regarded as an expression of the user's intent; it may also be initiated automatically by the client. In this embodiment, the terminal includes, but is not limited to, the user's mobile phone, computer, or tablet. As described above, the execution subject of this embodiment may be a server, in which case the terminal may exchange data with the server.
The terminal (or client, the same below) may have a corresponding page (which may be a corresponding application page) for the user to select or input text, and the content of the option on the selected page may be text, as shown in fig. 3. After the user selects or inputs the text, the terminal can receive the text selected or input by the user and takes the text selected or input by the user as source data; or the terminal may perform necessary processing (e.g., encryption) on the text selected or input by the user, and use the data subjected to the necessary processing as source data.
After the terminal determines the source data, the source data can be automatically sent to the server, and the server receives the source data sent by the terminal and determines the target category data associated with the source data.
Wherein determining the target category data associated with the source data comprises:
S1011: determining classification number data (or a classification number) corresponding to the source data.
The classification number may be user-defined; in particular, the classification number data may be a Chinese Library Classification (CLC) number. A CLC number is a classification code obtained by analyzing the subject of a scientific or technical document using the Chinese Library Classification and organizing documents by the subject attributes and characteristics of their content.
In this embodiment, a knowledge point database may be constructed, including:
acquiring network data (that is, public internet data; neither the content nor the source of the data is limited), and processing the network data using regular expressions and word segmentation;
and carrying out named entity recognition and data cleaning on the processed data so as to determine knowledge points. Wherein, named entity recognition can be carried out on the processed data through Stanford NLP.
The data cleaning can comprise data consistency processing and/or missing value processing and/or abnormal value processing and/or de-duplication processing on the data. Wherein:
(1) Data consistency processing: rules are used to convert between simplified and traditional characters, upper and lower case, and symbol variants, so as to ensure data consistency. For example, given record A {'publicDate': '2019-10-10'} and record B {'publicDate': '2019/10/10 20:12:10'}, record B is converted to '2019-10-10' using the to_datetime method.
(2) Missing value processing: missing values are filled using the fillna method of pandas. For example, when the records of economics books are filled with specified default values, the missing ratio of each field is first checked using a lambda function: {'abstract': 0.1667, 'publicdate': 0.3333, 'classification': 0.1667}. When abstract (the summary) is missing, it is filled with the first 100 characters of the introduction; when publicdate is missing, it is filled with '0000-00-00'; and when classification is missing, it is filled with 'economy'.
(3) Abnormal value processing: outliers are detected using fast clustering. For example, the input data set S contains N records: {'ISBN': '9787513334181', 'title': "parent's story", 'publicdate': '2018-12-12'}, {'ISBN': '9787513334181', 'title': "parent's story", 'publicdate': '2010-10-10'}, {'ISBN': '9787513334181', 'title': "parent's story", 'publicdate': '2018-12-12'}. The isolated point {'ISBN': '9787513334181', 'title': "parent's story", 'publicdate': '2010-10-10'} is noise data whose noise attribute is publicdate, and this record is culled by applying a capping method to publicdate in combination with the quantile method.
(4) De-duplication processing: a context vector of each knowledge point is obtained using a BERT pre-trained model, and the knowledge point B most similar to a given knowledge point A is found using a locality-sensitive hashing algorithm and Euclidean distance calculation. If the distance between the two is less than a certain threshold, they are considered the same knowledge point. For example, from the original text 'most internet companies play the free card; the free mode is already in its bonus period; if the marginal cost can approach zero indefinitely, the free mode can be adopted', the knowledge points 'free mode' and 'marginal cost' are obtained after processing.
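The consistency and missing-value steps can be illustrated with a minimal sketch. It uses only the Python standard library as a stand-in for the pandas to_datetime and fillna methods named in the text; the function names, the list of accepted date layouts, and the 'introduction' field are assumptions made for illustration and are not taken from the specification.

```python
from datetime import datetime

# Date layouts seen in the example records; this list is an assumption.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d %H:%M:%S", "%Y/%m/%d")

def normalize_date(value):
    """Return the date as 'YYYY-MM-DD', or None if no known layout matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None

def missing_ratio(records, fields):
    """Share of records in which each field is absent or empty."""
    total = len(records)
    return {f: sum(1 for r in records if not r.get(f)) / total for f in fields}

def fill_missing(record, defaults):
    """Return a copy of the record with empty fields replaced by defaults."""
    filled = dict(record)
    for field, default in defaults.items():
        if not filled.get(field):
            # A callable default derives the value from the record itself.
            filled[field] = default(record) if callable(default) else default
    return filled

# Default values follow the economics-book example in the text.
DEFAULTS = {
    "abstract": lambda r: r.get("introduction", "")[:100],
    "publicdate": "0000-00-00",
    "classification": "economy",
}

records = [
    {"publicdate": "2019-10-10", "abstract": "x", "classification": "economy"},
    {"publicdate": "2019/10/10 20:12:10", "introduction": "y" * 150},
]

cleaned = []
for rec in records:
    rec = dict(rec)
    if rec.get("publicdate"):
        rec["publicdate"] = normalize_date(rec["publicdate"])
    cleaned.append(fill_missing(rec, DEFAULTS))
```

A date that matches none of the assumed layouts becomes None and is then replaced by the '0000-00-00' default, so the two steps compose without special cases.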
Through the above process, knowledge points can be determined from public network data; the determined knowledge points may take forms such as 'Matthew effect' and 'electronic', so that a knowledge point database can be constructed. A knowledge point database constructed in this way covers knowledge points comprehensively, can include the latest knowledge points, and is therefore more up to date.
It should be noted that the content of the options on the terminal selection page may be a knowledge point, such as a popular knowledge point, so that the user may directly select the knowledge point, and the source data may be the knowledge point or the source data may be generated according to the selected knowledge point. A number of knowledge points are shown in figure 3 for selection.
In this embodiment, the Chinese Library Classification (CLC) number corresponding to each knowledge point may be determined, and one knowledge point may correspond to one or more CLC numbers. For example, if a knowledge point appears in a plurality of books, each of which has a CLC number, the one or more CLC numbers that occur most often among those books (tied for first place or ranked near the top) may be used as the CLC numbers corresponding to the knowledge point. CLC numbers cover a wide range of disciplines, and because this embodiment uses the CLC number as the classification number data, a corresponding classification number can be found for most knowledge points.
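One possible reading of this selection rule is sketched below: count how many books carrying the knowledge point fall under each classification number and keep the top-ranked numbers, including ties. The function name and the top_n parameter are hypothetical.

```python
from collections import Counter

def classification_numbers_for_point(book_numbers, top_n=1):
    """Pick the classification numbers shared by the most books.

    book_numbers holds one classification number per book in which the
    knowledge point appears. Numbers tied with the last kept rank are
    also returned.
    """
    counts = Counter(book_numbers)
    if not counts:
        return []
    ranked = counts.most_common()
    cutoff = ranked[min(top_n, len(ranked)) - 1][1]
    return [number for number, n in ranked if n >= cutoff]
```

With top_n=1, a knowledge point found in two books under 'F0' and one book under 'F27' is assigned 'F0' only; if the counts are tied, both numbers are kept.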
In this embodiment, determining the classification number data corresponding to the source data may include:
After receiving the source data, the server may determine the knowledge points corresponding to the source data, and determine the Chinese Library Classification (CLC) number corresponding to the source data according to the correspondence between knowledge points and CLC numbers. The knowledge points corresponding to the source data may be determined by matching the source data against the knowledge points using a distance algorithm; a matching condition or threshold may be set, so that the knowledge points that match the source data are taken as the knowledge points corresponding to the source data, and the CLC numbers of those knowledge points are taken as the CLC numbers corresponding to the source data. There may be a plurality of knowledge points corresponding to the source data.
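The specification does not name the distance algorithm. As a sketch under that caveat, the example below assumes Levenshtein edit distance and a hypothetical max_distance threshold; the mapping from knowledge points to classification numbers is likewise illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def match_knowledge_points(source, point_to_numbers, max_distance=1):
    """Return the knowledge points within max_distance of the source text,
    together with their classification numbers."""
    return {
        point: numbers
        for point, numbers in point_to_numbers.items()
        if edit_distance(source, point) <= max_distance
    }
```

Several knowledge points may satisfy the threshold, which matches the statement that the source data may correspond to a plurality of knowledge points.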
S1013: determining target category data associated with the source data according to the classification number data.
Each Chinese Library Classification (CLC) number corresponds to a name, such as 'E military' or 'F economy', and more detailed names exist at lower levels, such as 'F3 agricultural economy' and 'F4 industrial economy' under 'F economy'. After the knowledge point database is built and the correspondence between knowledge points and CLC numbers is established, the number of knowledge points under the CLC numbers at each level of the hierarchy may differ. For example, one hundred thousand knowledge points may correspond to 'F economy', four thousand to 'F3 agricultural economy', ten thousand to 'F30 agricultural economy', five thousand to 'F4 industrial economy', three thousand to 'F41 world industrial economy', and so on.
In this embodiment, the name of a Chinese Library Classification (CLC) number may be used as the category name, either directly or after appropriate processing (for example, shortening the longer names of some CLC numbers), so as to establish the correspondence between classification numbers and categories.
Further, if there are few knowledge points under a certain classification number (a limit value may be set), the category need not be determined from the name of that classification number. For example, if there are few knowledge points under 'F011 economics object and method', the category may be determined not from that classification number but from its upper-level number, 'F01 economics basic problem'. On the one hand, this reduces the number of categories; on the other hand, each category then corresponds to more knowledge points, which helps improve the accuracy of classification.
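The roll-up to an upper-level classification number can be sketched as follows. The sketch assumes, as a simplification, that the parent of a number is obtained by dropping its last character (so 'F011' rolls up to 'F01'); real classification numbers may contain separators that need extra handling. The function name and the min_points limit are hypothetical.

```python
def roll_up(number, point_counts, min_points):
    """Walk up the hierarchy until a number has at least min_points
    knowledge points, or the top-level letter is reached."""
    current = number
    while len(current) > 1 and point_counts.get(current, 0) < min_points:
        current = current[:-1]  # assumed parent: drop the last character
    return current
```

For example, with 12 knowledge points under 'F011' and 3000 under 'F01', a limit of 100 maps 'F011' to 'F01', while 'F01' itself is left unchanged.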
It can be seen that, after receiving the source data, the server may determine the classification number data corresponding to the source data according to the correspondence between the source data and the knowledge points and the correspondence between the knowledge points and the classification numbers, may determine the category corresponding to the source data according to the correspondence between the classification number data and the category, and may use the category corresponding to the source data as the associated category data (or associated category) of the source data, that is, the category associated with the source data. Since the knowledge points correspond to the source data, the associated category of the source data can be regarded as the associated category of the knowledge points corresponding to the source data.
After the association category of the source data is determined, the association degree of the source data and each association category can be determined, and the target category data is determined according to the association degree. Specifically, mutual information between the source data and each associated category thereof may be determined, and the association degree between the source data and each associated category thereof may be determined according to the mutual information.
In this embodiment, the mutual information between the knowledge point corresponding to the source data and each associated category of the source data may be used as the mutual information between the source data and each of its associated categories. For example, if the knowledge point corresponding to the source data is knowledge point $A$, and the associated categories of the source data (or of knowledge point $A$) are $F_1, F_2, \ldots, F_n$, then the mutual information $\log\{P(A \mid F_1)/P(A)\}, \log\{P(A \mid F_2)/P(A)\}, \ldots, \log\{P(A \mid F_n)/P(A)\}$ between knowledge point $A$ and the categories $F_1, F_2, \ldots, F_n$ may be calculated; the mutual information represents the degree of correlation between knowledge point $A$ and each category. Because $P(A)$ is the same in every term, comparing the mutual information of knowledge point $A$ with the categories $F_1, F_2, \ldots, F_n$ is equivalent to comparing $P(A \mid F_1), P(A \mid F_2), \ldots, P(A \mid F_n)$, where $P(A \mid F_i) = P(A, F_i)/P(F_i)$, $i = 1, 2, \ldots, n$, and $P(A, F_i)$ and $P(F_i)$ are each estimated from frequencies of occurrence. The $K$ categories (TopK, $K \le n$) most relevant to knowledge point $A$, that is, most relevant to the source data, can thus be determined and used as the target category data associated with the source data.
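The TopK selection can be sketched with frequency estimates as follows. The function and parameter names are hypothetical, and the probabilities are estimated from raw occurrence counts, which is one way to read "expressed by the frequency of occurrence".

```python
import math

def top_k_categories(joint_counts, category_counts, point_count, total, k):
    """Rank categories by log(P(A|F) / P(A)) and return the K best names.

    joint_counts[f]    : occurrences of knowledge point A within category f
    category_counts[f] : all occurrences recorded for category f
    point_count        : occurrences of A over all categories
    total              : all occurrences over all categories
    """
    p_a = point_count / total
    scores = {}
    for category, joint in joint_counts.items():
        if joint == 0:
            continue  # log(0) is undefined; A never occurs in this category
        # P(A|F) = P(A,F) / P(F) reduces to a ratio of the two counts.
        p_a_given_f = joint / category_counts[category]
        scores[category] = math.log(p_a_given_f / p_a)
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Since P(A) is shared by all categories, the ranking is the same as ranking by P(A|F) alone; the logarithm is kept only so that the scores match the mutual information in the text.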
S103: sending the target category data to the terminal, so that the terminal determines and displays exploration data according to the target category data.
After determining the target category data, the server may send the target category data to the terminal. After receiving the target category data, the terminal may determine data to be presented according to the target category data, for example, the target category data may be used as the data to be presented, or the target category data may be subjected to appropriate processing (for example, format conversion, etc.) and then used as the data to be presented, and the data may be presented.
In this embodiment, the target categories determined from the source data are not limited to a single category. In particular, the target categories are selected from the categories corresponding to Chinese Library Classification (CLC) numbers; because CLC numbers have wide coverage, the category data can cover a wide range of knowledge points, so the target categories also have wide coverage and are not limited to a single category, subject, or field, and neither is the presentation data determined from them. The presentation data is obtained by exploring various categories, subjects, or fields on the basis of the source data, and may therefore be called exploration data. If there are multiple target categories, there may be multiple pieces of exploration data, in one-to-one correspondence with the target categories. As shown in fig. 4, related fields 1 to 6 may be exploration data, and the keyword may be selected or input by the user.
S105: receiving feedback data for the exploration data sent by the terminal, and determining a selected category in the target category data according to the feedback data.
The terminal presents the exploration data for the user to choose from, and each piece of exploration data may serve as an option. When the user clicks or selects a piece of exploration data, the user thereby gives feedback on the exploration data. The terminal may receive the user's feedback, form feedback data, and send it to the server; the server receives the feedback data for the exploration data sent by the terminal and determines the selected category in the target category data according to the feedback data. For example, if the target category data contains m categories and there are m pieces of exploration data, then when the user clicks one of them, the feedback data records the exploration data selected by the user, that is, the category selected from among the target categories.
S107: determining target data under the selected category and sending it to the terminal.
After determining the selected category, the server may determine the target data under the selected category, where the target data may be the knowledge points under the selected category that correspond to the source data. Specifically, a TF-IDF (term frequency-inverse document frequency) algorithm may be used to determine the knowledge points mentioned most often under each target category (a limit value may be set). For example, if categories F1, F2, ..., Fq are the target categories, and the knowledge points mentioned most often under these target categories include knowledge point a1, knowledge point a2, ..., knowledge point aj, then, if category Fq is the selected category, the knowledge points among a1, a2, ..., aj that belong to category Fq may be determined as the target data. The other keywords in fig. 5 may be target data.
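The specification does not detail how TF-IDF is applied. One plausible reading, sketched below, treats each target category as a "document" made of knowledge-point mentions, scores each knowledge point by term frequency times inverse document frequency, and keeps the highest-scoring points of the selected category up to a limit. All names and the sample data are hypothetical.

```python
import math
from collections import Counter

def tfidf_by_category(mentions):
    """mentions maps category -> list of knowledge-point mentions.
    Returns category -> {knowledge point: TF-IDF score}."""
    n_categories = len(mentions)
    doc_freq = Counter()
    for points in mentions.values():
        doc_freq.update(set(points))  # count each point once per category
    scores = {}
    for category, points in mentions.items():
        term_freq = Counter(points)
        total = len(points)
        scores[category] = {
            p: (c / total) * math.log(n_categories / doc_freq[p])
            for p, c in term_freq.items()
        }
    return scores

def target_data(mentions, selected_category, limit):
    """Highest-scoring knowledge points of the selected category."""
    scored = tfidf_by_category(mentions)[selected_category]
    ranked = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
    return [point for point, _ in ranked[:limit]]

mentions = {
    "F1": ["marginal cost", "marginal cost", "free mode", "price"],
    "F2": ["price", "inflation"],
    "F3": ["price", "free mode"],
}
```

A knowledge point mentioned in every category, such as 'price' in this sample, receives a score of zero, so points that characterize the selected category are ranked ahead of generic ones.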
In this embodiment, data exploration is performed on the basis of the source data, the various categories associated with the source data are explored, and the resulting target category data has wide coverage, so that more comprehensive and richer exploration data can be presented to the user. The target data is determined on the basis of the feedback data, so that it is located precisely after broad exploration, which makes the target data both comprehensive and accurate. The data processing of this embodiment proceeds from the source data (which reflects the user's interest) to the exploration data (which reflects comprehensiveness), the feedback data (which again reflects the user's interest), and the target data (which reflects accuracy), thereby improving the comprehensiveness and richness of data processing and making the determination of the target data more comprehensive and accurate.
The first embodiment of this specification is described from the perspective of the server. The second embodiment of this specification provides a data processing method from the perspective of the terminal. As shown in fig. 6, the data processing method in this embodiment includes:
S201: sending source data to a server, so that the server determines target category data associated with the source data;
S203: receiving the target category data sent by the server, and determining and displaying exploration data according to the target category data;
S205: determining feedback data for the exploration data and sending it to the server, so that the server determines a selected category in the target category data according to the feedback data and determines target data under the selected category;
S207: receiving the target data sent by the server and displaying the target data.
For details not described in this embodiment, refer to the first embodiment.
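The terminal-side steps S201 to S207 can be sketched as a single function that talks to a server object. The `StubServer` class, its method names, and the dictionary-shaped feedback are hypothetical stand-ins introduced for this sketch; a real terminal would exchange these messages over a network and render the options in a user interface.

```python
class StubServer:
    """Hypothetical in-memory stand-in for the server of the first embodiment."""

    def __init__(self, categories_by_source, data_by_category):
        self._categories_by_source = categories_by_source
        self._data_by_category = data_by_category

    def target_categories(self, source):
        return self._categories_by_source.get(source, [])

    def target_data(self, feedback):
        return self._data_by_category.get(feedback["selected"], [])


def run_terminal(server, source, choose):
    """Run S201-S207; `choose` models the user's click on one exploration item."""
    categories = server.target_categories(source)  # S201: send source data
    exploration = list(categories)                 # S203: one option per category
    if not exploration:
        return []
    feedback = {"selected": choose(exploration)}   # S205: form and send feedback
    return server.target_data(feedback)            # S207: receive target data


server = StubServer({"s": ["F1", "F2"]}, {"F1": ["a1"], "F2": ["a4"]})
shown = run_terminal(server, "s", choose=lambda options: options[1])
print(shown)
```

Passing the user's choice in as a callable keeps the flow testable without a display.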
As shown in fig. 7, a third embodiment of the present description uses a terminal as an execution subject, and a data processing method in this embodiment includes:
S301: receiving source data, and determining target category data associated with the source data;
S303: determining and displaying exploration data according to the target category data;
S305: receiving feedback data for the exploration data, and determining a selected category in the target category data according to the feedback data;
S307: determining and displaying the target data under the selected category.
In this embodiment, the terminal is the execution subject and need not exchange data with the server. For the rules, principles, standards, and the like on which this embodiment relies, and for content not described in detail here, refer to the first embodiment.
As shown in fig. 8, a fourth embodiment of the present specification provides a data processing apparatus including:
a target category determining module 401, configured to receive source data sent by a terminal, and determine target category data associated with the source data;
an exploration module 403, configured to send the target category data to the terminal, so that the terminal determines and displays exploration data according to the target category data;
a feedback module 405, configured to receive feedback data for the exploration data sent by the terminal, and determine a selected category in the target category data according to the feedback data;
a target data determining module 407, configured to determine the target data under the selected category and send it to the terminal.
Optionally, determining the target category data associated with the source data includes:
determining classification number data corresponding to the source data;
and determining target category data associated with the source data according to the classification number data.
Optionally, the classification number data is a Chinese Library Classification (CLC) number, referred to in this specification as the middle map classification number.
Optionally, determining the classification number data corresponding to the source data includes:
after source data are received, determining knowledge points corresponding to the source data, and determining a middle map classification number corresponding to the source data according to the corresponding relation between the knowledge points and the middle map classification numbers.
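The lookup from source data to classification numbers via knowledge points can be sketched as follows. Substring matching is an assumption of this sketch (the specification does not say how knowledge points are matched), and the knowledge points and classification numbers in the table are illustrative placeholders that have not been checked against the actual classification scheme.

```python
def classification_numbers(source, point_to_number):
    """Return the classification numbers of the knowledge points found in `source`."""
    found = []
    for point, number in point_to_number.items():
        # Assumed matching rule: a knowledge point applies if it occurs in the text.
        if point in source and number not in found:
            found.append(number)
    return found


# Hypothetical correspondence between knowledge points and classification numbers.
point_to_number = {
    "neural network": "TP18",
    "text retrieval": "TP391",
}
numbers = classification_numbers("training a neural network", point_to_number)
print(numbers)
```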
Optionally, the apparatus further comprises:
a knowledge point module, configured to construct a knowledge point database and determine the middle map classification number corresponding to each knowledge point.
Optionally, the constructing the knowledge point database includes:
acquiring network data, and processing the network data by using regular expressions and word segmentation;
performing named entity recognition and data cleaning on the processed data to determine knowledge points.
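The regular-expression step of this pipeline can be sketched with the Python standard library. This sketch assumes the network data is HTML-like text; a regular-expression tokenizer stands in for a dedicated word segmentation tool, and named entity recognition is not shown. The function name and the patterns are assumptions made for illustration.

```python
import re

# Assumed pattern: anything between angle brackets is markup to be dropped.
TAG = re.compile(r"<[^>]+>")


def preprocess(raw):
    """Strip markup, normalize whitespace, and split the text into word tokens."""
    text = TAG.sub(" ", raw)
    text = re.sub(r"\s+", " ", text).strip()
    return re.findall(r"\w+", text.lower())


tokens = preprocess("<p>Graph  <b>theory</b> basics</p>")
print(tokens)
```

For Chinese text, the final `re.findall` call would need to be replaced by a real segmenter, since `\w+` matches whole runs of characters rather than words.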
Optionally, the data cleansing includes:
performing one or more of data consistency processing, missing value processing, abnormal value processing, and de-duplication processing on the data.
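The four cleaning operations can be sketched on a list of candidate knowledge points. The concrete rules here (collapsing whitespace for consistency, treating over-long strings as abnormal, case-insensitive de-duplication) and the `max_length` parameter are assumptions chosen for illustration; the specification names the operations but not their rules.

```python
def clean_knowledge_points(raw, max_length=40):
    """Apply consistency, missing-value, abnormal-value, and de-duplication steps."""
    cleaned, seen = [], set()
    for item in raw:
        if item is None:
            continue  # missing value
        text = " ".join(str(item).split())  # consistency: collapse whitespace
        if not text:
            continue  # missing value after normalization
        if len(text) > max_length:
            continue  # abnormal value: implausibly long candidate
        key = text.casefold()
        if key in seen:
            continue  # duplicate, ignoring case
        seen.add(key)
        cleaned.append(text)
    return cleaned


candidates = ["  Graph  theory ", None, "", "graph theory", "x" * 50, "Set theory"]
points = clean_knowledge_points(candidates)
print(points)
```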
Optionally, named entity recognition is performed on the processed data using Stanford NLP.
Optionally, determining the target category data associated with the source data according to the classification number data includes:
determining associated category data of the source data according to the corresponding relation between the classification number and the category data;
determining the degree of association between the source data and the associated category data;
and determining target category data according to the association degree.
Optionally, mutual information between the source data and the associated category data is determined, and the association degree is determined according to the mutual information.
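One way to turn mutual information into an association degree is pointwise mutual information (PMI) computed from co-occurrence counts, sketched below. Using PMI, the count-based estimates, and a top-k selection are assumptions of this sketch; the specification only states that the association degree is determined from mutual information. All names and counts are hypothetical.

```python
import math


def pmi(n_xy, n_x, n_y, n_total):
    """Pointwise mutual information log(p(x, y) / (p(x) * p(y))) from counts."""
    if min(n_xy, n_x, n_y, n_total) <= 0:
        return float("-inf")  # no co-occurrence evidence
    return math.log((n_xy * n_total) / (n_x * n_y))


def target_categories(n_source, category_counts, cooccurrence, n_total, top_k):
    """Rank associated categories by PMI with the source and keep the top_k."""
    scored = [
        (pmi(cooccurrence.get(cat, 0), n_source, n_cat, n_total), cat)
        for cat, n_cat in category_counts.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [cat for _, cat in scored[:top_k]]


# Hypothetical counts over 1000 documents; the source term occurs in 100 of them.
category_counts = {"F1": 50, "F2": 200, "F3": 100}
cooccurrence = {"F1": 20, "F2": 20, "F3": 5}
selected = target_categories(100, category_counts, cooccurrence, 1000, top_k=2)
print(selected)
```

A positive PMI means the source and the category co-occur more often than chance would predict, so a threshold at zero is a natural alternative to the top-k cutoff used here.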
As shown in fig. 9, a fifth embodiment of the present specification provides a data processing apparatus including:
a source data sending module 501, configured to send source data to a server, so that the server determines target category data associated with the source data;
an exploration data module 503, configured to receive the target category data sent by the server, and determine and display exploration data according to the target category data;
a feedback module 505, configured to determine and send feedback data for the exploration data to the server, so that the server determines, according to the feedback data, a selected category in the target category data, and determines target data in the selected category;
a display module 507, configured to receive the target data sent by the server and display the target data.
As shown in fig. 10, a sixth embodiment of the present specification provides a data processing apparatus including:
a target data determining module 601, configured to receive source data and determine target category data associated with the source data;
an exploration module 603, configured to determine and display exploration data according to the target category data;
a feedback module 605, configured to receive feedback data for the exploration data, and determine a selected category in the target category data according to the feedback data;
a target data determination module 607, configured to determine and display the target data under the selected category.
A seventh embodiment of the present specification provides a data processing apparatus including:
at least one processor; and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the data processing method of the first embodiment.
An eighth embodiment of the present specification provides a data processing apparatus including:
at least one processor; and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the data processing method of the second embodiment.
A ninth embodiment of the present specification provides a data processing apparatus including:
at least one processor; and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the data processing method of the third embodiment.
A tenth embodiment of the present specification provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the data processing method of the first embodiment.
An eleventh embodiment of the present specification provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the data processing method of the second embodiment.
A twelfth embodiment of the present specification provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the data processing method in the third embodiment.
The above embodiments may be used in combination.
While certain embodiments of the present disclosure have been described above, other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily have to be in the particular order shown or in sequential order to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus, device, and non-volatile computer-readable storage medium embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and in relation to the description, reference may be made to some portions of the description of the method embodiments.
The apparatus, the device, the nonvolatile computer readable storage medium, and the method provided in the embodiments of the present specification correspond to each other, and therefore, the apparatus, the device, and the nonvolatile computer storage medium also have similar advantageous technical effects to the corresponding method.
In the 1990s, an improvement to a technology could be clearly distinguished as either an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). However, as technology has advanced, many of today's method flow improvements can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Thus, it cannot be said that an improvement to a method flow cannot be realized by hardware physical modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user's programming of the device. A designer programs a digital system onto a single PLD without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Furthermore, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must also be written in a specific programming language, called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used at present.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained simply by programming the method flow into an integrated circuit using one of the hardware description languages described above.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer-readable program code, the same functionality can be implemented by logically programming the method steps so that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may thus be considered a hardware component, and the means included in it for performing the various functions may also be considered structures within the hardware component. Alternatively, the means for performing the various functions may even be regarded as both software modules for performing the method and structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and the units are described separately. Of course, when implementing this specification, the functions of the units may be implemented in one or more pieces of software and/or hardware.
As will be appreciated by one skilled in the art, the present specification embodiments may be provided as a method, system, or computer program product. Accordingly, embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The description has been presented with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the description. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of computer-readable media such as volatile memory, Random Access Memory (RAM), and/or non-volatile memory such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only an example of the present specification, and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (21)

1. A method of data processing, comprising:
receiving source data sent by a terminal, and determining target category data associated with the source data;
sending the target category data to the terminal so that the terminal determines and displays exploration data according to the target category data;
receiving feedback data aiming at the exploration data and sent by a terminal, and determining a selected category in the target category data according to the feedback data;
and determining and sending the target data under the selected category to the terminal.
2. The method of claim 1, wherein determining target category data associated with the source data comprises:
determining classification number data corresponding to the source data;
and determining target category data associated with the source data according to the classification number data.
3. The method of claim 2, wherein the classification number data is a middle map classification number.
4. The method of claim 3, wherein determining classification number data corresponding to the source data comprises:
after source data are received, determining knowledge points corresponding to the source data, and determining a middle map classification number corresponding to the source data according to the corresponding relation between the knowledge points and the middle map classification numbers.
5. The method of claim 4, further comprising:
and constructing a knowledge point database, and determining the middle map classification number corresponding to each knowledge point.
6. The method of claim 5, wherein constructing the knowledge point database comprises:
acquiring network data, and processing the network data by using regular expressions and word segmentation;
and carrying out named entity recognition and data cleaning on the processed data to determine knowledge points.
7. The method of claim 6, wherein the data cleansing comprises:
and carrying out data consistency processing and/or missing value processing and/or abnormal value processing and/or de-duplication processing on the data.
8. The method of claim 6, wherein named entity recognition is performed on the processed data using Stanford NLP.
9. The method of claim 2, wherein determining the target category data associated with the source data according to the classification number data comprises:
determining associated category data of the source data according to the corresponding relation between the classification number and the category data;
determining the degree of association between the source data and the associated category data;
and determining target category data according to the association degree.
10. The method of claim 9, wherein mutual information between the source data and the associated category data is determined, and the association degree is determined according to the mutual information.
11. A method of data processing, comprising:
sending source data to a server, so that the server determines target category data associated with the source data;
receiving the target category data sent by a server, and determining and displaying exploration data according to the target category data;
determining and sending feedback data aiming at the exploration data to the server, so that the server determines a selected category in the target category data according to the feedback data and determines target data under the selected category;
and receiving the target data sent by the server and displaying the target data.
12. A method of data processing, comprising:
receiving source data, and determining target category data associated with the source data;
determining and displaying exploration data according to the target category data;
receiving feedback data aiming at the exploration data, and determining a selected category in the target category data according to the feedback data;
and determining and displaying the target data under the selected category.
13. A data processing apparatus comprising:
the target category determining module is used for receiving source data sent by a terminal and determining target category data associated with the source data;
the exploration module is used for sending the target category data to the terminal so that the terminal can determine and display exploration data according to the target category data;
the feedback module is used for receiving feedback data aiming at the exploration data and sent by a terminal, and determining a selected category in the target category data according to the feedback data;
and the target data determining module is used for determining and sending the target data under the selected category to the terminal.
14. A data processing apparatus comprising:
the source data sending module is used for sending source data to the server so that the server determines target category data associated with the source data;
the exploration data module is used for receiving the target category data sent by the server, and determining and displaying exploration data according to the target category data;
the feedback module is used for determining and sending feedback data aiming at the exploration data to the server, so that the server determines a selected category in the target category data according to the feedback data and determines target data under the selected category;
and the display module is used for receiving the target data sent by the server and displaying the target data.
15. A data processing apparatus comprising:
the target data determining module is used for receiving source data and determining target category data associated with the source data;
the exploration module is used for determining and displaying exploration data according to the target category data;
a feedback module, configured to receive feedback data for the exploration data, and determine a selected category in the target category data according to the feedback data;
and the target data determining module is used for determining and displaying the target data under the selected category.
16. A data processing apparatus comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data processing method of any one of claims 1 to 10.
17. A data processing apparatus comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data processing method of claim 11.
18. A data processing apparatus comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data processing method of claim 12.
19. A computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the data processing method of any one of claims 1 to 10.
20. A computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the data processing method of claim 11.
21. A computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the data processing method of claim 12.
CN202010010403.XA 2020-01-06 2020-01-06 Data processing method, device, equipment and medium Active CN111209277B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010010403.XA CN111209277B (en) 2020-01-06 2020-01-06 Data processing method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010010403.XA CN111209277B (en) 2020-01-06 2020-01-06 Data processing method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN111209277A true CN111209277A (en) 2020-05-29
CN111209277B CN111209277B (en) 2023-11-24

Family

ID=70788605

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010010403.XA Active CN111209277B (en) 2020-01-06 2020-01-06 Data processing method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN111209277B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112000694A (en) * 2020-09-11 2020-11-27 支付宝(杭州)信息技术有限公司 Data acquisition method, device, equipment and medium
CN114021709A (en) * 2021-09-30 2022-02-08 苏州浪潮智能科技有限公司 Multi-FPGA data processing method and device, server and storage medium
CN112000694B (en) * 2020-09-11 2024-04-26 支付宝(杭州)信息技术有限公司 Data acquisition method, device, equipment and medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105739843A (en) * 2014-12-08 2016-07-06 阿里巴巴集团控股有限公司 Information display method and apparatus as well as electronic device
US20170098001A1 (en) * 2015-10-05 2017-04-06 Fujitsu Limited Method of outputting recommended item and recommended item output device
CN108694183A (en) * 2017-04-06 2018-10-23 北京国双科技有限公司 A kind of search method and device
CN108984737A (en) * 2018-07-16 2018-12-11 北京全聘致远科技有限公司 Resume search method and device
CN109271574A (en) * 2018-08-28 2019-01-25 麒麟合盛网络技术股份有限公司 A kind of hot word recommended method and device
CN109558508A (en) * 2018-10-22 2019-04-02 百度在线网络技术(北京)有限公司 Data digging method, device, computer equipment and storage medium
CN109614415A (en) * 2018-09-29 2019-04-12 阿里巴巴集团控股有限公司 A kind of data mining, processing method, device, equipment and medium
CN109740085A (en) * 2019-01-10 2019-05-10 北京字节跳动网络技术有限公司 A kind of methods of exhibiting of content of pages, device, equipment and storage medium
CN109801204A (en) * 2018-08-07 2019-05-24 福州米鱼信息科技有限公司 A kind of personal academic service system and its implementation
CN109871483A (en) * 2019-01-22 2019-06-11 珠海天燕科技有限公司 A kind of determination method and device of recommendation information

Non-Patent Citations (2)

Title
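詹萌: "Analysis of data characteristics and extension of retrieval methods for library bibliographic databases in China", no. 07 *
郝玫;王道平;奚: "Research on bibliographic recommendation services for university libraries based on the Chinese Library Classification", no. 12 *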

Cited By (4)

Publication number Priority date Publication date Assignee Title
CN112000694A (en) * 2020-09-11 2020-11-27 支付宝(杭州)信息技术有限公司 Data acquisition method, device, equipment and medium
CN112000694B (en) * 2020-09-11 2024-04-26 支付宝(杭州)信息技术有限公司 Data acquisition method, device, equipment and medium
CN114021709A (en) * 2021-09-30 2022-02-08 苏州浪潮智能科技有限公司 Multi-FPGA data processing method and device, server and storage medium
CN114021709B (en) * 2021-09-30 2024-01-23 苏州浪潮智能科技有限公司 Multi-FPGA data processing method and device, server and storage medium

Also Published As

Publication number Publication date
CN111209277B (en) 2023-11-24

Similar Documents

Publication Publication Date Title
CN108537568B (en) Information recommendation method and device
TW201942826A (en) Payment mode recommendation method and device and equipment
CN110162796B (en) News thematic creation method and device
CN111898643B (en) Semantic matching method and device
CN110765247A (en) Input prompting method and device for question-answering robot
CN109214193B (en) Data encryption and machine learning model training method and device and electronic equipment
CN105824830A (en) Page displaying method, client and equipment
CN117235226A (en) Question response method and device based on large language model
CN113221555A (en) Keyword identification method, device and equipment based on multitask model
CN111046304B (en) Data searching method and device
CN113887235A (en) Information recommendation method and device
CN109656946A (en) Multi-table relational query method, device and equipment
CN111209277B (en) Data processing method, device, equipment and medium
CN112231531A (en) Data display method, equipment and medium based on openstb
CN113887234B (en) Model training and recommending method and device
US20170116317A1 (en) Requesting enrichment for document corpora
CN112182116A (en) Data probing method and device
CN112287130A (en) Searching method, device and equipment for graphic questions
CN112685553A (en) Method, device, equipment and medium for searching and replacing online document
CN117271611B (en) Information retrieval method, device and equipment based on large model
CN113342840A (en) Data determination method, device, equipment and medium
CN113360715A (en) Data determination method, device, equipment and medium
CN117331561B (en) Intelligent low-code page development system and method
CN117494068B (en) Network public opinion analysis method and device combining deep learning and causal inference
CN115017915B (en) Model training and task execution method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 301, 3rd Floor, Building 1, No. 1 Xidawang Road, Chaoyang District, Beijing, 100000

Applicant after: Beijing Mind Creation Information Technology Co.,Ltd.

Address before: Room 2802, 24/F, Building 4, 89 Jianguo Road, Chaoyang District, Beijing

Applicant before: Beijing Mind Creation Information Technology Co.,Ltd.

GR01 Patent grant