CN111209277B - Data processing method, device, equipment and medium - Google Patents

Data processing method, device, equipment and medium

Info

Publication number
CN111209277B
Authority
CN
China
Prior art keywords
data
target
determining
class
category
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010010403.XA
Other languages
Chinese (zh)
Other versions
CN111209277A (en
Inventor
杨溥
罗西琳
张聪聪
侯成龙
叶晓波
吕金龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mind Creation Information Technology Co ltd
Original Assignee
Beijing Mind Creation Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mind Creation Information Technology Co ltd
Priority to CN202010010403.XA
Publication of CN111209277A
Application granted
Publication of CN111209277B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/24 Querying
    • G06F16/245 Query processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of this specification disclose a data processing method, apparatus, device, and medium. The data processing method comprises: receiving source data sent by a terminal, and determining target category data associated with the source data; sending the target category data to the terminal, so that the terminal determines and displays exploration data according to the target category data; receiving feedback data for the exploration data sent by the terminal, and determining a selected category in the target category data according to the feedback data; and determining target data under the selected category and sending the target data to the terminal.

Description

Data processing method, device, equipment and medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method, apparatus, device, and medium.
Background
In the prior art, a user can obtain corresponding information by inputting keywords, but the obtained information is limited to a certain field or discipline, and the scope is narrow.
In view of this, there is a need for more effective and efficient data processing schemes.
Disclosure of Invention
Embodiments of the present disclosure provide a data processing method, apparatus, device, and medium, so as to solve the technical problem of how to perform data processing more effectively and efficiently.
In order to solve the above technical problems, the embodiments of the present specification are implemented as follows:
An embodiment of this specification provides a first data processing method, comprising:
receiving source data sent by a terminal, and determining target category data associated with the source data;
sending the target category data to the terminal, so that the terminal determines and displays exploration data according to the target category data;
receiving feedback data for the exploration data sent by the terminal, and determining a selected category in the target category data according to the feedback data; and
determining target data under the selected category and sending the target data to the terminal.
An embodiment of this specification provides a second data processing method, comprising:
sending source data to a server, so that the server determines target category data associated with the source data;
receiving the target category data sent by the server, and determining and displaying exploration data according to the target category data;
determining feedback data for the exploration data and sending the feedback data to the server, so that the server determines a selected category in the target category data according to the feedback data and determines target data under the selected category; and
receiving the target data sent by the server and displaying the target data.
An embodiment of this specification provides a third data processing method, comprising:
receiving source data, and determining target category data associated with the source data;
determining and displaying exploration data according to the target category data;
receiving feedback data for the exploration data, and determining a selected category in the target category data according to the feedback data; and
determining and displaying target data under the selected category.
An embodiment of this specification provides a data processing apparatus, comprising:
a target category determining module, configured to receive source data sent by a terminal and determine target category data associated with the source data;
an exploration module, configured to send the target category data to the terminal, so that the terminal determines and displays exploration data according to the target category data;
a feedback module, configured to receive feedback data for the exploration data sent by the terminal, and determine a selected category in the target category data according to the feedback data; and
a target data determining module, configured to determine target data under the selected category and send the target data to the terminal.
An embodiment of this specification provides a data processing apparatus, comprising:
a source data sending module, configured to send source data to a server, so that the server determines target category data associated with the source data;
an exploration data module, configured to receive the target category data sent by the server, and determine and display exploration data according to the target category data;
a feedback module, configured to determine feedback data for the exploration data and send the feedback data to the server, so that the server determines a selected category in the target category data according to the feedback data and determines target data under the selected category; and
a display module, configured to receive the target data sent by the server and display the target data.
An embodiment of this specification provides a data processing apparatus, comprising:
a target category determining module, configured to receive source data and determine target category data associated with the source data;
an exploration module, configured to determine and display exploration data according to the target category data;
a feedback module, configured to receive feedback data for the exploration data, and determine a selected category in the target category data according to the feedback data; and
a target data determining module, configured to determine and display target data under the selected category.
An embodiment of this specification provides a data processing device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor, the instructions enabling the at least one processor to perform the first data processing method described above.
An embodiment of this specification provides a data processing device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor, the instructions enabling the at least one processor to perform the second data processing method described above.
An embodiment of this specification provides a data processing device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor, the instructions enabling the at least one processor to perform the third data processing method described above.
An embodiment of this specification provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the first data processing method described above.
An embodiment of this specification provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the second data processing method described above.
An embodiment of this specification provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the third data processing method described above.
At least one of the technical solutions adopted in the embodiments of this specification can achieve the following beneficial effects:
Target categories associated with the source data are determined, and exploration data is determined according to those target categories, so the exploration data is not limited to a single category. Every category associated with the source data can be explored, which improves the comprehensiveness and richness of data processing. The categories the user is interested in can be determined from the feedback data, and the target data is then determined accordingly, so that the target data is determined more comprehensively and accurately.
Drawings
To illustrate the technical solutions in the embodiments of this specification or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some of the embodiments described in this specification; a person of ordinary skill in the art may derive other drawings from them without inventive effort.
Fig. 1 is a flow chart of a data processing method in a first embodiment of the present specification.
Fig. 2 is a schematic diagram showing the execution of the data processing method in the first embodiment of the present specification.
Fig. 3 is a schematic view of a page in the first embodiment of the present specification.
Fig. 4 is a schematic view of another page in the first embodiment of the present specification.
Fig. 5 is a schematic view of yet another page in the first embodiment of the present specification.
Fig. 6 is a flow chart of a data processing method in the second embodiment of the present specification.
Fig. 7 is a flowchart of a data processing method in the third embodiment of the present specification.
Fig. 8 is a schematic structural view of a data processing apparatus in a fourth embodiment of the present specification.
Fig. 9 is a schematic structural view of a data processing apparatus in a fifth embodiment of the present specification.
Fig. 10 is a schematic structural view of a data processing apparatus in a sixth embodiment of the present specification.
Detailed Description
To help those skilled in the art better understand the technical solutions in this specification, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without inventive effort shall fall within the scope of protection of the present application.
As shown in Fig. 1, a first embodiment of this specification provides a data processing method. The execution subject of this embodiment may be a computer, a server, or a corresponding data processing system; that is, the execution subject may vary and may be set or changed according to the actual situation. A third-party application may also assist the execution subject. For example, the data processing method in this embodiment may be executed by a server, with a corresponding application installed on a terminal held by the user (including but not limited to a mobile phone or a computer). Data can be transmitted between the server and the terminal, and pages or information can be presented to the user, and data input and output performed, through the terminal or the application, as shown in Fig. 2.
The data processing method provided by this embodiment comprises the following steps:
S101: Receive source data sent by a terminal, and determine target category data associated with the source data.
In this embodiment, the terminal may be provided with an application client for the user to operate. The client corresponds to the server and communicates with it, so that the client and the server can transmit data to, or exchange data with, each other. Data sent from the client to the server (messages, information, applications, requests, instructions, and the like may all be regarded as data) may be triggered by a user operation on the client, in which case it can be regarded as expressing the user's intent; it may also be initiated automatically by the client. In this embodiment, the terminal includes, but is not limited to, the user's mobile phone, computer, or tablet. As described above, the execution subject of this embodiment may be a server, and the terminal may exchange data with the server.
The terminal (or client; the same applies below) may present a page, which may be an application page, on which the user selects or inputs text; the options on the page may themselves be text, as shown in Fig. 3. After the user selects or inputs text, the terminal may take that text as the source data, or may first apply necessary processing (for example, encryption) and take the processed data as the source data.
After the terminal determines the source data, it may send the source data to the server automatically. The server receives the source data and determines the target category data associated with it.
Determining the target category data associated with the source data comprises:
S1011: Determine the classification number data (or classification numbers) corresponding to the source data.
The classification numbers may be user-defined; in particular, the classification number data may be Chinese Library Classification (CLC) numbers. A CLC number is the classification code obtained by performing subject analysis on scientific and technical literature using the Chinese Library Classification and classifying each document according to the disciplinary attributes and characteristics of its content.
In this embodiment, a knowledge point database may be constructed as follows:
acquiring network data (that is, publicly available Internet data; neither its content nor its source is restricted), and processing the network data using regular expressions and jieba word segmentation;
performing named entity recognition and data cleaning on the processed data to determine knowledge points. Named entity recognition may be performed on the processed data with Stanford NLP.
The data cleaning may include one or more of data consistency processing, missing value processing, outlier processing, and de-duplication processing, where:
(1) Data consistency processing: rules are used to convert between simplified and traditional Chinese characters, letter case, and symbols so that the data is consistent. For example, record A {'publishDate': '2019-10-10'} and record B {'publishDate': '2019/10/10:20:12:10'} are both converted to '2019-10-10' using the to_datetime method.
(2) Missing value processing: missing values are filled using the fillna method of pandas. For example, when missing fields of economics books are filled with specified values by default, the missing ratio of each field is first checked with a lambda function: {'abstract': 0.1667, 'publishDate': 0.3333, 'classification': 0.1667}. A missing abstract is filled with the first 100 characters of the introduction, a missing publishDate is filled with '0000-00-00', and a missing classification is filled with 'economic'.
(3) Outlier processing: outliers are detected using fast clustering. For example, the input data set S contains N records, [{'ISBN': '9787513334181', 'title': "parent's story", 'publishDate': '2018-12-12'}, {'ISBN': '9787513334181', 'title': "parent's story", 'publishDate': '2010-10-10'}, {'ISBN': '9787513334181', 'title': "parent's story", 'publishDate': '2018-12-12'}, ...], where each record is a data point. The isolated point {'ISBN': '9787513334181', 'title': "parent's story", 'publishDate': '2010-10-10'} is noise data whose noisy attribute is publishDate, and the publishDate is removed using a capping method combined with a quantile method.
(4) De-duplication processing: a context vector is obtained for each knowledge point using a pre-trained BERT model, and for a given knowledge point A the most similar knowledge point B is found using a locality-sensitive hashing algorithm together with Euclidean distance. If the distance between the two is below a set threshold, they are regarded as the same knowledge point. For example, from the original text 'most Internet companies are free; the free mode has passed its bonus period; if the marginal cost can approach zero indefinitely, the free mode can be adopted', the knowledge points 'free mode' and 'marginal cost' are obtained after processing.
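Cleaning steps (1) to (3) can be sketched as follows. This is a minimal illustration using only the Python standard library rather than the pandas calls named in the text; the function names, the list of accepted date formats, and the choice to cap publication years at the first and third quartiles are all assumptions made for this example.

```python
from datetime import datetime

# Assumed input formats; the second one matches the '2019/10/10:20:12:10' example.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d:%H:%M:%S", "%Y/%m/%d")

def normalize_date(value):
    """Return `value` as 'YYYY-MM-DD', or None if it cannot be parsed."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except (ValueError, TypeError):
            continue
    return None

def missing_ratio(records, fields):
    """Fraction of records in which each field is absent or empty."""
    return {f: sum(1 for r in records if not r.get(f)) / len(records)
            for f in fields}

def fill_missing(record, introduction=""):
    """Fill missing fields with the specified values described in the text."""
    filled = dict(record)
    if not filled.get("abstract"):
        filled["abstract"] = introduction[:100]  # first 100 characters
    if not filled.get("publishDate"):
        filled["publishDate"] = "0000-00-00"
    if not filled.get("classification"):
        filled["classification"] = "economic"
    return filled

def quantile(sorted_values, q):
    """Linear-interpolation quantile of an ascending list, 0 <= q <= 1."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)

def cap_outliers(values, lower_q=0.25, upper_q=0.75):
    """Clamp every value into the [lower_q, upper_q] quantile range (capping)."""
    ordered = sorted(values)
    low, high = quantile(ordered, lower_q), quantile(ordered, upper_q)
    return [min(max(v, low), high) for v in values]
```

For instance, `cap_outliers([2018, 2018, 2010, 2018, 2018])` pulls the isolated year 2010 back to 2018, which mirrors the noisy publishDate in the example above. Clamping the value is only one reading of "capping"; the text does not specify what replaces the noisy attribute.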
Through the above process, knowledge points such as 'Matthew effect' and 'electronic' can be determined from publicly available network data, and a knowledge point database (that is, a knowledge point base) can be constructed. Building the database in this way gives broad coverage of knowledge points and can include the latest ones, so the database is more up to date.
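The threshold rule in de-duplication step (4) can be illustrated with a small sketch. It assumes the context vectors have already been computed; here they are hand-written toy vectors, not BERT output, and the nearest neighbour is found by brute force instead of locality-sensitive hashing. The function names and the threshold value are hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def deduplicate(vectors, threshold):
    """Keep a knowledge point only if no already-kept point lies within `threshold`.

    `vectors` maps a knowledge point to its (assumed precomputed) context vector.
    """
    kept = {}
    for name, vec in vectors.items():
        if all(euclidean(vec, kept_vec) >= threshold for kept_vec in kept.values()):
            kept[name] = vec
    return list(kept)

# Toy vectors for illustration only.
toy_vectors = {
    "free mode": [1.0, 0.0],
    "free model": [1.0, 0.05],   # near-duplicate of "free mode"
    "marginal cost": [0.0, 1.0],
}
unique_points = deduplicate(toy_vectors, threshold=0.1)
```

With these toy values, "free model" is merged into "free mode" because the two vectors are 0.05 apart. Brute force compares every pair, which is why an approximate index such as locality-sensitive hashing is attractive for a large knowledge point base.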
Note that the options on the terminal's selection page may be knowledge points, for example popular knowledge points, so that the user can select a knowledge point directly; the source data may then be the selected knowledge point itself or be generated from it. Fig. 3 shows a number of knowledge points available for selection.
In this embodiment, the Chinese Library Classification (CLC) number corresponding to each knowledge point may be determined, and one knowledge point may correspond to one or more CLC numbers. For example, if a knowledge point appears in several books, each of which has a CLC number, then the CLC number shared by the most of those books (or several numbers, if they are tied or are the top-ranked ones) may be taken as the CLC number or numbers of the knowledge point. Because CLC numbers cover all disciplines, using them as the classification number data in this embodiment means that most knowledge points can be assigned a corresponding classification number.
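One way to read the rule above is a majority vote over the classification numbers of the books that mention a knowledge point, keeping ties. The sketch below assumes that reading; the function name, the `top_n` parameter, and the sample codes are illustrative only.

```python
from collections import Counter

def classification_numbers_for(book_numbers, top_n=1):
    """Return the classification numbers shared by the most books.

    `book_numbers` holds one classification number per book that mentions the
    knowledge point. Numbers tied with the `top_n`-th most frequent are kept.
    """
    ranked = Counter(book_numbers).most_common()
    if not ranked:
        return []
    cutoff = ranked[min(top_n, len(ranked)) - 1][1]
    return [number for number, count in ranked if count >= cutoff]
```

For example, `classification_numbers_for(["F0", "F0", "F4", "C93"])` yields `["F0"]`, while two books with different numbers yield both numbers because they are tied.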
In this embodiment, determining the classification number data corresponding to the source data may include the following.
After the server receives the source data, it may determine the knowledge points corresponding to the source data, and then determine the Chinese Library Classification (CLC) numbers corresponding to the source data from the correspondence between knowledge points and CLC numbers. To determine the corresponding knowledge points, the source data may be matched against the knowledge points using a distance algorithm, with a matching condition or threshold set so that the knowledge points that match the source data are taken as the knowledge points corresponding to it. The CLC numbers of those matched knowledge points can then be used as the CLC numbers corresponding to the source data. More than one knowledge point may correspond to the source data.
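The text does not name a specific distance algorithm. As a sketch under that caveat, the example below uses Levenshtein edit distance with a maximum-distance threshold to match source text to knowledge points and then collects their classification numbers. The function names, the threshold, and the sample mapping are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings `a` and `b` (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def match_knowledge_points(source, knowledge_points, max_distance=1):
    """Knowledge points whose edit distance to `source` is within the threshold."""
    return [kp for kp in knowledge_points
            if edit_distance(source, kp) <= max_distance]

def classification_numbers_for_source(source, kp_to_numbers, max_distance=1):
    """Union (in first-seen order) of the numbers of all matched knowledge points."""
    numbers = []
    for kp in match_knowledge_points(source, kp_to_numbers, max_distance):
        for number in kp_to_numbers[kp]:
            if number not in numbers:
                numbers.append(number)
    return numbers

# Illustrative mapping only; the codes are not taken from the source.
sample_mapping = {
    "marginal cost": ["F0"],
    "marginal costs": ["F0", "F27"],
    "Matthew effect": ["C91"],
}
matched_numbers = classification_numbers_for_source("marginal cost", sample_mapping)
```

Here two knowledge points fall within the threshold, so the source data inherits the classification numbers of both, consistent with the remark that several knowledge points may correspond to one piece of source data.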
S1013: Determine the target category data associated with the source data according to the classification number data.
Each Chinese Library Classification (CLC) number has a name, such as 'E Military' or 'F Economy', and lower-level numbers have finer-grained names; for example, under 'F Economy' there are 'F3 Agricultural economy', 'F4 Industrial economy', and so on. After the knowledge point database and the correspondence between knowledge points and CLC numbers have been established, the number of knowledge points under each CLC number may differ from level to level. For example, 'F Economy' may have hundreds of thousands of knowledge points, 'F3 Agricultural economy' forty thousand, 'F30 Agricultural economy theory' tens of thousands, 'F4 Industrial economy' fifty thousand, 'F41 World industrial economy' thirty thousand, and so on.
In this embodiment, the name of a CLC number may be used directly as a category name, or it may be used after suitable processing (for example, shortening names that are long), so that a correspondence between classification numbers and categories is established.
Further, if a classification number has few knowledge points (a limit can be set), the category need not be determined from the name of that classification number. For example, if 'F011 Objects and methods of economics' has few knowledge points, the category may be determined from the higher-level 'F01 Basic problems of economics' instead. This reduces the number of categories and ensures that each category has more knowledge points, which improves the accuracy of each category.
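The fallback to a higher-level classification number can be sketched as below. The sketch assumes that the parent of a classification number is obtained by dropping its last character, which holds for the F011 to F01 example in the text but is a simplification of real classification notation; the function name, the counts, and the limit are hypothetical.

```python
def category_code(number, point_counts, min_points):
    """Walk up the hierarchy until a level has at least `min_points` knowledge points.

    `point_counts` maps a classification number to its knowledge point count.
    The parent is assumed to be the number with its last character removed.
    """
    code = number
    while len(code) > 1 and point_counts.get(code, 0) < min_points:
        code = code[:-1]
    return code

# Illustrative counts only.
sample_counts = {"F": 300000, "F0": 60000, "F01": 5000, "F011": 12, "F30": 20000}
```

With a limit of 100, `category_code("F011", sample_counts, 100)` falls back to `"F01"`, while `"F30"` is kept as its own category.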
After receiving the source data, the server can determine the classification number data corresponding to the source data from the correspondence between the source data and knowledge points and the correspondence between knowledge points and classification numbers. It can then determine the categories corresponding to the source data from the correspondence between classification number data and categories, and take those categories as the associated category data (or associated categories) of the source data, that is, the categories associated with the source data. Because the knowledge points correspond to the source data, the associated categories of the source data can also be regarded as the associated categories of the knowledge points corresponding to the source data.
After the associated categories of the source data are determined, the degree of association between the source data and each associated category can be determined, and the target category data determined according to that degree of association. Specifically, the mutual information between the source data and each associated category can be determined, and the degree of association determined from the mutual information.
In this embodiment, the mutual information between the knowledge point corresponding to the source data and each associated category may be used as the mutual information between the source data and that category. For example, suppose the knowledge point corresponding to the source data is knowledge point A, and the associated categories of the source data (or of knowledge point A) are $F_1, F_2, \ldots, F_n$. The mutual information between knowledge point A and these categories is $\log\{P(A \mid F_1)/P(A)\}, \log\{P(A \mid F_2)/P(A)\}, \ldots, \log\{P(A \mid F_n)/P(A)\}$, respectively. Because $P(A)$ is the same in every term, comparing the mutual information of knowledge point A with categories $F_1, F_2, \ldots, F_n$ is equivalent to comparing $P(A \mid F_1), P(A \mid F_2), \ldots, P(A \mid F_n)$, where $P(A \mid F_i) = P(A, F_i)/P(F_i)$ for $i = 1, 2, \ldots, n$, and $P(A, F_i)$ and $P(F_i)$ are estimated from occurrence frequencies. In this way the K categories most related to knowledge point A, that is, the K categories most related to the source data (Top-K, $K \le n$), can be determined and used as the target category data associated with the source data.
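The ranking described above can be sketched from raw occurrence counts. The sketch assumes that counts of co-occurrence and of each category are available; the function names and the sample numbers are illustrative. It also shows that ranking by the mutual-information term and ranking by the conditional probability give the same order, since P(A) is constant across categories.

```python
import math

def pmi(joint_count, category_count, a_count, total):
    """log{ P(A|F) / P(A) } estimated from occurrence frequencies."""
    return math.log((joint_count / category_count) / (a_count / total))

def top_k_categories(joint_counts, category_counts, k):
    """Rank categories by P(A|F_i) = count(A, F_i) / count(F_i) and keep the top K."""
    conditional = {f: joint_counts[f] / category_counts[f]
                   for f in joint_counts if category_counts.get(f)}
    return sorted(conditional, key=conditional.get, reverse=True)[:k]

# Illustrative counts only: co-occurrences of knowledge point A with each category,
# and total occurrences of each category.
joint = {"F0": 30, "F4": 10, "C91": 5}
totals = {"F0": 100, "F4": 200, "C91": 10}
target_categories = top_k_categories(joint, totals, k=2)
```

With these counts the conditional probabilities are 0.3, 0.05, and 0.5, so the two target categories are C91 and F0. Note that a small category with few occurrences can rank highly under this estimate, which is one reason a minimum count per category may be useful.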
S103: Send the target category data to the terminal, so that the terminal determines and displays exploration data according to the target category data.
After determining the target category data, the server may send it to the terminal. After receiving the target category data, the terminal can determine the data to be displayed from it, for example by using the target category data directly or after suitable processing such as format conversion, and then display that data.
In this embodiment, the target categories determined from the source data are not limited to a single category. The target categories are selected from the categories corresponding to Chinese Library Classification (CLC) numbers; because CLC numbers have broad coverage, the category data also covers a wide range of knowledge points, so the target categories are not confined to a single category, discipline, or field, and neither is the data displayed on their basis. Going from the source data to the displayed data amounts to exploring every category, discipline, or field on the basis of the source data, so the displayed data may be called exploration data. If there are several target categories, there may be several pieces of exploration data, in one-to-one correspondence with the target categories. As shown in Fig. 4, related fields 1 to 6 may be the exploration data, and the keyword may be the one selected or input by the user.
S105: Receive feedback data for the exploration data sent by the terminal, and determine a selected category in the target category data according to the feedback data.
The terminal displays the exploration data for the user to choose from, and each piece of exploration data can serve as an option. By clicking or selecting a piece of exploration data, the user gives feedback on it. The terminal receives this feedback, forms feedback data, and sends it to the server; the server receives the feedback data and determines the selected category in the target category data accordingly. For example, if the target category data contains m categories, there are m pieces of exploration data; when the user clicks one of them, the feedback data records the exploration data the user selected and thereby the corresponding target category, which is the selected category.
S107: Determine target data under the selected category and send the target data to the terminal.
After determining the selected category, the server may determine the target data under it; the target data may be the knowledge points under the selected category that correspond to the source data. Specifically, a TF-IDF (term frequency-inverse document frequency) algorithm may be used to find the knowledge points that are mentioned most often (a limit can be set) in each target category. For example, suppose categories F1, F2, ..., Fq are the target categories and the knowledge points mentioned most often in them include a1, a2, ..., aj. If category Fq is the selected category, the knowledge points among a1, a2, ..., aj that belong to category Fq may be determined as the target data. The other keywords in Fig. 5 may be the target data.
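A possible shape for the TF-IDF step is sketched below. It assumes each target category is treated as one document made up of knowledge point mentions, and it uses a smoothed inverse document frequency, log((1 + N) / (1 + df)) + 1, which is one common variant and not necessarily the one intended by the text. The function names, the limit, and the sample data are hypothetical.

```python
import math
from collections import Counter

def tfidf_by_category(mentions):
    """TF-IDF score of every knowledge point within every category.

    `mentions` maps a category to the list of knowledge point mentions seen in it.
    """
    n_categories = len(mentions)
    doc_freq = Counter(kp for kps in mentions.values() for kp in set(kps))
    scores = {}
    for category, kps in mentions.items():
        term_freq = Counter(kps)
        scores[category] = {
            kp: (count / len(kps))
                * (math.log((1 + n_categories) / (1 + doc_freq[kp])) + 1)
            for kp, count in term_freq.items()
        }
    return scores

def target_data(mentions, selected_category, limit):
    """Knowledge points in the selected category scoring at least `limit`, best first."""
    scores = tfidf_by_category(mentions)[selected_category]
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [kp for kp in ranked if scores[kp] >= limit]

# Illustrative mentions only.
sample_mentions = {
    "F0": ["marginal cost", "marginal cost", "free mode", "price"],
    "F4": ["price", "supply chain"],
}
selected_points = target_data(sample_mentions, "F0", limit=0.3)
```

In this toy data, "price" appears in both categories and therefore scores lowest in F0, so only "marginal cost" and "free mode" pass the limit.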
In this embodiment, data exploration starts from the source data, so every category associated with the source data is explored and the resulting target category data has broad coverage; more comprehensive and richer exploration data can therefore be shown to the user. The target data is determined from the feedback data, so that, on the basis of this broad exploration, the target data is located precisely, which makes it both comprehensive and accurate. The data processing in this embodiment proceeds from the source data (which reflects the user's interest), to the exploration data (which provides comprehensiveness), to the feedback data (which again reflects the user's interest), and finally to the target data (which provides accuracy). This improves the comprehensiveness and richness of data processing and makes the determination of the target data more comprehensive and accurate.
The first embodiment of the present disclosure provides a data processing method from the perspective of a server, and the second embodiment of the present disclosure provides a data processing method from the perspective of a terminal, as shown in fig. 6, where the data processing method in the present embodiment includes:
S201: transmitting source data to a server, so that the server determines target class data associated with the source data;
S203: receiving the target class data sent by the server, and determining and displaying exploration data according to the target class data;
S205: determining feedback data for the exploration data and transmitting the feedback data to the server, so that the server determines a selected category in the target category data according to the feedback data, and determines target data under the selected category;
S207: receiving the target data sent by the server and displaying the target data.
For details of this embodiment, reference may be made to the first embodiment.
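The exchange between terminal and server in steps S201 to S207 can be sketched as follows. This is only an illustration under stated assumptions: the `Server` and `Terminal` classes, their method names, the shape of the feedback data, and the sample categories are all hypothetical, the server is an in-process stand-in rather than a networked service, and the exploration data is taken to be the target category data itself.

```python
class Server:
    """In-process stand-in for the server of the first embodiment."""

    def __init__(self, categories_by_source, data_by_category):
        # Hypothetical lookup tables standing in for the server's real logic.
        self.categories_by_source = categories_by_source
        self.data_by_category = data_by_category

    def target_categories(self, source_data):
        # Determine target category data associated with the source data.
        return self.categories_by_source.get(source_data, [])

    def target_data(self, target_categories, feedback):
        # The feedback data records which piece of exploration data was chosen.
        selected = target_categories[feedback["selected_index"]]
        return selected, self.data_by_category.get(selected, [])


class Terminal:
    """Terminal-side flow of the second embodiment."""

    def __init__(self, server):
        self.server = server

    def run(self, source_data, choose):
        # S201: send source data to the server.
        categories = self.server.target_categories(source_data)
        # S203: here the exploration data is simply the target category data.
        exploration = list(categories)
        # S205: form feedback data from the user's selection and send it.
        feedback = {"selected_index": choose(exploration)}
        selected, data = self.server.target_data(categories, feedback)
        # S207: receive the target data (returned here instead of displayed).
        return selected, data


# Hypothetical data; `choose` simulates the user clicking the second option.
server = Server(
    {"neural network": ["Computer science", "Neuroscience"]},
    {"Neuroscience": ["neuron", "synapse"]},
)
terminal = Terminal(server)
selected, data = terminal.run("neural network", choose=lambda options: 1)
```

Passing the user's choice in as a callable keeps the sketch free of any user-interface code; a real terminal would obtain it from a click or selection event.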
In the first embodiment of the present disclosure, the server serves as the execution body, whereas in the third embodiment of the present disclosure the terminal serves as the execution body. As shown in fig. 7, the data processing method in this embodiment includes:
S301: receiving source data, and determining target class data associated with the source data;
S303: determining and displaying exploration data according to the target class data;
S305: receiving feedback data for the exploration data, and determining a selected category in the target category data according to the feedback data;
S307: determining and displaying target data under the selected category.
In this embodiment, the terminal serves as the execution body and need not interact with the server. For the rules, principles, standards, and other details not described in this embodiment, reference may be made to the first embodiment.
As shown in fig. 8, a fourth embodiment of the present specification provides a data processing apparatus including:
a target class determining module 401, configured to receive source data sent by a terminal, and determine target class data associated with the source data;
the exploration module 403 is configured to send the target class data to the terminal, so that the terminal determines and displays exploration data according to the target class data;
a feedback module 405, configured to receive feedback data for the exploration data sent by the terminal, and determine a selected category in the target category data according to the feedback data;
the target data determining module 407 is configured to determine and send target data under the selected category to the terminal.
Optionally, determining the target class data associated with the source data includes:
determining classification number data corresponding to the source data;
and determining target category data associated with the source data according to the classification number data.
Optionally, the classification number data is a Chinese Library Classification (CLC) number (the "middle graph classification number").
Optionally, determining the class number data corresponding to the source data includes:
after receiving source data, determining knowledge points corresponding to the source data, and determining the class numbers of the middle graphs corresponding to the source data according to the corresponding relation between the knowledge points and the class numbers of the middle graphs.
Optionally, the apparatus further includes:
the knowledge point module is used for constructing a knowledge point database and determining the class number of the middle graph corresponding to each knowledge point.
Optionally, constructing the knowledge point database includes:
acquiring network data, and processing the network data by using a regular expression and jieba word segmentation;
and carrying out named entity recognition and data cleaning on the processed data, and determining knowledge points.
Optionally, performing data cleansing on the data includes:
and carrying out data consistency processing and/or missing value processing and/or outlier processing and/or deduplication processing on the data.
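The four cleaning operations listed above can be illustrated with a short Python sketch. The record layout (a candidate knowledge-point name paired with a mention count), the function name, and the count bounds used for outlier handling are assumptions made for illustration; the specification does not define how each operation is carried out.

```python
def clean_records(records, min_count=1, max_count=10_000):
    """Clean (name, mention_count) records of candidate knowledge points."""
    cleaned = []
    seen = set()
    for name, count in records:
        # Missing-value processing: skip records lacking a name or a count.
        if name is None or count is None:
            continue
        # Consistency processing: collapse whitespace and lower-case the name.
        name = " ".join(name.split()).lower()
        if not name:
            continue
        # Outlier processing: discard counts outside the assumed valid range.
        if not min_count <= count <= max_count:
            continue
        # Deduplication: keep only the first record for each name.
        if name in seen:
            continue
        seen.add(name)
        cleaned.append((name, count))
    return cleaned


raw = [
    ("Machine  Learning", 12),
    ("machine learning", 7),   # duplicate after normalisation
    (None, 3),                 # missing name
    ("Graph Theory", None),    # missing count
    ("Topology", 5_000_000),   # implausibly large count
    ("  Topology ", 4),
]
cleaned = clean_records(raw)
```

Because the operations are joined by "and/or" in the text, any subset may be applied; in this sketch each one is a separate check that can be removed independently.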
Optionally, named entity recognition is performed on the processed data through Stanford NLP.
Optionally, determining the target category data associated with the source data according to the category number data includes:
determining associated category data with the source data according to the corresponding relation between the classification number and the category data;
determining the association degree of the source data and the association category data;
and determining target category data according to the association degree.
Optionally, mutual information of the source data and the association category data is determined, and the association degree is determined according to the mutual information.
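One way to turn mutual information into an association degree is sketched below. The specification does not give the formula, so this example assumes pointwise mutual information (PMI) computed from document co-occurrence counts, and it assumes that target categories are the associated categories whose score exceeds a threshold. All function names, counts, and the threshold are hypothetical.

```python
import math


def pmi(n_both, n_source, n_category, n_docs):
    """Pointwise mutual information, in bits, from document counts."""
    if n_both == 0:
        return float("-inf")
    p_both = n_both / n_docs
    p_source = n_source / n_docs
    p_category = n_category / n_docs
    return math.log2(p_both / (p_source * p_category))


def select_target_categories(cooccurrence, n_source, n_docs, threshold=0.0):
    """Keep associated categories whose PMI with the source exceeds threshold.

    cooccurrence: dict mapping category ->
                  (documents containing both, documents in the category).
    """
    scored = {
        category: pmi(n_both, n_source, n_category, n_docs)
        for category, (n_both, n_category) in cooccurrence.items()
    }
    # Highest association degree first; ties broken alphabetically.
    ranked = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))
    return [category for category, score in ranked if score > threshold]


# Hypothetical counts: 1000 documents, 100 of which mention the source data.
cooccurrence = {
    "F1": (40, 200),   # PMI is about  1.0
    "F2": (5, 100),    # PMI is about -1.0
    "Fq": (50, 125),   # PMI is about  2.0
}
targets = select_target_categories(cooccurrence, n_source=100, n_docs=1000)
```

A positive score means the source data and the category co-occur more often than independence would predict, which is why zero is used as the default threshold here.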
As shown in fig. 9, a fifth embodiment of the present specification provides a data processing apparatus including:
a source data sending module 501, configured to send source data to a server, so that the server determines target class data associated with the source data;
the exploration data module 503 is configured to receive the target class data sent by the server, and determine and display exploration data according to the target class data;
a feedback module 505, configured to determine and send feedback data for the exploration data to the server, so that the server determines a selected category in the target category data according to the feedback data, and determines target data under the selected category;
and the display module 507 is configured to receive the target data sent by the server, and display the target data.
As shown in fig. 10, a sixth embodiment of the present specification provides a data processing apparatus including:
a target data determining module 601, configured to receive source data and determine target class data associated with the source data;
the exploration module 603 is configured to determine and display exploration data according to the target category data;
a feedback module 605, configured to receive feedback data for the exploration data, and determine a selected category in the target category data according to the feedback data;
the target data determining module 607 is configured to determine and display target data under the selected category.
A seventh embodiment of the present specification provides a data processing apparatus including:
at least one processor; and a memory communicatively coupled to the at least one processor;
wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data processing method of the first embodiment.
An eighth embodiment of the present specification provides a data processing apparatus including:
at least one processor; and a memory communicatively coupled to the at least one processor;
wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data processing method of the second embodiment.
A ninth embodiment of the present specification provides a data processing apparatus including:
at least one processor; and a memory communicatively coupled to the at least one processor;
wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data processing method of the third embodiment.
A tenth embodiment of the present specification provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the data processing method in the first embodiment.
An eleventh embodiment of the present specification provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the data processing method in the second embodiment.
A twelfth embodiment of the present specification provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the data processing method in the third embodiment.
The above embodiments may be used in combination.
The foregoing describes certain embodiments of the present disclosure, other embodiments being within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. Furthermore, the processes depicted in the accompanying drawings do not necessarily have to be in the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, the embodiments are described in a progressive manner; for parts that are identical or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus, device, and non-transitory computer-readable storage medium embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.
The apparatus, the device, the nonvolatile computer readable storage medium and the method provided in the embodiments of the present disclosure correspond to each other, and therefore, the apparatus, the device, and the nonvolatile computer storage medium also have similar advantageous technical effects as those of the corresponding method, and since the advantageous technical effects of the method have been described in detail above, the advantageous technical effects of the corresponding apparatus, device, and nonvolatile computer storage medium are not described herein again.
In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (for example, an improvement to a circuit structure such as a diode, a transistor, or a switch) or a software improvement (an improvement to a method flow). With the development of technology, however, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs a digital system to "integrate" it onto a PLD, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must also be written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used at present.
It will also be apparent to those skilled in the art that a hardware circuit implementing the logical method flow can readily be obtained by describing the method flow with a small amount of logic programming in one of the hardware description languages above and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, or the form of logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer-readable program code, it is entirely possible to logically program the method steps so that the controller achieves the same functionality in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing various functions may also be regarded as structures within the hardware component. Or, the means for performing the various functions may even be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present specification.
It will be appreciated by those skilled in the art that the present description may be provided as a method, system, or computer program product. Accordingly, the present specification embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description embodiments may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," and any other variations thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, the embodiments are described in a progressive manner; for parts that are identical or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.
The foregoing description is by way of example only and is not intended as limiting the application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (20)

1. A data processing method, comprising:
receiving source data sent by a terminal, and determining target class data associated with the source data; the determining target class data associated with the source data includes: determining knowledge points corresponding to the source data, determining a middle graph class number corresponding to the source data according to the corresponding relation between the knowledge points and the middle graph class number, and determining target class data associated with the source data according to the middle graph class number;
sending the target class data to the terminal, so that the terminal determines and displays exploration data according to the target class data; the exploration data is the target class data or data obtained after the target class data is processed;
receiving feedback data aiming at the exploration data and sent by a terminal, and determining a selected category in the target category data according to the feedback data;
and determining and sending the target data under the selected category to the terminal.
2. The method of claim 1, wherein determining target category data associated with the source data comprises:
determining classification number data corresponding to the source data;
and determining target category data associated with the source data according to the classification number data.
3. The method of claim 2, wherein the classification number data is a Chinese Library Classification (CLC) number (middle graph classification number).
4. The method of claim 1, the method further comprising:
and constructing a knowledge point database, and determining the class number of the middle graph corresponding to each knowledge point.
5. The method of claim 4, constructing a knowledge point database comprising:
acquiring network data, and processing the network data by using a regular expression and jieba word segmentation;
and carrying out named entity recognition and data cleaning on the processed data, and determining knowledge points.
6. The method of claim 5, wherein the data cleansing comprises:
and carrying out data consistency processing and/or missing value processing and/or outlier processing and/or deduplication processing on the data.
7. The method of claim 5, wherein named entity recognition is performed on the processed data by Stanford NLP.
8. The method of claim 2, wherein determining the target class data associated with the source data from the class number data comprises:
determining associated category data with the source data according to the corresponding relation between the classification number and the category data;
determining the association degree of the source data and the association category data;
and determining target category data according to the association degree.
9. The method of claim 8, wherein mutual information of the source data and the association category data is determined, and the association degree is determined based on the mutual information.
10. A data processing method, comprising:
transmitting source data to a server to enable the server to determine target class data associated with the source data; the determining target class data associated with the source data includes: determining knowledge points corresponding to the source data, determining a middle graph class number corresponding to the source data according to the corresponding relation between the knowledge points and the middle graph class number, and determining target class data associated with the source data according to the middle graph class number;
receiving the target class data sent by the server, and determining and displaying exploration data according to the target class data; the exploration data is the target class data or data obtained after the target class data is processed;
determining and transmitting feedback data for the exploration data to the server, so that the server determines a selected category in the target category data according to the feedback data, and determines target data under the selected category;
and receiving the target data sent by the server and displaying the target data.
11. A data processing method, comprising:
receiving source data, and determining target class data associated with the source data; the determining target class data associated with the source data includes: determining knowledge points corresponding to the source data, determining a middle graph class number corresponding to the source data according to the corresponding relation between the knowledge points and the middle graph class number, and determining target class data associated with the source data according to the middle graph class number;
determining and displaying exploration data according to the target class data; the exploring data is the target class data or the data obtained after the target class data is processed;
receiving feedback data for the exploration data, and determining a selected category in the target category data according to the feedback data;
determining and displaying target data under the selected category.
12. A data processing apparatus comprising:
a target category determining module, configured to receive source data sent by a terminal and determine target category data associated with the source data; the determining target class data associated with the source data includes: determining knowledge points corresponding to the source data, determining a middle graph class number corresponding to the source data according to the corresponding relation between the knowledge points and the middle graph class number, and determining target class data associated with the source data according to the middle graph class number;
the exploration module is used for sending the target category data to the terminal so that the terminal can determine and display exploration data according to the target category data; the exploring data is the target class data or the data obtained after the target class data is processed;
the feedback module is used for receiving feedback data aiming at the exploration data and sent by the terminal, and determining a selected category in the target category data according to the feedback data;
and the target data determining module is used for determining and sending the target data under the selected category to the terminal.
13. A data processing apparatus comprising:
a source data sending module, configured to send source data to a server, so that the server determines target category data associated with the source data; the determining target class data associated with the source data includes: determining knowledge points corresponding to the source data, determining a middle graph class number corresponding to the source data according to the corresponding relation between the knowledge points and the middle graph class number, and determining target class data associated with the source data according to the middle graph class number;
the exploration data module is used for receiving the target category data sent by the server, and determining and displaying exploration data according to the target category data; the exploring data is the target class data or the data obtained after the target class data is processed;
the feedback module is used for determining and sending feedback data aiming at the exploration data to the server so that the server determines a selected category in the target category data according to the feedback data and determines target data under the selected category;
and the display module is used for receiving the target data sent by the server and displaying the target data.
14. A data processing apparatus comprising:
a target data determining module, configured to receive source data and determine target category data associated with the source data; the determining target class data associated with the source data includes: determining knowledge points corresponding to the source data, determining a middle graph class number corresponding to the source data according to the corresponding relation between the knowledge points and the middle graph class number, and determining target class data associated with the source data according to the middle graph class number;
the exploration module is used for determining and displaying exploration data according to the target category data;
the feedback module is used for receiving feedback data aiming at the exploration data and determining a selected category in the target category data according to the feedback data; the exploring data is the target class data or the data obtained after the target class data is processed;
and the target data determining module is used for determining and displaying the target data under the selected category.
15. A data processing apparatus comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data processing method of any one of claims 1 to 9.
16. A data processing apparatus comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data processing method of claim 10.
17. A data processing apparatus comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data processing method of claim 11.
18. A computer readable storage medium storing computer executable instructions which when executed by a processor implement the data processing method of any one of claims 1 to 9.
19. A computer readable storage medium storing computer executable instructions which when executed by a processor implement the data processing method of claim 10.
20. A computer readable storage medium storing computer executable instructions which when executed by a processor implement the data processing method of claim 11.
CN202010010403.XA 2020-01-06 2020-01-06 Data processing method, device, equipment and medium Active CN111209277B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010010403.XA CN111209277B (en) 2020-01-06 2020-01-06 Data processing method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN111209277A CN111209277A (en) 2020-05-29
CN111209277B true CN111209277B (en) 2023-11-24

Family

ID=70788605

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010010403.XA Active CN111209277B (en) 2020-01-06 2020-01-06 Data processing method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN111209277B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112000694B (en) * 2020-09-11 2024-04-26 支付宝(杭州)信息技术有限公司 Data acquisition method, device, equipment and medium
CN114021709B (en) * 2021-09-30 2024-01-23 苏州浪潮智能科技有限公司 Multi-FPGA data processing method and device, server and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105739843A (en) * 2014-12-08 2016-07-06 阿里巴巴集团控股有限公司 Information display method and apparatus as well as electronic device
CN108694183A (en) * 2017-04-06 2018-10-23 北京国双科技有限公司 Search method and device
CN108984737A (en) * 2018-07-16 2018-12-11 北京全聘致远科技有限公司 Resume search method and device
CN109271574A (en) * 2018-08-28 2019-01-25 麒麟合盛网络技术股份有限公司 Hot word recommendation method and device
CN109558508A (en) * 2018-10-22 2019-04-02 百度在线网络技术(北京)有限公司 Data mining method, device, computer equipment and storage medium
CN109614415A (en) * 2018-09-29 2019-04-12 阿里巴巴集团控股有限公司 Data mining and processing method, device, equipment and medium
CN109740085A (en) * 2019-01-10 2019-05-10 北京字节跳动网络技术有限公司 Page content display method, device, equipment and storage medium
CN109801204A (en) * 2018-08-07 2019-05-24 福州米鱼信息科技有限公司 Personal academic service system and its implementation
CN109871483A (en) * 2019-01-22 2019-06-11 珠海天燕科技有限公司 Method and device for determining recommendation information

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10691762B2 (en) * 2015-10-05 2020-06-23 Fujitsu Limited Method of outputting recommended item and recommended item output device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
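Zhan Meng (詹萌). Analysis of the data characteristics of library bibliographic databases in China and research on extending retrieval methods. Library and Information Service (图书情报工作). 2006, (07), full text. *
Hao Mei (郝玫); Wang Daoping (王道平); 奚. Research on bibliographic recommendation services in university libraries based on the Chinese Library Classification. Information Studies: Theory & Application (情报理论与实践). 2012, (12), full text. *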

Also Published As

Publication number Publication date
CN111209277A (en) 2020-05-29

Similar Documents

Publication Publication Date Title
CN110162796B (en) News thematic creation method and device
CN110457578B (en) Customer service demand identification method and device
US10956469B2 (en) System and method for metadata correlation using natural language processing
CN111209277B (en) Data processing method, device, equipment and medium
CN117075882A (en) Data display method, device, equipment and medium
CN112015569B (en) Message reminding processing method and device
CN111191132B (en) Information recommendation method and device and electronic equipment
CN111046304B (en) Data searching method and device
CN110659406B (en) Searching method and device
CN112182116B (en) Data exploration method and device
CN111967269B (en) Business risk identification method and device and electronic equipment
CN109584088B (en) Product information pushing method and device
CN116662657A (en) Model training and information recommending method, device, storage medium and equipment
US11775493B2 (en) Information retrieval system
CN111752431A (en) Information display method and device
CN110598133A (en) Method, apparatus, electronic device, and computer-readable storage medium for determining an order of search items
CN113312484B (en) Object tag processing method and device
CN117494068B (en) Network public opinion analysis method and device combining deep learning and causal inference
CN117271611B (en) Information retrieval method, device and equipment based on large model
CN117035695B (en) Information early warning method and device, readable storage medium and electronic equipment
CN112214666B (en) Information pushing method, device and system
CN117931672A (en) Query processing method and device applied to code change
CN116595969A (en) Text generation method and device, storage medium and electronic equipment
CN118709766A (en) Remote sensing question answer generation method, device, medium and equipment
CN118378006A (en) Evaluation method, device and equipment of traceability data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 301, 3rd Floor, Building 1, No. 1 Xidawang Road, Chaoyang District, Beijing, 100000

Applicant after: Beijing Mind Creation Information Technology Co.,Ltd.

Address before: Room 2802, 24 / F, building 4, 89 Jianguo Road, Chaoyang District, Beijing

Applicant before: Beijing Mind Creation Information Technology Co.,Ltd.

GR01 Patent grant