CN106682044B - Data processing method and device - Google Patents

Info

Publication number
CN106682044B
Authority
CN
China
Prior art keywords
data
target data
target
screening
website
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510767682.3A
Other languages
Chinese (zh)
Other versions
CN106682044A (en
Inventor
刘嘉
钦滨杰
陈晓敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd filed Critical Beijing Gridsum Technology Co Ltd
Priority to CN201510767682.3A priority Critical patent/CN106682044B/en
Publication of CN106682044A publication Critical patent/CN106682044A/en
Application granted granted Critical
Publication of CN106682044B publication Critical patent/CN106682044B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/953 - Querying, e.g. by the use of web search engines
    • G06F16/9535 - Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method and device, relates to the technical field of internet, and mainly aims to reduce the occupied time of screening data and improve the accuracy of screening data. The main technical scheme of the invention comprises the following steps: extracting target data from data to be processed; wherein the target data comprises data attribute values; caching the target data in a preset favorite; in response to a data screening instruction, screening target data in the preset favorites according to the data attribute values to obtain screened target data; and displaying the screened target data. The invention is mainly applied to the data screening process.

Description

Data processing method and device
Technical Field
The present invention relates to the field of internet technologies, and in particular, to a data processing method and apparatus.
Background
With the rapid growth of networks, the world wide web has become the carrier of large amounts of data, and how to efficiently extract and utilize such data has become a significant challenge. Screening out valid data from mass data is one implementation way of effectively utilizing internet data.
Generally, data screening proceeds as follows. A data source, usually a web page of a website, is selected according to the actual data requirements; a crawler program crawls the data in that source; and the crawled data is stored in a database for later use. When screening is needed, the data in the database is retrieved and screened, and the screened data is compiled into a data report so that the mass data can be used effectively.
When the inventors screened data in the above manner, they found that the following problems existed: when the data in the database is screened, all the data in the database needs to be screened in sequence, if the data volume in the database is large, a large amount of data screening time is occupied in the data screening process, and the accuracy of the screened data is low; meanwhile, if the screening process is interrupted in the process of screening data based on the database, all data in the database needs to be screened again, and the screened data cannot be reserved before interruption, so that too much time is consumed in screening the data.
Disclosure of Invention
In view of this, the present invention provides a method and an apparatus for data processing, which mainly aims to reduce the occupied time of screening data and improve the accuracy of screening data.
In order to solve the above problems, the present invention mainly provides the following technical solutions:
in one aspect, the present invention provides a method for data processing, including:
extracting target data from data to be processed; wherein the target data comprises data attribute values;
caching the target data in a preset favorite;
in response to a data screening instruction, screening target data in the preset favorites according to the data attribute values to obtain screened target data;
and displaying the screened target data.
In another aspect, the present invention further provides a data processing apparatus, including:
an extraction unit for extracting target data from data to be processed; wherein the target data comprises data attribute values;
the cache unit is used for caching the target data extracted by the extraction unit in a preset favorite;
the screening unit is used for responding to a data screening instruction and screening the target data cached in the preset favorites by the cache unit according to the data attribute value to obtain the screened target data;
and the display unit is used for displaying the target data screened by the screening unit.
By the technical scheme, the technical scheme provided by the invention at least has the following advantages:
The data processing method and device provided by the invention first extract target data, which includes a data attribute value, from the data to be processed and cache the extracted target data in a preset favorite. In response to a data screening instruction, the target data in the preset favorite is screened according to its data attribute value, and the screened target data is then displayed. Compared with the prior art, in which screening is performed directly on the data in the preset database, the invention caches the target data extracted from the data to be processed in the preset favorite, which reduces the volume of data to be screened and therefore the time occupied by screening. In addition, according to the invention, the data volume of the target data is inversely related to the accuracy of the screened target data: the smaller the data volume, the higher the accuracy. Because the preset favorite holds a smaller volume of data than the preset database, the accuracy of the screened target data is improved.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flow chart illustrating a method of data processing provided by an embodiment of the present invention;
FIG. 2 is a block diagram illustrating components of an apparatus for data processing according to an embodiment of the present invention;
FIG. 3 is a block diagram illustrating another data processing apparatus according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
An embodiment of the present invention provides a data processing method, as shown in fig. 1, the method includes:
101. Extract target data from the data to be processed.
In the embodiment of the invention, before screening target data, firstly, data of a corresponding webpage in an internet target website is obtained, and the obtained data to be processed is stored in a preset database so as to extract the target data from the preset database; when data of corresponding web pages in an internet target website is acquired, determining which website contents corresponding to the web pages need to be acquired according to different types of data to be processed, wherein the data types of the data to be processed can be as follows: economy class data, video class data, technology class data, and the like. The embodiment of the invention does not limit the types of the screened data, the specific contents of the target website and the like.
Generally, the data to be processed is stored in a preset database. When that data needs to be screened, target data is extracted from it, and the target data includes a data attribute value. The data attribute value indicates the data type of the target data; for example, automobile data, military data, and science data can be distinguished from one another by their data attribute values.
As another implementation manner of the embodiment of the present invention, the target data further includes a data state identifier. If an interruption occurs while target data is being extracted from the data to be processed, the data state identifier is added at the interruption node in the data to be processed, so that extraction can resume from the data state identifier rather than start over. This saves time in extracting the target data and, in turn, reduces the overall time needed to screen the target data.
In a specific implementation of the embodiment of the present invention, the target data is extracted from the data to be processed based on a preset screening condition, which is set manually. The preset screening condition must correspond to the screening condition used to acquire the data to be processed from the target website: it may be set to be identical to that condition, or its screening range may be set narrower than the range of that condition.
For example, if the filtering condition for obtaining the data to be processed from the target website is the economic data, the preset filtering condition may be set as stock, securities, finance, and the like; the embodiment of the invention does not limit the setting of the preset screening conditions, but sets the preset screening conditions according to the actual requirement of extracting the target data.
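The extraction in step 101 can be illustrated with a minimal Python sketch. The patent does not specify a record layout or a matching rule, so the `Record` fields, the function `extract_target_data`, and the keyword matching below are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One item of data to be processed (hypothetical layout)."""
    source: str    # website address the item was crawled from
    category: str  # data attribute value, e.g. "economy"
    text: str      # crawled content


def extract_target_data(records, category, keywords):
    """Keep records of `category` whose text mentions any keyword.

    `category` plays the role of the condition used when crawling, and
    `keywords` the narrower preset screening condition (an assumption).
    """
    lowered = [k.lower() for k in keywords]
    return [
        r for r in records
        if r.category == category and any(k in r.text.lower() for k in lowered)
    ]


pending = [
    Record("a.example", "economy", "Stock indexes closed higher"),
    Record("b.example", "economy", "Regional harvest report"),
    Record("c.example", "video", "Stock footage collection"),
]
targets = extract_target_data(pending, "economy", ["stock", "securities", "finance"])
print([r.source for r in targets])  # ['a.example']
```

Only the first record is kept: the second is economic data that matches none of the narrower keywords, and the third matches a keyword but has a different data attribute value.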
102. Cache the target data in a preset favorite.
The preset database in step 101 stores the data to be processed, but that data is of many types and covers a wide range. To narrow the range of data that has to be screened and to improve the accuracy of screening the target data, the extracted target data is cached in a preset favorite. The preset favorite is used for storing the target data, and the data volume of the target data in the preset favorite is smaller than that of the data to be processed in the preset database.
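As a rough sketch of step 102, the preset favorite can be modelled as a small in-memory container that holds only the extracted target data. The class name `Favorites`, its `cache` method, and the dictionary rows are hypothetical; the patent does not say how the favorite is implemented.

```python
class Favorites:
    """In-memory stand-in for the preset favorite (hypothetical)."""

    def __init__(self):
        self._items = []

    def cache(self, items):
        """Append items in arrival order, skipping ones already cached."""
        for item in items:
            if item not in self._items:
                self._items.append(item)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


# Stand-in for the preset database holding all data to be processed.
database = [
    {"category": "economy", "title": "Stock update"},
    {"category": "video", "title": "Trailer"},
    {"category": "economy", "title": "Bond yields"},
    {"category": "technology", "title": "Chip launch"},
]

favorites = Favorites()
favorites.cache(row for row in database if row["category"] == "economy")
print(len(database), len(favorites))  # 4 2
```

Later screening runs iterate over the two cached rows instead of the four database rows, which is the reduction in data volume the description relies on.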
103. In response to a data screening instruction, screen the target data in the preset favorite according to the data attribute value.
The data screening instruction triggers screening of the target data in the preset favorite, and the screening is performed according to the data attribute values. Because the data volume of the target data in the preset favorite is smaller than that of the data to be processed in the preset database, screening in the preset favorite saves screening time. In addition, when a screening result is unsatisfactory, the target data can be screened again from the preset favorite; since the data volume there is small, the accuracy of screening the target data can be improved.
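Step 103 amounts to filtering the cached items on their data attribute value. The sketch below assumes each cached item is a dictionary with an `"attribute"` key; that layout and the function name `screen` are illustrative only.

```python
def screen(favorites, attribute_value):
    """Return the cached items whose data attribute value matches."""
    return [item for item in favorites if item["attribute"] == attribute_value]


favorites = [
    {"attribute": "automobile", "title": "New sedan review"},
    {"attribute": "military", "title": "Naval exercise"},
    {"attribute": "automobile", "title": "EV charging rollout"},
]

screened = screen(favorites, "automobile")
print([item["title"] for item in screened])

# If the result is unsatisfactory, the same cache can be screened again
# with a different attribute value without going back to the database.
rescreened = screen(favorites, "military")
print([item["title"] for item in rescreened])
```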
104. Display the screened target data.
The screened target data is displayed so that it can be viewed and used.
As an implementation manner of the embodiment of the present invention, when displaying the screened target data, classifying the screened target data, and outputting and displaying the target data in a category form; as another implementation manner of the embodiment of the present invention, the filtered target data is summarized, and the summarized target data is output and displayed. The embodiment of the invention does not limit the specific form of the displayed and screened target data.
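The two display options mentioned above, output by category and output as a summary, might look as follows. The item layout and the helper names `group_by_category` and `summarize` are assumptions; the patent leaves the display form open.

```python
from collections import Counter, defaultdict


def group_by_category(items):
    """Group item titles under their data attribute value."""
    groups = defaultdict(list)
    for item in items:
        groups[item["attribute"]].append(item["title"])
    return dict(groups)


def summarize(items):
    """Count how many screened items fall under each attribute value."""
    return Counter(item["attribute"] for item in items)


screened = [
    {"attribute": "stock", "title": "Index closes higher"},
    {"attribute": "finance", "title": "Rate decision"},
    {"attribute": "stock", "title": "Earnings preview"},
]

# Display option 1: one line per category.
for category, titles in group_by_category(screened).items():
    print(f"{category}: {', '.join(titles)}")

# Display option 2: a summary of counts per category.
print(dict(summarize(screened)))
```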
The data processing method provided by the embodiment of the invention comprises the steps of firstly extracting target data from data to be processed, caching the extracted target data in a preset favorite, responding to a data screening instruction, screening the target data in the preset favorite according to the data attribute value of the target data, obtaining the screened target data, and displaying the screened target data.
Specifically, compared with the prior art that the data to be screened is directly screened from the preset database, the embodiment of the invention can cache the target data extracted from the data to be screened in the preset favorite so as to reduce the data volume of the data to be screened, thereby reducing the occupied time for screening the target data; meanwhile, the data volume of the target data is inversely proportional to the accuracy of the screened target data, namely the smaller the data volume of the target data is, the higher the accuracy of the screened target data is, and the smaller the data volume of the target data in the preset favorite is, so that the accuracy of the screened target data is improved.
It should be noted that step 103 provided by the present invention can directly use the data attribute value in the target data to perform the first screening; or after the first screening, determining the influence of the target website by using the attribute value of the target data, and then screening the target data for the second time according to the influence of the target website; obviously, according to the scheme, the influence of the target website can be determined through the attribute value of the target data, and then the determined influence of the target website is utilized to screen the target data, which is not limited in any way.
Further, as a refinement and an extension of the above embodiment, when the step 103 is executed to filter the target data in the preset favorites according to the data attribute values, the following manner may be adopted:
First, the website influence of the target website is acquired according to the data attribute value of the target data; then, the target data in the preset favorite is classified and identified using the website influence; finally, the target data in the preset favorite is screened according to the classification identifier. In the embodiment of the invention, the target website is the data source from which the target data is obtained. The website influence is determined from the attribution identifier of the target website, the ranking of the target website, and the attention paid to the target website, where the attention, given as a preset attention of mainstream media, is determined by a preset website access volume and a preset website access ranking.
As an implementation manner of the embodiment of the invention, the target data in the preset favorite is stored according to the classification identifier. As another implementation manner, the target data in the preset favorite is only labelled with a classification identifier according to the website influence and is not stored according to that identifier; when the screened data is output and displayed, however, it is displayed according to the classification identifier.
For more clearly explaining the classification and identification of the target data in the preset favorites according to the website influence of the target website, the following description is provided in an exemplary manner.
Illustratively, as shown in table 1, table 1 shows a schematic diagram of preset favorite storage target data provided by an embodiment of the present invention. The data source shown in table 1 is the website address of the target website, and the influence of the corresponding website is sequentially weakened, so that when the target data is output and displayed, the display can be performed based on the magnitude of the influence of the website. Table 1 is an exemplary example, and the specific form of the preset favorite storage target data is not limited in the embodiment of the present invention.
TABLE 1
[Table 1 appears as an image (BDA0000844533130000061) in the original publication and is not reproduced here.]
It should be noted that when target data in a preset favorite is classified and identified according to influence of a website, the larger the influence of the website is, the higher the authority of the website is, which indicates that the target data obtained from the website is more representative, and the higher the utilization value of the target data is; the smaller the influence of the website is, the lower the authority of the website is, and the smaller the utilization value of the target data acquired from the website is.
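The patent names three inputs to website influence (attribution identifier, ranking, and attention) but gives no formula. The sketch below is therefore purely hypothetical: the `Site` fields, the weights in `influence`, the thresholds in `classify`, and the labels "high", "medium", and "low" are all invented to show how a classification identifier could be derived and used for ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    url: str
    is_official: bool  # stand-in for the attribution identifier (assumption)
    rank: int          # 1 is the best-ranked site
    attention: float   # 0.0 to 1.0, stand-in for the attention degree


def influence(site):
    """Hypothetical score in [0, 1]; the weights are illustrative only."""
    attribution = 1.0 if site.is_official else 0.0
    ranking = 1.0 / site.rank
    return 0.3 * attribution + 0.3 * ranking + 0.4 * site.attention


def classify(site, high=0.7, low=0.4):
    """Map the score to a classification identifier (labels are assumptions)."""
    score = influence(site)
    if score >= high:
        return "high"
    if score >= low:
        return "medium"
    return "low"


sites = [
    Site("news.example", True, 1, 0.9),
    Site("blog.example", False, 20, 0.5),
    Site("forum.example", False, 2, 0.8),
]

# List sources from strongest to weakest influence.
for site in sorted(sites, key=influence, reverse=True):
    print(site.url, classify(site))
```

Sorting by the score reproduces the idea of listing data sources in order of decreasing influence, with more authoritative sources shown first.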
In the embodiment of the invention, classifying and identifying the target data in the preset favorite allows the target data to be screened more accurately. The classification identifier can indicate, for example, the importance of the target data or its data category, so that the screened target data can be displayed according to the classification identifier when it is output.
In practical applications, the target data in the preset favorite may also be classified and identified based on the experience of the user. The classification identifiers may include, but are not limited to: important, more important, deletable, and the like. Because users differ in experience, the classification identifiers assigned in this way may differ from one user to another. The embodiment of the present invention does not limit this.
Further, if an interruption occurs while the target data in the preset favorite is being screened, a data state identifier is added at the interruption node corresponding to the interruption, so that screening of the target data in the preset favorite can continue from the data state identifier.
For example, the target data in a preset favorite is usually stored in sequence. After a data state identifier has been added at an interruption node, and before the target data in the preset favorite is screened according to the data attribute value, it is first detected whether a data state identifier exists in the preset favorite. If one exists, screening continues from the data state identifier instead of restarting from the beginning of the preset favorite, which saves the time occupied by screening the target data. If no data state identifier exists, the target data is screened from the beginning of the preset favorite.
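The resume behaviour can be sketched as follows, under the assumption that the data state identifier is simply the index of the next unscreened item and that results kept before the interruption are stored alongside it. The `state` dictionary, the `Interrupted` exception, and the `fail_at` parameter (used only to simulate an interruption) are hypothetical.

```python
class Interrupted(Exception):
    """Raised to simulate an interruption during screening."""


def screen_with_resume(items, wanted, state, fail_at=None):
    """Screen `items` in order, resuming from state["marker"] if present.

    `state` may hold "marker" (index of the next unscreened item, standing
    in for the data state identifier) and "kept" (results so far).
    """
    start = state.get("marker", 0)  # no marker: start from the beginning
    kept = state.setdefault("kept", [])
    for index in range(start, len(items)):
        if index == fail_at:
            state["marker"] = index  # record the interruption node
            raise Interrupted(index)
        if items[index]["attribute"] == wanted:
            kept.append(items[index])
    state.pop("marker", None)  # finished: clear the identifier
    return kept


items = [
    {"id": i, "attribute": a}
    for i, a in enumerate(["car", "tech", "car", "car"])
]
state = {}
try:
    screen_with_resume(items, "car", state, fail_at=2)
except Interrupted:
    pass  # items 0 and 1 are already screened; the marker points at item 2

result = screen_with_resume(items, "car", state)
print([r["id"] for r in result])  # [0, 2, 3]
```

The second call starts at item 2 rather than item 0, and the match found before the interruption is retained.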
Further, when the screened target data is displayed, the screened target data is displayed according to the classification identifier, so that the user can effectively utilize the screened target data according to the classification identifier.
Further, before the target data is extracted from the data to be processed, the data to be processed is obtained from the target website by a crawler program and stored in a preset database, so that the target data can be extracted from the data to be processed in the preset database. In the embodiment of the present invention, obtaining the data to be processed from the target website by a crawler program may be implemented in, but is not limited to, the following manners: the crawler program acquires the data to be processed from the target website in a depth-first manner; or the crawler program acquires the data to be processed from the target website in a breadth-first or best-first manner. The embodiment of the invention does not limit the specific manner in which the crawler program acquires the data to be processed from the target website.
Further, as an implementation of the method shown in FIG. 1, another embodiment of the present invention provides a data processing apparatus. This apparatus embodiment corresponds to the foregoing method embodiment; for ease of reading, details of the method embodiment are not repeated here, but it should be clear that the apparatus in this embodiment can implement everything in the foregoing method embodiment. As shown in FIG. 2, the apparatus includes:
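The difference between depth-first and breadth-first crawling is only the order in which discovered pages are visited. The sketch below shows that order on a small in-memory link graph rather than a real website; the `LINKS` dictionary and the function `crawl_order` are made up for illustration, and no network access is performed. Best-first crawling, which would need a page scoring function, is not shown.

```python
from collections import deque

# Hypothetical site structure: page -> pages it links to.
LINKS = {
    "home": ["news", "finance"],
    "news": ["news/1", "news/2"],
    "finance": ["finance/1"],
    "news/1": [],
    "news/2": [],
    "finance/1": [],
}


def crawl_order(start, links, depth_first):
    """Return the visiting order for a depth-first or breadth-first crawl."""
    frontier = deque([start])
    seen = set()
    order = []
    while frontier:
        # A stack (pop from the right) gives depth-first order;
        # a queue (pop from the left) gives breadth-first order.
        page = frontier.pop() if depth_first else frontier.popleft()
        if page in seen:
            continue
        seen.add(page)
        order.append(page)
        children = links.get(page, [])
        # Push in reverse for the stack so the first link is visited first.
        for child in (reversed(children) if depth_first else children):
            if child not in seen:
                frontier.append(child)
    return order


print(crawl_order("home", LINKS, depth_first=True))
print(crawl_order("home", LINKS, depth_first=False))
```

Depth-first follows the "news" branch to its end before touching "finance", while breadth-first visits both section pages before any article page.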
an extraction unit 21 for extracting target data from the data to be processed; wherein the target data comprises data attribute values;
a cache unit 22, configured to cache the target data extracted by the extraction unit 21 in a preset favorite;
the screening unit 23 is configured to, in response to a data screening instruction, screen the target data cached in the preset favorite by the caching unit 22 according to the data attribute value to obtain screened target data;
and a display unit 24, configured to display the target data screened by the screening unit 23.
Further, as shown in fig. 3, the screening unit 23 includes:
an obtaining module 231, configured to obtain, according to the data attribute value, a website influence of the target website; the target website is a source website of the target data, and the influence of the website is determined according to the attribution identification of the target website, the ranking of the target website and the attention degree of the target website;
a classification module 232, configured to perform classification and identification on the target data in the preset favorite by using the website influence obtained by the obtaining module 231;
a screening module 233, configured to screen the target data in the preset favorites according to the classification identifier of the classification module 232.
Further, as shown in fig. 3, the screening unit 23 further includes:
an adding module 234, configured to, when an interruption occurs in a process of screening target data in the preset favorite according to the data attribute value, add a data state identifier at an interruption node corresponding to the interruption, so as to continue to screen the target data in the preset favorite according to the data state identifier.
Further, as shown in fig. 3, the display unit 24 is further configured to display the filtered target data according to the classification identifier of the target data in the filtering unit 23.
Further, as shown in fig. 3, the apparatus further includes:
an acquisition unit 25 configured to acquire the data to be processed based on a crawler before the extraction unit 21 extracts target data from the data to be processed;
a storage unit 26, configured to store the data to be processed in a preset database after the data to be processed is acquired by the acquisition unit 25.
The data processing device provided by the embodiment of the invention firstly extracts target data from data to be processed, wherein the target data comprises a data attribute value, caches the extracted target data in a preset favorite, responds to a data screening instruction, screens the target data in the preset favorite according to the data attribute value of the target data, obtains the screened target data, and then displays the screened target data; compared with the prior art that the data to be screened is directly screened from the preset database, the embodiment of the invention can cache the target data extracted from the data to be screened in the preset favorite so as to reduce the data volume of the data to be screened, thereby reducing the occupied time for screening the target data; meanwhile, the data volume of the target data is inversely proportional to the accuracy of the screened target data, namely the smaller the data volume of the target data is, the higher the accuracy of the screened target data is, and the smaller the data volume of the target data in the preset favorite is, so that the accuracy of the screened target data is improved.
The data processing device comprises a processor and a memory, wherein the extraction unit, the cache unit, the screening unit, the display unit and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize the corresponding functions.
The processor comprises a kernel, and the kernel calls the corresponding program unit from the memory. The kernel can be set to be one or more than one, and the occupied time of screening data is reduced and the accuracy of screening data is improved by adjusting kernel parameters.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or nonvolatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
The present application further provides a computer program product adapted to perform program code for initializing the following method steps when executed on a data processing device: extracting target data from data to be processed; wherein the target data comprises data attribute values; caching the target data in a preset favorite; in response to a data screening instruction, screening target data in the preset favorites according to the data attribute values to obtain screened target data; and displaying the screened target data.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (8)

1. A method of data processing, comprising:
extracting target data from data to be processed; wherein the target data comprises data attribute values;
caching the target data in a preset favorite;
in response to a data screening instruction, screening the target data in the preset favorite according to the data attribute values to obtain screened target data; wherein, if an interruption occurs while screening the target data in the preset favorite according to the data attribute values, a data state identifier is added at the interruption node corresponding to the interruption, so that screening of the target data in the preset favorite continues according to the data state identifier;
displaying the screened target data;
wherein screening the target data in the preset favorite according to the data attribute values comprises:
acquiring the website influence of a target website according to the data attribute values; wherein the target website is the source website of the target data, and the website influence is determined according to the attribution identification of the target website, the ranking of the target website and the attention degree of the target website;
marking the target data in the preset favorite with a classification identification by using the website influence;
and screening the target data in the preset favorite according to the classification identification.
2. The method of claim 1, wherein displaying the screened target data comprises:
and displaying the screened target data according to the classification identification of the target data.
3. The method of claim 2, wherein prior to extracting target data from the data to be processed, the method further comprises:
and acquiring the data to be processed based on a crawler program, and storing the data to be processed in a preset database.
4. An apparatus for data processing, comprising:
an extraction unit for extracting target data from data to be processed; wherein the target data comprises data attribute values;
the cache unit is used for caching the target data extracted by the extraction unit in a preset favorite;
the screening unit is used for responding to a data screening instruction and screening the target data cached in the preset favorite by the cache unit according to the data attribute values to obtain the screened target data; wherein the screening unit includes: an adding module for adding, when an interruption occurs while screening the target data in the preset favorite according to the data attribute values, a data state identifier at the interruption node corresponding to the interruption, so that screening of the target data in the preset favorite continues according to the data state identifier;
the display unit is used for displaying the target data screened by the screening unit;
wherein the screening unit further includes:
the acquisition module is used for acquiring the website influence of the target website according to the data attribute value; the target website is a source website of the target data, and the influence of the website is determined according to the attribution identification of the target website, the ranking of the target website and the attention degree of the target website;
the classification module is used for performing classification identification on the target data in the preset favorite by utilizing the website influence acquired by the acquisition module;
and the screening module is used for screening the target data in the preset favorite according to the classification identification of the classification module.
5. The apparatus according to claim 4, wherein the display unit is configured to display the screened target data according to the classification identification assigned to the target data in the screening unit.
6. The apparatus of claim 5, further comprising:
an acquisition unit configured to acquire the data to be processed based on a crawler before the extraction unit extracts target data from the data to be processed;
and the storage unit is used for storing the data to be processed in a preset database after the acquisition unit acquires the data to be processed.
7. A storage medium, characterized in that the storage medium comprises a stored program, wherein when the program runs, a device in which the storage medium is located is controlled to execute the data processing method of any one of claims 1 to 3.
8. A processor, characterized in that the processor is configured to run a program, wherein the program when running performs the method of data processing according to any one of claims 1 to 3.
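
The screening steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the `Item` fields, the weights in `SUFFIX_SCORE`, the thresholds, and the names `influence`, `classify`, `ScreeningState` and `screen` are all hypothetical and chosen only for illustration. The domain suffix stands in for the attribution identification, a follower count stands in for the attention degree, and a saved list index plays the role of the data state identifier that lets screening resume after an interruption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """One piece of target data cached in the favorite (fields are assumptions)."""
    title: str
    domain_suffix: str  # stand-in for the attribution identification, e.g. "gov"
    site_rank: int      # ranking of the source website, 1 is best
    attention: int      # stand-in for the attention degree, e.g. follower count


# Hypothetical weights; the claims do not specify how the factors are combined.
SUFFIX_SCORE = {"gov": 3, "edu": 2}


def influence(item: Item) -> int:
    """Combine attribution, ranking and attention into one influence score."""
    score = SUFFIX_SCORE.get(item.domain_suffix, 1)
    score += 2 if item.site_rank <= 100 else 0
    score += 1 if item.attention >= 10_000 else 0
    return score


def classify(item: Item) -> str:
    """Map the influence score to a classification identification."""
    return "high" if influence(item) >= 4 else "low"


@dataclass
class ScreeningState:
    """Progress marker: next_index acts as the data state identifier."""
    next_index: int = 0
    kept: List[Item] = field(default_factory=list)


def screen(favorite: List[Item], wanted: str, state: ScreeningState,
           budget: Optional[int] = None) -> bool:
    """Screen from state.next_index onward.

    `budget` simulates an interruption after that many items. Returns True
    when the whole favorite has been screened, False when interrupted.
    """
    processed = 0
    while state.next_index < len(favorite):
        if budget is not None and processed >= budget:
            return False  # interrupted; state.next_index records where to resume
        item = favorite[state.next_index]
        if classify(item) == wanted:
            state.kept.append(item)
        state.next_index += 1
        processed += 1
    return True
```

Calling `screen` a second time with the same `ScreeningState` continues from the recorded index rather than restarting, which is the behavior the data state identifier is meant to provide. Keeping the state in a separate object, rather than inside the function, is a design choice made here so that it could be persisted between runs.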
CN201510767682.3A 2015-11-11 2015-11-11 Data processing method and device Active CN106682044B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510767682.3A CN106682044B (en) 2015-11-11 2015-11-11 Data processing method and device

Publications (2)

Publication Number Publication Date
CN106682044A (en) 2017-05-17
CN106682044B (en) 2021-01-15

Family

ID=58864867

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510767682.3A Active CN106682044B (en) 2015-11-11 2015-11-11 Data processing method and device

Country Status (1)

Country Link
CN (1) CN106682044B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107665234B (en) * 2017-07-25 2020-07-28 平安科技(深圳)有限公司 Service processing method, device, server and storage medium
CN107909483B (en) * 2017-07-25 2021-05-04 平安科技(深圳)有限公司 Claims settlement flow identification method, device, server and storage medium
CN107590641A (en) * 2017-08-18 2018-01-16 北京北信源软件股份有限公司 A kind of localization method of organization node, system, computer-readable recording medium and storage control
CN111796513B (en) * 2019-04-08 2022-09-09 阿里巴巴集团控股有限公司 Data processing method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102929985A (en) * 2012-10-18 2013-02-13 北京奇虎科技有限公司 Method and system for displaying collected webpage
CN103389984A (en) * 2012-05-08 2013-11-13 百度在线网络技术(北京)有限公司 Method and device for providing collection association information in search results
CN104965884A (en) * 2015-06-15 2015-10-07 广东欧珀移动通信有限公司 File collection method and related terminal

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150287092A1 (en) * 2014-04-07 2015-10-08 Favored.By Social networking consumer product organization and presentation application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Applicant after: Beijing Guoshuang Technology Co.,Ltd.

Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing

Applicant before: Beijing Guoshuang Technology Co.,Ltd.

GR01 Patent grant