CN106682044A - Data processing method and device - Google Patents
- Publication number: CN106682044A
- Application number: CN201510767682.3A
- Authority: CN (China)
- Prior art keywords: data, target data, screening, target, website
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data processing method and device, relates to the technical field of the Internet, and mainly aims to reduce data screening time and increase data screening accuracy. According to the technical scheme, the method includes: extracting target data from to-be-processed data, wherein the target data includes a data attribute value; caching the target data in preset favorites; in response to a data screening instruction, screening the target data in the preset favorites according to the data attribute value to obtain screened target data; and displaying the screened target data. The method is mainly applicable to data screening.
Description
Technical field
The present invention relates to the field of Internet technology, and in particular to a data processing method and device.
Background
With the rapid development of networks, the World Wide Web has become the carrier of massive amounts of data, and how to extract and use that data efficiently has become a major challenge. Screening valid data out of massive data is one way of making effective use of Internet data.
Typically, when data is screened, a data source is first fixed according to the actual demand for data; the data source is usually a webpage of a website. A crawler program then crawls the data in the data source, and the crawled data is stored in a database in some manner for later use. When data needs to be screened, the data in the database is retrieved and screened, and the screened data is organized into a data report, so that the massive data is used effectively.
In the course of screening data in the above manner, the inventor found the following problems. When the data in the database is screened, all of the data in the database has to be screened in turn; if the amount of data in the database is large, screening takes a great deal of time and the accuracy of the screening is low. Moreover, if the screening process is interrupted while data is being screened from the database, all of the data in the database has to be screened again, because the data screened before the interruption cannot be retained, so the screening consumes excessive time.
Summary of the invention
In view of this, the present invention provides a data processing method and device, whose main purpose is to reduce the time consumed by data screening and to improve the accuracy of data screening.
To solve the above problems, the present invention mainly provides the following technical solutions.
In one aspect, the present invention provides a data processing method, the method including:
extracting target data from to-be-processed data, wherein the target data includes a data attribute value;
caching the target data in preset favorites;
in response to a data screening instruction, screening the target data in the preset favorites according to the data attribute value, to obtain screened target data;
displaying the screened target data.
In another aspect, the present invention also provides a data processing device, the device including:
an extraction unit, configured to extract target data from to-be-processed data, wherein the target data includes a data attribute value;
a caching unit, configured to cache the target data extracted by the extraction unit in preset favorites;
a screening unit, configured to, in response to a data screening instruction, screen the target data cached in the preset favorites by the caching unit according to the data attribute value, to obtain screened target data;
a display unit, configured to display the target data screened by the screening unit.
Through the above technical solutions, the technical solutions provided by the present invention have at least the following advantages.
In the data processing method and device provided by the present invention, target data is first extracted from to-be-processed data, wherein the target data includes a data attribute value; the extracted target data is cached in preset favorites; in response to a data screening instruction, the target data in the preset favorites is screened according to its data attribute value; and after the screened target data is obtained, it is displayed. Compared with the prior art, in which the data to be screened is screened directly from the preset database, the present invention caches the target data extracted from the to-be-processed data in the preset favorites, which reduces the amount of data to be screened and therefore the time consumed in screening target data. In addition, the amount of target data is inversely proportional to the accuracy of screening target data, that is, the smaller the amount of target data, the higher the screening accuracy; because the amount of target data in the preset favorites is small, the accuracy of screening target data is improved.
The above is only an overview of the technical solutions of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the content of the specification, and so that the above and other objects, features, and advantages of the present invention become more apparent, specific embodiments of the present invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from the detailed description of the preferred embodiments below. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:
Fig. 1 shows a flow chart of a data processing method provided by an embodiment of the present invention;
Fig. 2 shows a block diagram of a data processing device provided by an embodiment of the present invention;
Fig. 3 shows a block diagram of another data processing device provided by an embodiment of the present invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
An embodiment of the present invention provides a data processing method. As shown in Fig. 1, the method includes:
101. Extracting target data from to-be-processed data.
In this embodiment, before target data is screened, the data of the corresponding webpages of a target website on the Internet is first obtained, and the obtained to-be-processed data is stored in a preset database so that target data can be extracted from the preset database. When the data of the corresponding webpages is obtained, the type of the to-be-processed data determines which websites' webpages need to be fetched; the data type of the to-be-processed data may be, for example, economic data, video data, or science and technology data. The embodiment of the present invention does not limit the data type to be screened, the specific target website, or similar details.
Typically, the to-be-processed data is stored in the preset database. When the to-be-processed data in the preset database needs to be screened, target data is first extracted from it, wherein the target data includes a data attribute value. The data attribute value is the data category of the target data; for example, whether the target data is automotive data, military data, or science and technology data can be distinguished by the data attribute information.
As another implementation of the embodiment of the present invention, the target data also includes a data status identifier. When an interruption occurs while target data is being extracted from the to-be-processed data, the data status identifier is added at the corresponding interruption node of the to-be-processed data, so that extraction of target data from the to-be-processed data can resume from the data status identifier. This saves time in extracting target data from the to-be-processed data and thereby reduces the time consumed in screening target data.
In a specific implementation of the embodiment of the present invention, target data is extracted from the to-be-processed data based on a preset screening condition, which is set manually. The preset screening condition must correspond to the screening condition used to obtain the to-be-processed data from the target website: it may be identical to that screening condition, or its screening scope may be narrower than the screening scope of the condition used to obtain the to-be-processed data from the target website.
For example, if the screening condition used to obtain the to-be-processed data from the target website is economic data, the preset screening condition may be set to stocks, securities, finance, and so on. The embodiment of the present invention does not limit how the preset screening condition is set; it should be set according to the actual need for extracting target data.
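The extraction in step 101 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Record` shape, the keyword-matching rule, and the choice to use the matched keyword as the data attribute value are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One item of to-be-processed data (hypothetical shape)."""
    source: str    # URL of the target website the item came from
    category: str  # coarse category used when the item was crawled
    text: str      # page content


def extract_target_data(pending, preset_keywords):
    """Return (record, attribute_value) pairs for records matching a keyword.

    Assumption of this sketch: the first matching keyword becomes the
    data attribute value of the extracted target data.
    """
    targets = []
    for record in pending:
        lowered = record.text.lower()
        for keyword in preset_keywords:
            if keyword in lowered:
                targets.append((record, keyword))
                break  # one attribute value per record is enough here
    return targets


# The crawl condition was "economy"; the preset screening condition is narrower.
pending = [
    Record("https://a.example", "economy", "Stock markets rallied today"),
    Record("https://b.example", "economy", "Regional harvest report"),
    Record("https://c.example", "economy", "New securities regulation"),
]
targets = extract_target_data(pending, ["stock", "securities", "finance"])
```

Here the preset screening condition (stock, securities, finance) is narrower than the condition used for crawling (economic data), which is one of the two options the text allows.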
102. Caching the target data in preset favorites.
The preset database in step 101 is used to store the to-be-processed data, but the to-be-processed data stored there is of many types and covers a wide range. Therefore, to narrow the range of data to be screened and improve the accuracy of screening target data, the extracted target data is cached in preset favorites. The preset favorites are used to store target data, and the amount of target data in the preset favorites is smaller than the amount of to-be-processed data in the preset database.
103. In response to a data screening instruction, screening the target data in the preset favorites according to the data attribute value.
The data screening instruction is used to screen target data from the preset favorites, and the target data is screened according to the data attribute value. Screening the target data in the preset favorites according to the data attribute value has two purposes. First, because the amount of target data in the preset favorites is smaller than the amount of to-be-processed data in the preset database, the time consumed in screening target data is reduced. Second, when the result of screening is unsatisfactory, the target data can be screened again from the preset favorites; because the amount of data in the preset favorites is small, the accuracy of screening target data can be improved.
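Steps 102 and 103 can be sketched together as a small cache that is screened by attribute value. The class name `PresetFavorites` and the decision to index items by attribute value are assumptions of this example; the patent does not prescribe a storage layout.

```python
from collections import defaultdict


class PresetFavorites:
    """Hypothetical cache of target data, indexed by data attribute value."""

    def __init__(self):
        self._by_attribute = defaultdict(list)

    def cache(self, item, attribute_value):
        """Step 102: store one extracted item under its attribute value."""
        self._by_attribute[attribute_value].append(item)

    def screen(self, attribute_value):
        """Step 103: return a copy of the items with this attribute value."""
        # .get avoids creating empty buckets for unknown attribute values
        return list(self._by_attribute.get(attribute_value, []))

    def __len__(self):
        return sum(len(items) for items in self._by_attribute.values())


favorites = PresetFavorites()
favorites.cache("Stock markets rallied today", "stock")
favorites.cache("New securities regulation", "securities")
favorites.cache("Stock buyback announced", "stock")

# A data screening instruction for the attribute value "stock".
stock_items = favorites.screen("stock")
```

Because screening only touches the cached subset, a repeated or refined screening does not need to go back to the preset database.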
104. Displaying the screened target data.
The screened target data is displayed so that it can be viewed and used.
As one implementation of the embodiment of the present invention, when the screened target data is displayed, it is classified and output in classified form. As another implementation, the screened target data is summarized, and the summarized target data is output and displayed. The embodiment of the present invention does not limit the specific form in which the screened target data is displayed.
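One possible form of the classified display in step 104 is a plain-text listing grouped by category. The function name `render_by_category` and the `(category, title)` pair format are assumptions of this sketch.

```python
from itertools import groupby


def render_by_category(screened):
    """Render (category, title) pairs as a text listing grouped by category."""
    lines = []
    # groupby only merges adjacent items, so sort by category first
    ordered = sorted(screened, key=lambda pair: pair[0])
    for category, group in groupby(ordered, key=lambda pair: pair[0]):
        titles = [title for _, title in group]
        lines.append(f"{category} ({len(titles)})")
        lines.extend(f"  - {title}" for title in titles)
    return "\n".join(lines)


screened = [("stock", "A rises"), ("finance", "Rate cut"), ("stock", "B falls")]
listing = render_by_category(screened)
```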
In the data processing method provided by the embodiment of the present invention, target data is first extracted from to-be-processed data, wherein the target data includes a data attribute value; the extracted target data is cached in preset favorites; in response to a data screening instruction, the target data in the preset favorites is screened according to its data attribute value; and after the screened target data is obtained, it is displayed.
Specifically, compared with the prior art, in which the data to be screened is screened directly from the preset database, the embodiment of the present invention caches the target data extracted from the to-be-processed data in the preset favorites, which reduces the amount of data to be screened and therefore the time consumed in screening target data. In addition, the amount of target data is inversely proportional to the accuracy of screening target data, that is, the smaller the amount of target data, the higher the screening accuracy; because the amount of target data in the preset favorites is small, the accuracy of screening target data is improved.
It should be noted that step 103 may directly use the data attribute value in the target data to perform a first screening. After the first screening, the attribute value of the target data may further be used to determine the influence of the target website, and a second screening of the target data may then be performed according to that influence. Alternatively, the influence of the target website may first be determined from the attribute value of the target data, and the target data may then be screened using the determined influence. The present invention places no restriction on this.
Further, as a refinement and extension of the above embodiment, when step 103 screens the target data in the preset favorites according to the data attribute value, the following approach may be adopted.
First, the website influence of the target website is obtained according to the data attribute value of the target data. Then, the target data in the preset favorites is given a classification label using the website influence. Finally, the target data in the preset favorites is screened according to the classification label. Here, the target website is the data source of the target data described in the embodiment of the present invention, that is, the target data is obtained from the target website. The website influence is composed of the home-location identifier of the target website, the ranking of the target website, and the attention paid to the target website; the preset attention of the mainstream media is determined by a preset website visit count and a preset website visit ranking.
As one implementation of the embodiment of the present invention, the target data in the preset favorites is stored according to the classification label. As another implementation, the target data in the preset favorites is only given a classification label according to the website influence and is not stored by classification label; instead, when the data to be screened is output and displayed, it is displayed according to the classification label.
To illustrate more clearly how the target data in the preset favorites is given a classification label according to the website influence of the target website, an example is given below.
For example, Table 1 is a schematic of how the preset favorites provided in the embodiment of the present invention store target data. The data sources shown in Table 1 are the URLs of the target websites, listed in order of decreasing website influence; therefore, when the target data is output and displayed, it can be displayed according to the magnitude of the website influence. Table 1 is only an example; the embodiment of the present invention does not limit the specific form in which the preset favorites store target data.
Table 1
It should be noted that, when the target data in the preset favorites is given a classification label according to website influence, the greater the website influence, the higher the authority of the website; the target data obtained from that website is then more representative and of greater value. The smaller the website influence, the lower the authority of the website, and the lower the value of the target data obtained from it.
In the embodiment of the present invention, the purpose of giving the target data in the preset favorites a classification label is to screen the target data more accurately. The target data is marked based on the classification label, which indicates, for example, the importance of the target data or its data category, so that the screened target data can be displayed according to the classification label when it is output.
In practical applications, the target data in the preset favorites may also be given a classification label based on the user's experience. The classification label may include, but is not limited to, the following: important, fairly important, can be deleted, and so on. However, labelling the target data in the preset favorites in this way depends on the user's experience; because users' experience differs, the classification labels given to the target data in the preset favorites also differ. The embodiment of the present invention does not limit this.
Further, if an interruption occurs while the target data in the preset favorites is being screened, a data status identifier is added at the interruption node corresponding to the interruption, so that screening of the target data in the preset favorites can continue from the data status identifier.
For example, target data is usually stored sequentially in the preset favorites. After the data status identifier has been added at the interruption node, and before the target data in the preset favorites is screened according to the data attribute value, it can first be detected whether a data status identifier exists in the preset favorites. If one exists, screening of the target data in the preset favorites continues from the data status identifier instead of restarting from the beginning of the preset favorites, which saves screening time. If no data status identifier exists in the preset favorites, the target data is screened from the beginning of the preset favorites.
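The resume-from-identifier behaviour can be sketched with a small state dictionary. This is an illustration under stated assumptions: the data status identifier is modelled as a list index stored under the hypothetical key `"marker"`, and the interruption is simulated with the `interrupt_at` parameter.

```python
def screen_with_resume(items, predicate, state, interrupt_at=None):
    """Screen sequentially stored items, resuming from a saved marker if present.

    `state` persists between calls: "kept" holds the results so far and
    "marker" (the data status identifier in this sketch) holds the index
    of the first item that has not been screened yet.
    """
    kept = state.setdefault("kept", [])
    start = state.get("marker", 0)  # no marker means start from the beginning
    for index in range(start, len(items)):
        if index == interrupt_at:
            state["marker"] = index  # record the interruption node
            raise InterruptedError(f"interrupted at item {index}")
        if predicate(items[index]):
            kept.append(items[index])
    state.pop("marker", None)  # finished, so clear the identifier
    return kept


def is_stock(item):
    return item.startswith("stock")


items = ["stock A", "bond B", "stock C", "stock D"]
state = {}
try:
    screen_with_resume(items, is_stock, state, interrupt_at=2)
except InterruptedError:
    pass  # progress so far is retained in `state`
marker_after_interrupt = state["marker"]

# The second call detects the marker and continues from item 2.
result = screen_with_resume(items, is_stock, state)
```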
Further, when the screened target data is displayed, it is displayed according to the classification label, so that the user can make effective use of the screened target data according to the classification label.
Further, before target data is extracted from the to-be-processed data, the to-be-processed data is obtained from the target website by a crawler program and stored in the preset database, ready for target data to be extracted from it. In the embodiment of the present invention, obtaining the to-be-processed data from the target website with a crawler program may be implemented in, but is not limited to, the following ways: the crawler program obtains the to-be-processed data from the target website depth-first, or the crawler program obtains it breadth-first or best-first. The embodiment of the present invention does not limit the specific way in which the crawler program obtains the to-be-processed data from the target website.
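The difference between the depth-first and breadth-first strategies mentioned above lies only in the order in which pages are visited. The sketch below shows that order on an in-memory link graph; the `links` dictionary and the page names are hypothetical, and no network access is performed.

```python
from collections import deque


def crawl_order(links, start, strategy="breadth"):
    """Return the order in which pages would be visited.

    `links` maps a page to the pages it links to. The frontier is used
    as a queue for breadth-first and as a stack for depth-first.
    """
    frontier = deque([start])
    visited = set()
    order = []
    while frontier:
        page = frontier.popleft() if strategy == "breadth" else frontier.pop()
        if page in visited:
            continue  # a page can be queued more than once before it is visited
        visited.add(page)
        order.append(page)
        outgoing = links.get(page, [])
        if strategy == "depth":
            # push in reverse so the first link is explored first
            outgoing = reversed(outgoing)
        frontier.extend(link for link in outgoing if link not in visited)
    return order


links = {
    "home": ["news", "finance"],
    "news": ["news/1"],
    "finance": ["finance/1"],
}
breadth_first = crawl_order(links, "home", "breadth")
depth_first = crawl_order(links, "home", "depth")
```

A best-first crawler would replace the deque with a priority queue keyed on some page score; that variant is not shown here.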
Further, as an implementation of the method shown in Fig. 1, another embodiment of the present invention provides a data processing device. This device embodiment corresponds to the foregoing method embodiment; for ease of reading, it does not repeat the details of the method embodiment one by one, but it should be clear that the device in this embodiment can implement everything in the method embodiment. An embodiment of the present invention provides a data processing device. As shown in Fig. 2, the device includes:
an extraction unit 21, configured to extract target data from to-be-processed data, wherein the target data includes a data attribute value;
a caching unit 22, configured to cache the target data extracted by the extraction unit 21 in preset favorites;
a screening unit 23, configured to, in response to a data screening instruction, screen the target data cached in the preset favorites by the caching unit 22 according to the data attribute value, to obtain screened target data;
a display unit 24, configured to display the target data screened by the screening unit 23.
Further, as shown in Fig. 3, the screening unit 23 includes:
an acquisition module 231, configured to obtain the website influence of a target website according to the data attribute value, wherein the target website is the source website of the target data, and the website influence is determined from the home-location identifier of the target website, the ranking of the target website, and the attention paid to the target website;
a classification module 232, configured to give the target data in the preset favorites a classification label using the website influence obtained by the acquisition module 231;
a screening module 233, configured to screen the target data in the preset favorites according to the classification label assigned by the classification module 232.
Further, as shown in Fig. 3, the screening unit 23 also includes:
an adding module 234, configured to, when an interruption occurs while the target data in the preset favorites is being screened according to the data attribute value, add a data status identifier at the interruption node corresponding to the interruption, so that screening of the target data in the preset favorites can continue from the data status identifier.
Further, as shown in Fig. 3, the display unit 24 is also configured to display the screened target data according to the classification label given to the target data in the screening unit 23.
Further, as shown in Fig. 3, the device also includes:
an acquiring unit 25, configured to obtain the to-be-processed data with a crawler program before the extraction unit 21 extracts target data from the to-be-processed data;
a storage unit 26, configured to store the to-be-processed data in a preset database after the acquiring unit 25 has obtained it.
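The way the units cooperate can be sketched as a single class that wires the steps together. The class name `DataProcessingDevice`, the callables passed in for extraction and display, and the `"category: text"` input format are assumptions of this example rather than details taken from the patent.

```python
from typing import Callable, Iterable, List, Tuple

Item = Tuple[str, str]  # (content, data attribute value)


class DataProcessingDevice:
    """Hypothetical wiring of the extraction, caching, screening and display units."""

    def __init__(
        self,
        extract: Callable[[Iterable[str]], List[Item]],
        display: Callable[[List[Item]], str],
    ):
        self._extract = extract            # role of extraction unit 21
        self._display = display            # role of display unit 24
        self._favorites: List[Item] = []   # storage used by caching unit 22

    def load(self, pending: Iterable[str]) -> None:
        """Extract target data and cache it in the preset favorites."""
        self._favorites = self._extract(pending)

    def handle_screening_instruction(self, attribute_value: str) -> str:
        """Role of screening unit 23: filter the cache, then display the result."""
        screened = [item for item in self._favorites if item[1] == attribute_value]
        return self._display(screened)


def extract_by_prefix(pending):
    """Treat the text before the first colon as the data attribute value."""
    return [(text, text.split(":", 1)[0]) for text in pending if ":" in text]


def display_lines(items):
    return "\n".join(content for content, _ in items)


device = DataProcessingDevice(extract_by_prefix, display_lines)
device.load(["auto: new sedan", "tech: chip launch", "no tag", "auto: recall"])
output = device.handle_screening_instruction("auto")
```

Passing the extraction and display behaviour in as callables keeps the sketch short; a fuller design might model each unit as its own class.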
In the data processing device provided by the embodiment of the present invention, target data is first extracted from to-be-processed data, wherein the target data includes a data attribute value; the extracted target data is cached in preset favorites; in response to a data screening instruction, the target data in the preset favorites is screened according to its data attribute value; and after the screened target data is obtained, it is displayed. Compared with the prior art, in which the data to be screened is screened directly from the preset database, the embodiment of the present invention caches the target data extracted from the to-be-processed data in the preset favorites, which reduces the amount of data to be screened and therefore the time consumed in screening target data. In addition, the amount of target data is inversely proportional to the accuracy of screening target data, that is, the smaller the amount of target data, the higher the screening accuracy; because the amount of target data in the preset favorites is small, the accuracy of screening target data is improved.
The data processing device includes a processor and a memory. The extraction unit, caching unit, screening unit, display unit, and so on are stored in the memory as program units, and the processor executes these program units stored in the memory to implement the corresponding functions.
The processor contains a kernel, and the kernel retrieves the corresponding program unit from the memory. One or more kernels may be provided; the time consumed by data screening is reduced and the accuracy of data screening is improved by adjusting kernel parameters.
The memory may include forms of computer-readable media such as volatile memory, random access memory (RAM), and/or non-volatile memory, for example read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
The present application also provides a computer program product which, when executed on a data processing device, is adapted to execute program code initialized with the following method steps: extracting target data from to-be-processed data, wherein the target data includes a data attribute value; caching the target data in preset favorites; in response to a data screening instruction, screening the target data in the preset favorites according to the data attribute value, to obtain screened target data; and displaying the screened target data.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis; for a part that is not described in detail in one embodiment, refer to the related description of the other embodiments.
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The present application is described with reference to flow charts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flow charts and/or block diagrams, and combinations of flows and/or blocks in the flow charts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means, and the instruction means implements the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps is performed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface, and memory.
The memory may include forms of computer-readable media such as volatile memory, random access memory (RAM), and/or non-volatile memory, for example read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the process, method, commodity, or device. Without further restriction, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, commodity, or device that includes the element.
Those skilled in the art will understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The above are merely embodiments of the present application and are not intended to limit the present application. For those skilled in the art, the present application may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principle of the present application shall fall within the scope of the claims of the present application.
Claims (10)
1. A data processing method, characterized by comprising:
extracting target data from to-be-processed data, wherein the target data includes a data attribute value;
caching the target data in preset favorites;
in response to a data screening instruction, screening the target data in the preset favorites according to the data attribute value to obtain screened target data; and
displaying the screened target data.
2. The method according to claim 1, characterized in that screening the target data in the preset favorites according to the data attribute value comprises:
obtaining the website influence of a target website according to the data attribute value, wherein the target website is the source website of the target data, and the website influence is determined according to a home-location identifier of the target website, the ranking of the target website, and the degree of attention paid to the target website;
assigning classification identifiers to the target data in the preset favorites using the website influence; and
screening the target data in the preset favorites according to the classification identifiers.
3. method according to claim 1 and 2, it is characterised in that according to the data attribute
Value carries out screening to the target data in the default collection to be included:
If carrying out screening to the target data in the default collection according to the data attribute value
Occur interrupting in journey, then interrupt interpolation data status indicator at corresponding interruption node described, so as to
Continue to screen the target data in the default collection according to data mode mark.
4. The method according to claim 3, characterized in that displaying the screened target data comprises:
displaying the screened target data according to the classification identifier of the target data.
5. The method according to claim 4, characterized in that, before extracting the target data from the to-be-processed data, the method further comprises:
obtaining the to-be-processed data by means of a crawler program, and storing the to-be-processed data in a preset database.
6. A data processing device, characterized by comprising:
an extraction unit, configured to extract target data from to-be-processed data, wherein the target data includes a data attribute value;
a caching unit, configured to cache the target data extracted by the extraction unit in preset favorites;
a screening unit, configured to, in response to a data screening instruction, screen, according to the data attribute value, the target data cached in the preset favorites by the caching unit, to obtain screened target data; and
a display unit, configured to display the target data screened by the screening unit.
7. The device according to claim 6, characterized in that the screening unit comprises:
an acquisition module, configured to obtain the website influence of a target website according to the data attribute value, wherein the target website is the source website of the target data, and the website influence is determined according to a home-location identifier of the target website, the ranking of the target website, and the degree of attention paid to the target website;
a classification module, configured to assign classification identifiers to the target data in the preset favorites using the website influence obtained by the acquisition module; and
a screening module, configured to screen the target data in the preset favorites according to the classification identifiers assigned by the classification module.
8. The device according to claim 6 or 7, characterized in that the screening unit comprises:
an adding module, configured to, when an interruption occurs while the target data in the preset favorites is being screened according to the data attribute value, add a data status identifier at the interruption node corresponding to the interruption, so that screening of the target data in the preset favorites continues according to the data status identifier.
9. The device according to claim 8, characterized in that the display unit is configured to display the screened target data according to the classification identifier assigned to the target data in the screening unit.
10. The device according to claim 9, characterized in that the device further comprises:
an acquisition unit, configured to obtain the to-be-processed data by means of a crawler program before the extraction unit extracts the target data from the to-be-processed data; and
a storage unit, configured to store the to-be-processed data in a preset database after the acquisition unit obtains the to-be-processed data.
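For illustration only, the claimed steps (extract, cache, screen by website influence, resume after an interruption, display) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: every name (`TargetData`, `SiteStats`, `Favorites`, `influence`, `screen`, `display`), the scoring formula, the threshold, and the sample sites are hypothetical, since the claims name the inputs to website influence but give no formula.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TargetData:
    title: str
    source_site: str              # data attribute value: the item's source website
    label: Optional[str] = None   # classification identifier, filled in by screening


@dataclass
class SiteStats:
    home_region: str   # home-location identifier of the website
    rank: int          # website ranking, 1 is best
    attention: float   # degree of attention, e.g. a normalized visit count


@dataclass
class Favorites:
    items: List[TargetData] = field(default_factory=list)
    resume_index: int = 0   # data status identifier: where screening should resume


def extract(records: List[dict]) -> List[TargetData]:
    """Keep only records that carry the fields the later steps need."""
    return [TargetData(r["title"], r["site"])
            for r in records if "title" in r and "site" in r]


def influence(stats: Optional[SiteStats], preferred_region: str = "CN") -> float:
    """Hypothetical scoring rule; the source does not give a formula."""
    if stats is None:
        return 0.0
    region_weight = 1.0 if stats.home_region == preferred_region else 0.5
    return region_weight * stats.attention / stats.rank


def screen(fav: Favorites, sites: Dict[str, SiteStats],
           threshold: float = 1.0, max_items: Optional[int] = None) -> bool:
    """Label cached items, starting at the saved resume point.

    max_items simulates an interruption: once that many items have been
    handled, the function stops and returns False (not finished), leaving
    resume_index pointing at the next unscreened item.
    """
    processed = 0
    while fav.resume_index < len(fav.items):
        if max_items is not None and processed >= max_items:
            return False
        item = fav.items[fav.resume_index]
        score = influence(sites.get(item.source_site))
        item.label = "high" if score >= threshold else "low"
        fav.resume_index += 1
        processed += 1
    return True


def display(fav: Favorites, label: str) -> List[str]:
    """Return the titles of screened items carrying the requested label."""
    return [i.title for i in fav.items if i.label == label]


# Assumed sample input standing in for crawled, to-be-processed data.
records = [
    {"title": "A", "site": "news.example"},
    {"title": "B", "site": "blog.example"},
    {"title": "C", "site": "news.example"},
    {"note": "no title or site, so it is not extracted"},
]
sites = {
    "news.example": SiteStats("CN", rank=2, attention=10.0),
    "blog.example": SiteStats("US", rank=50, attention=5.0),
}

favorites = Favorites(extract(records))           # extract, then cache
finished = screen(favorites, sites, max_items=2)  # interrupted after two items
resumed = screen(favorites, sites)                # continues from resume_index
high_titles = display(favorites, "high")
```

Storing the resume point on the cached favorites, rather than inside the screening call, is one simple way to let a later call continue where an interrupted one stopped without re-screening items that already carry a classification identifier.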
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510767682.3A CN106682044B (en) | 2015-11-11 | 2015-11-11 | Data processing method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510767682.3A CN106682044B (en) | 2015-11-11 | 2015-11-11 | Data processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106682044A (en) | 2017-05-17 |
CN106682044B (en) | 2021-01-15 |
Family
ID=58864867
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510767682.3A Active CN106682044B (en) | 2015-11-11 | 2015-11-11 | Data processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106682044B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107590641A (en) * | 2017-08-18 | 2018-01-16 | 北京北信源软件股份有限公司 | Method for locating an organization node, system, computer-readable recording medium and storage control |
CN107665234A (en) * | 2017-07-25 | 2018-02-06 | 平安科技(深圳)有限公司 | Service processing method, device, server and storage medium |
CN107909483A (en) * | 2017-07-25 | 2018-04-13 | 平安科技(深圳)有限公司 | Claims settlement flow identification method, device, server and storage medium |
CN111796513A (en) * | 2019-04-08 | 2020-10-20 | 阿里巴巴集团控股有限公司 | Data processing method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102929985A (en) * | 2012-10-18 | 2013-02-13 | 北京奇虎科技有限公司 | Method and system for displaying collected webpage |
CN103389984A (en) * | 2012-05-08 | 2013-11-13 | 百度在线网络技术(北京)有限公司 | Method and device for providing collection association information in search results |
CN104965884A (en) * | 2015-06-15 | 2015-10-07 | 广东欧珀移动通信有限公司 | File collection method and related terminal |
US20150287092A1 (en) * | 2014-04-07 | 2015-10-08 | Favored.By | Social networking consumer product organization and presentation application |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103389984A (en) * | 2012-05-08 | 2013-11-13 | 百度在线网络技术(北京)有限公司 | Method and device for providing collection association information in search results |
CN102929985A (en) * | 2012-10-18 | 2013-02-13 | 北京奇虎科技有限公司 | Method and system for displaying collected webpage |
US20150287092A1 (en) * | 2014-04-07 | 2015-10-08 | Favored.By | Social networking consumer product organization and presentation application |
CN104965884A (en) * | 2015-06-15 | 2015-10-07 | 广东欧珀移动通信有限公司 | File collection method and related terminal |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107665234A (en) * | 2017-07-25 | 2018-02-06 | 平安科技(深圳)有限公司 | Service processing method, device, server and storage medium |
CN107909483A (en) * | 2017-07-25 | 2018-04-13 | 平安科技(深圳)有限公司 | Claims settlement flow identification method, device, server and storage medium |
WO2019019621A1 (en) * | 2017-07-25 | 2019-01-31 | 平安科技(深圳)有限公司 | Service processing method, device, server and storage medium |
CN107665234B (en) * | 2017-07-25 | 2020-07-28 | 平安科技(深圳)有限公司 | Service processing method, device, server and storage medium |
CN107909483B (en) * | 2017-07-25 | 2021-05-04 | 平安科技(深圳)有限公司 | Claims settlement flow identification method, device, server and storage medium |
CN107590641A (en) * | 2017-08-18 | 2018-01-16 | 北京北信源软件股份有限公司 | Method for locating an organization node, system, computer-readable recording medium and storage control |
CN111796513A (en) * | 2019-04-08 | 2020-10-20 | 阿里巴巴集团控股有限公司 | Data processing method and device |
Also Published As
Publication number | Publication date |
---|---|
CN106682044B (en) | 2021-01-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104391951B (en) | Webpage heat map loading method and device | |
CN104572668B (en) | Method and apparatus for generating a merged style file from multiple style files | |
CN106682044A (en) | Data processing method and device | |
WO2020253351A1 (en) | Click hijacking vulnerability detection method, device and computer apparatus | |
CN102077201A (en) | System and method for dynamic and real-time categorization of webpages | |
CN103617241B (en) | Search information processing method, browser terminal and server | |
CN103544313B (en) | Data processing method and device for webpage recommending | |
CN107766469A (en) | Cache processing method and device | |
CN106570025A (en) | Data filtering method and device | |
CN103530390B (en) | Webpage capture method and apparatus | |
CN112835682B (en) | Data processing method, device, computer equipment and readable storage medium | |
CN103984743A (en) | Method and device for managing memory resources | |
CN106886547A (en) | Scenario generation method and device | |
CN103064849B (en) | Processing method and device for cascading style sheets (CSS) | |
CN107015986A (en) | Method and device for a crawler to crawl webpages | |
CN106020891A (en) | Page loading method and device | |
CN105376311A (en) | Method and device for determining page stay duration based on terminal access | |
WO2017086992A1 (en) | Malicious web content discovery through graphical model inference | |
CN110008393A (en) | Method and apparatus for obtaining site information | |
CN109766488A (en) | Collection method based on Scrapy | |
CN105069135B (en) | Data crawling method and system for OTA websites | |
CN110020297A (en) | Web page content loading method, apparatus and system | |
CN110147473A (en) | Crawler crawling method and device | |
CN108062326A (en) | Method and device for recording updates of data information | |
CN106817355A (en) | Webpage permission control method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing; Applicant after: Beijing Guoshuang Technology Co.,Ltd.; Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing; Applicant before: Beijing Guoshuang Technology Co.,Ltd. |
GR01 | Patent grant | ||