CN107729343A - Resource extraction method, computer-readable storage medium, and electronic device

Publication number
CN107729343A
Authority
CN
China
Prior art keywords
resource
examination
dimension
extracted
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710605505.4A
Other languages
Chinese (zh)
Inventor
何刘兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
OneConnect Smart Technology Co Ltd
Original Assignee
OneConnect Financial Technology Co Ltd Shanghai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by OneConnect Financial Technology Co Ltd Shanghai
Priority to CN201710605505.4A
Priority to PCT/CN2018/076524 (WO2019019619A1)
Publication of CN107729343A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/951: Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a resource extraction method, a computer-readable storage medium, and an electronic device. The resource extraction method includes: determining the type of the resource currently to be extracted; obtaining, from a preset feature set, the dimension screening information corresponding to that type, where the dimension screening information indicates two or more dimensions to be used for resource screening, and the feature set contains preset dimension screening information for each type of resource; performing resource screening in a resource set, according to the obtained dimension screening information, based on the feature information of the resource to be extracted in the corresponding dimensions, where each resource has feature information in two or more different dimensions and the resource set contains the resources already extracted; and, if the screening result is empty, extracting the resource into the resource set. The technical solution provided by the invention can effectively improve the accuracy with which different resources are distinguished.

Description

Resource extraction method, computer-readable storage medium, and electronic device
Technical field
The present invention relates to the technical field of data processing, and in particular to a resource extraction method, a computer-readable storage medium, and an electronic device.
Background
With the growth and spread of network applications, more and more resources are placed on networks. A web crawler is a program that automatically fetches resources (videos, pictures, articles, and so on) from the Internet or a local area network.
A traditional web crawler distinguishes resources by their titles while searching for resources. However, as the number of resources on platforms and websites increases sharply, different resources may be published on different platforms or websites under the same title. This makes it difficult for a traditional web crawler to distinguish different resources accurately, which leads to problems such as confused statistics, missed crawls, and repeated crawls.
Summary of the invention
The present invention provides a resource extraction method, a computer-readable storage medium, and an electronic device, for improving the accuracy with which different resources are distinguished.
A first aspect of the present invention provides a resource extraction method, including:
determining the type of the resource currently to be extracted;
obtaining, from a preset feature set, the dimension screening information corresponding to the type, where the dimension screening information indicates two or more dimensions to be used for resource screening, and the feature set contains preset dimension screening information corresponding to each type of resource;
performing resource screening in a resource set, according to the dimension screening information currently obtained, based on the feature information of the resource currently to be extracted in the corresponding dimensions, where each resource has feature information in two or more different dimensions, and the resource set contains the resources already extracted;
if the result of the resource screening is empty, extracting the resource currently to be extracted into the resource set.
A second aspect of the present invention provides a computer-readable storage medium. The computer-readable storage medium stores a computer program, and when the computer program is executed by at least one processor, the following steps are implemented:
determining the type of the resource currently to be extracted;
obtaining, from a preset feature set, the dimension screening information corresponding to the type, where the dimension screening information indicates two or more dimensions to be used for resource screening, and the feature set contains preset dimension screening information corresponding to each type of resource;
performing resource screening in a resource set, according to the dimension screening information currently obtained, based on the feature information of the resource currently to be extracted in the corresponding dimensions, where each resource has feature information in two or more different dimensions, and the resource set contains the resources already extracted;
if the result of the resource screening is empty, extracting the resource currently to be extracted into the resource set.
A third aspect of the present invention provides an electronic device. The electronic device includes a memory, a processor, and a computer program that is stored in the memory and can run on the processor, and the processor implements the following steps when executing the computer program:
determining the type of the resource currently to be extracted;
obtaining, from a preset feature set, the dimension screening information corresponding to the type, where the dimension screening information indicates two or more dimensions to be used for resource screening, and the feature set contains preset dimension screening information corresponding to each type of resource;
performing resource screening in a resource set, according to the dimension screening information currently obtained, based on the feature information of the resource currently to be extracted in the corresponding dimensions, where each resource has feature information in two or more different dimensions, and the resource set contains the resources already extracted;
if the result of the resource screening is empty, extracting the resource currently to be extracted into the resource set.
As can be seen from the above, in the solution of the present invention, corresponding dimension screening information is built in advance for different types of resources to form a feature set. For a resource to be extracted, the dimension screening information corresponding to the type of the resource is obtained from the feature set, and resource screening is performed in the resource set based on the feature information of the resource in the corresponding dimensions. On the one hand, because the dimension screening information indicates two or more dimensions to be used for resource screening, the accuracy with which different resources are distinguished is improved relative to the traditional scheme that distinguishes resources by title only, and the probability of extracting the same resource repeatedly is reduced. On the other hand, because the dimension screening information is tied to the type of the resource, the dimensions used for screening are better targeted, which improves the effectiveness of resource screening.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of one embodiment of the resource extraction method provided by the present invention;
Fig. 2 is a schematic flowchart of another embodiment of the resource extraction method provided by the present invention;
Fig. 3 is a schematic structural diagram of one embodiment of the electronic device provided by the present invention;
Fig. 4 is a schematic diagram of the functional module structure of the resource extraction program provided by the present invention.
Detailed description
To make the objectives, features, and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiment 1
This embodiment of the present invention describes a resource extraction method. Referring to Fig. 1, the resource extraction method in this embodiment includes:
Step 101: determine the type of the resource currently to be extracted;
In this embodiment, the resource extraction method shown in Fig. 1 may be performed automatically for each resource to be extracted while a web crawler crawls resources. Alternatively, it may be performed automatically for each resource to be extracted in other resource extraction scenarios; this is not limited here.
In step 101, the type of the resource currently to be extracted is determined. In this embodiment, resource types include but are not limited to audio resources, video resources, text resources, and picture resources. Specifically, the type may be determined from the file suffix of the resource, or in other ways, for example from the source of the resource (if the resource comes from an audio database, its type is determined to be audio; if it comes from a picture database, its type is determined to be picture; and so on). This embodiment does not limit the specific way in which the type of the resource currently to be extracted is determined.
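The type determination of step 101 can be sketched as follows. This is only an illustration under stated assumptions: the suffix table, the type labels, and the function name `resource_type` are hypothetical and do not appear in the patent, which leaves the concrete mechanism open.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical suffix-to-type table; the patent does not list concrete suffixes.
SUFFIX_TO_TYPE = {
    ".mp3": "audio", ".wav": "audio",
    ".mp4": "video", ".mkv": "video",
    ".txt": "text", ".html": "text",
    ".jpg": "picture", ".png": "picture",
}

def resource_type(url, source_type=None):
    """Return the type implied by the URL suffix, else the type of the source.

    source_type stands in for the "type of the source database" fallback
    described in the text; it is an assumed parameter.
    """
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return SUFFIX_TO_TYPE.get(suffix, source_type)
```

Here the suffix is checked first and the source is used only as a fallback; the text allows either approach, so the priority is a design choice of this sketch.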
Step 102: obtain the dimension screening information corresponding to the type from a preset feature set;
In this embodiment, corresponding dimension screening information is set in advance for each type of resource, forming a feature set that contains preset dimension screening information for each type of resource. The dimension screening information indicates two or more dimensions to be used for resource screening, so that subsequent resource screening can be performed based on the feature information of those dimensions. Here, feature information means the value of the corresponding dimension.
For example, for video resources, the dimension screening information may be preset to include the following dimensions: video stream encoding, video size, and video title. When resource screening is later performed for a video resource, it can be based on the feature information of these three dimensions. As another example, for text resources, the dimension screening information may be preset to include the following dimensions: article summary, article content, and article title. When resource screening is later performed for a text resource, it can be based on the feature information of these three dimensions. Of course, the dimension screening information for each type of resource can be configured according to actual requirements; this is not limited here.
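One possible representation of the feature set is a mapping from resource type to a tuple of dimension names. The sketch below assumes that layout; the names `FEATURE_SET` and `screening_dimensions` and the dimension keys are hypothetical, and only the video and text examples from the text are filled in.

```python
# Hypothetical feature set: resource type -> dimensions used for screening.
FEATURE_SET = {
    "video": ("title", "size", "stream_encoding"),
    "text": ("title", "summary", "content"),
}

def screening_dimensions(resource_type):
    """Look up the dimension screening information for a resource type."""
    dims = FEATURE_SET[resource_type]
    # The method requires two or more dimensions per type.
    if len(dims) < 2:
        raise ValueError("at least two screening dimensions are required")
    return dims
```

Using an ordered tuple means the same structure can also carry a screening order when one is needed.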
Step 103: according to the dimension screening information currently obtained, perform resource screening in the resource set based on the feature information of the resource currently to be extracted in the corresponding dimensions;
Each resource has feature information in two or more different dimensions, and the resource set contains the resources already extracted.
In step 103, according to the dimension screening information obtained in step 102, resource screening is performed in the current resource set based on the feature information of the resource currently to be extracted in the corresponding dimensions, to determine whether the current resource set contains a resource identical to the resource currently to be extracted.
In one application scenario, the dimension screening information also indicates the screening order of the two or more dimensions. In step 103, according to the dimension screening information obtained in step 102, resource screening is then performed in the current resource set one dimension at a time, based on the feature information of the resource currently to be extracted in each dimension, until the screening result is empty or until screening has been completed for all dimensions indicated by the dimension screening information. For example, suppose the type of the resource currently to be extracted is video, the dimension screening information obtained in step 102 includes the three dimensions video stream encoding, video size, and video title, and the screening order it indicates is video title, video size, video stream encoding. In step 103, resource screening is first performed in the current resource set based on the video title of the resource to be extracted, to select the resources whose video title is identical or similar to that of the resource to be extracted (for ease of description, the selected resources are called the first screening set). If the first screening set is not empty, resource screening is performed in the first screening set based on the video size of the resource to be extracted, to select the resources whose video size is identical to that of the resource to be extracted (called the second screening set). If the second screening set is not empty, resource screening is performed in the second screening set based on the video stream encoding of the resource to be extracted, to select the resources whose video stream encoding is identical to that of the resource to be extracted (called the third screening set). At this point, screening has been completed for all dimensions indicated by the dimension screening information, so the third screening set is the final result of resource screening in this execution of step 103.
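The ordered screening described above can be sketched as a loop that narrows the candidate list one dimension at a time. This is a minimal sketch, assuming resources are plain dictionaries keyed by dimension name and that a match means exact equality; the text also allows approximate title matches, which this sketch does not implement. The function name `screen_in_order` is hypothetical.

```python
def screen_in_order(candidate, resource_set, dimensions):
    """Narrow resource_set dimension by dimension, in the given order.

    Stops as soon as no resource matches; otherwise returns the resources
    that match the candidate in every dimension.
    """
    matches = list(resource_set)
    for dim in dimensions:
        # Keep only resources whose value in this dimension equals the candidate's.
        matches = [r for r in matches if r.get(dim) == candidate.get(dim)]
        if not matches:
            break  # empty result: no later dimension needs to be checked
    return matches
```

An empty return value corresponds to the case in which the resource has not been extracted before.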
In another application scenario, according to the dimension screening information obtained in step 102, the similarity between the feature information of the resource currently to be extracted and that of each resource in the current resource set is calculated in each corresponding dimension, giving a similarity score for each resource in each dimension. For each resource in the current resource set, the similarity scores of its dimensions are added to give the total similarity score of that resource. If a resource has a total similarity score not lower than a preset total score threshold, that resource can be considered identical to the resource currently to be extracted, and it is included in the final result of resource screening in this execution of step 103. For example, suppose the resource currently to be extracted is resource A, its type is video, and the dimension screening information obtained in step 102 includes the three dimensions video stream encoding, video size, and video title. Taking resource B in the current resource set as an example, in step 103 the similarity between the video titles of resource A and resource B is calculated to give resource B's similarity score in the video title dimension (denoted B1), the similarity between their video sizes is calculated to give resource B's similarity score in the video size dimension (denoted B2), and the similarity between their video stream encodings is calculated to give resource B's similarity score in the video stream encoding dimension (denoted B3). B1, B2, and B3 are then added to give resource B's total similarity score. If this total is not lower than the preset total score threshold, resource B is included in the final result of resource screening in this execution of step 103. The other resources in the current resource set are handled in the same way as resource B, and details are not repeated here. The total score threshold can be set according to actual requirements.
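The score-and-threshold variant can be sketched as below. The patent does not name a similarity measure, so `difflib.SequenceMatcher` from the Python standard library is used here purely as an assumed stand-in; the function names and the dictionary layout of a resource are likewise hypothetical.

```python
from difflib import SequenceMatcher

def dimension_similarity(a, b):
    """Similarity in [0, 1] between two feature values (assumed metric)."""
    return SequenceMatcher(None, str(a), str(b)).ratio()

def screen_by_similarity(candidate, resource_set, dimensions, total_threshold):
    """Return resources whose summed per-dimension similarity reaches the threshold."""
    result = []
    for resource in resource_set:
        # B1 + B2 + B3 in the text: one score per dimension, then the sum.
        total = sum(
            dimension_similarity(candidate[dim], resource[dim])
            for dim in dimensions
        )
        if total >= total_threshold:
            result.append(resource)
    return result
```

With three dimensions and this metric the total lies between 0 and 3, so the threshold has to be chosen on that scale; a different metric would call for a different scale.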
Step 104: if the result of the resource screening is empty, extract the resource currently to be extracted into the resource set.
After the resource screening of step 103, if the final screening result is empty, the resource currently to be extracted is considered not to have been extracted before, and it can be extracted into the resource set. If the final screening result is not empty, the resource currently to be extracted is considered to have been extracted already; it can then be discarded, that is, not extracted into the resource set, or the user can be prompted to decide manually whether to extract it in view of the resources that were screened out. This is not limited here.
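The decision in step 104 reduces to a guard around the insert. The sketch below assumes the resource set is an in-memory list and that the screening step is passed in as a function; `extract_if_new` and `same_title_and_size` are hypothetical names, and the latter is only a two-dimension stand-in for whichever screening variant is used.

```python
def same_title_and_size(candidate, resource_set):
    """Stand-in screening function: exact match on two assumed dimensions."""
    return [
        r for r in resource_set
        if r["title"] == candidate["title"] and r["size"] == candidate["size"]
    ]

def extract_if_new(candidate, resource_set, screen=same_title_and_size):
    """Add candidate to resource_set only when screening returns nothing."""
    if screen(candidate, resource_set):
        # Non-empty result: treated as already extracted, so it is discarded.
        # A real system could instead prompt the user, as the text notes.
        return False
    resource_set.append(candidate)
    return True
```

Two resources that share a title but differ in size are both kept, which is the behavior that title-only matching cannot provide.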
It should be noted that the resource extraction method in this embodiment may be implemented by a resource extraction apparatus, which may be integrated into an electronic device such as a server or a personal computer; this is not limited here.
As can be seen from the above, in this embodiment, corresponding dimension screening information is built in advance for different types of resources to form a feature set. For a resource to be extracted, the dimension screening information corresponding to the type of the resource is obtained from the feature set, and resource screening is performed in the resource set based on the feature information of the resource in the corresponding dimensions. On the one hand, because the dimension screening information indicates two or more dimensions to be used for resource screening, the accuracy with which different resources are distinguished is improved relative to the traditional scheme that distinguishes resources by title only, and the probability of extracting the same resource repeatedly is reduced. On the other hand, because the dimension screening information is tied to the type of the resource, the dimensions used for screening are better targeted, which improves the effectiveness of resource screening.
Embodiment 2
This embodiment differs from Embodiment 1 in that resource screening is performed according to a screening order; a reasonable screening order can improve the efficiency of resource screening. Specifically, referring to Fig. 2, the resource extraction method in this embodiment includes:
Step 201: determine the type of the resource currently to be extracted;
In this embodiment, the resource extraction method shown in Fig. 2 may be performed automatically for each resource to be extracted while a web crawler crawls resources. Alternatively, it may be performed automatically for each resource to be extracted in other resource extraction scenarios; this is not limited here.
In step 201, the type of the resource currently to be extracted is determined. In this embodiment, resource types include but are not limited to audio resources, video resources, text resources, and picture resources. Specifically, the type may be determined from the file suffix of the resource, or in other ways, for example from the source of the resource (if the resource comes from an audio database, its type is determined to be audio; if it comes from a picture database, its type is determined to be picture; and so on). This embodiment does not limit the specific way in which the type of the resource currently to be extracted is determined.
Step 202: obtain the dimension screening information corresponding to the type from a preset feature set;
In this embodiment, corresponding dimension screening information is set in advance for each type of resource, forming a feature set that contains preset dimension screening information for each type of resource. The dimension screening information indicates two or more dimensions to be used for resource screening, so that subsequent resource screening can be performed based on the feature information of those dimensions. Here, feature information means the value of the corresponding dimension.
Specifically, for step 202, refer to the description of step 102 in the embodiment shown in Fig. 1; details are not repeated here.
Step 203: according to the dimension screening information currently obtained, perform resource screening in the resource set one dimension at a time, based on the feature information of the resource currently to be extracted in the corresponding dimensions;
Each resource has feature information in two or more different dimensions, and the resource set contains the resources already extracted.
In this embodiment, in addition to indicating the two or more dimensions to be used for resource screening, the dimension screening information also indicates the screening order of those dimensions. In step 203, according to the dimension screening information obtained in step 202, resource screening is performed in the current resource set one dimension at a time, based on the feature information of the resource currently to be extracted in each dimension, until the screening result is empty or until screening has been completed for all dimensions indicated by the dimension screening information.
Optionally, performing resource screening in the resource set one dimension at a time, according to the dimension screening information currently obtained and based on the feature information of the resource currently to be extracted in the corresponding dimensions, includes: during the resource screening, storing in a cache the resources selected by each screening pass on one dimension; and, according to the dimension screening information, performing resource screening in the most recently cached resources based on the feature information of the resource currently to be extracted in the next dimension, until the screening result is empty or until screening has been completed for all dimensions indicated by the dimension screening information. For example, suppose the type of the resource currently to be extracted is video, the dimension screening information obtained in step 202 includes the three dimensions video stream encoding, video size, and video title, and the screening order it indicates is video title, video size, video stream encoding. In step 203, resource screening is first performed in the current resource set based on the video title of the resource to be extracted, to select the resources whose video title is identical or similar to that of the resource to be extracted (for ease of description, the selected resources are called the first screening set). If the first screening set is not empty, its resources are stored in the cache, and resource screening is performed in the most recently cached resources (that is, in the first screening set) based on the video size of the resource to be extracted, to select the resources whose video size is identical to that of the resource to be extracted (called the second screening set). If the second screening set is not empty, its resources are stored in the cache, and resource screening is performed in the most recently cached resources (that is, in the second screening set) based on the video stream encoding of the resource to be extracted, to select the resources whose video stream encoding is identical to that of the resource to be extracted (called the third screening set). At this point, screening has been completed for all dimensions indicated by the dimension screening information, so the third screening set is the final result of resource screening in this execution of step 203.
Of course, in this embodiment, the intermediate screening results (the first screening set or the second screening set above) may also be kept temporarily in ways other than caching; this is not limited here.
Different types of resources should have different screening orders, and a reasonable screening order can effectively reduce the time and system overhead spent on screening. Therefore, this embodiment may further record the time and/or system overhead spent on each resource screening. When the time and/or system overhead is not lower than a preset threshold, or is not lower than the preset threshold N consecutive times (N being a preset positive integer greater than 1), the screening order in the dimension screening information for that resource type is considered unsatisfactory, and the screening order can be adjusted automatically until a reasonable one is found. Specifically, the resource extraction method in this embodiment may further include: when the current resource screening is completed, obtaining the time spent on it; and if that time is not lower than a preset time threshold, updating, in the feature set, the screening order in the dimension screening information corresponding to the type of the resource currently to be extracted. And/or: when the current resource screening is completed, obtaining the system overhead spent on it; and if that overhead is not lower than a preset overhead threshold, updating, in the feature set, the screening order in the dimension screening information corresponding to the type of the resource currently to be extracted.
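The automatic adjustment of the screening order can be sketched as a small tracker that counts consecutive slow screenings. The patent does not say how a new order is chosen, so the rotation used below is only an assumed placeholder policy; the class name `OrderTuner` and its parameters are hypothetical.

```python
class OrderTuner:
    """Track screening durations for one resource type and adjust the order."""

    def __init__(self, dimensions, time_threshold, n=3):
        self.dimensions = list(dimensions)    # current screening order
        self.time_threshold = time_threshold  # preset time threshold, in seconds
        self.n = n                            # consecutive slow runs tolerated
        self.slow_runs = 0

    def record(self, elapsed):
        """Record one screening duration; return True if the order was changed."""
        if elapsed >= self.time_threshold:
            self.slow_runs += 1
        else:
            self.slow_runs = 0  # the run of slow screenings is broken
        if self.slow_runs >= self.n:
            # Assumed policy: move the first dimension to the end.
            self.dimensions.append(self.dimensions.pop(0))
            self.slow_runs = 0
            return True
        return False
```

In use, the duration of each screening could be measured with `time.perf_counter()` and passed to `record`; the same structure would apply to a system-overhead measurement with its own threshold.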
In addition, the embodiments of the present invention may further provide a setting interface, so that a user can adjust, through the setting interface, the dimension examination information of different types of resources and the examination order in the dimension examination information.
Step 204: if the result of the above resource examination is empty, extract the current resource to be extracted into the above resource set;
After the resource examination of step 203, if the final result of the resource examination is empty, the current resource to be extracted can be considered not to have been extracted before, and it can then be extracted into the above resource set. If, on the other hand, the final result of the resource examination is not empty, the current resource to be extracted can be considered to have been extracted already, and it can then be discarded, i.e., it is not extracted into the above resource set. Alternatively, if the final result of the resource examination is not empty, the user may be prompted to decide manually whether the resources found by the examination have already been extracted, which is not limited here.
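The decision in step 204 can be sketched as follows. This is an illustrative sketch only; the function name `extract_if_new` and the list-based resource set are assumptions, and the optional manual confirmation mentioned above is not modeled.

```python
from typing import Dict, List

Resource = Dict[str, str]  # hypothetical representation: dimension name -> value


def extract_if_new(resource_set: List[Resource],
                   candidate: Resource,
                   examination_result: List[Resource]) -> bool:
    """Extract the candidate into resource_set only when the final
    examination result is empty; otherwise discard it."""
    if not examination_result:
        resource_set.append(candidate)  # not extracted before
        return True
    return False  # treated as already extracted


resource_set: List[Resource] = []
video = {"name": "intro", "video_size": "10MB"}
was_added = extract_if_new(resource_set, video, examination_result=[])
```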
It should be noted that the resource extraction method in the embodiments of the present invention may be implemented by a resource extraction apparatus, which may specifically be integrated in an electronic device such as a server or a personal computer, which is not limited here.
As can be seen from the above, in the embodiments of the present invention, corresponding dimension examination information is constructed in advance for different types of resources to form a characteristic set. For a resource to be extracted, the dimension examination information corresponding to the type of the resource is obtained from the characteristic set, and, based on this dimension examination information, resource examination is carried out in the resource set based on the characteristic information of the corresponding dimensions of the current resource to be extracted. On the one hand, since the dimension examination information indicates two or more dimensions to be used for resource examination, the accuracy of distinguishing different resources is improved and the probability of repeatedly extracting the same resource is reduced, compared with the traditional scheme that distinguishes different resources by name only. On the other hand, since the dimension examination information is related to the type of the resource, the dimensions used for resource examination are more targeted, which improves the effectiveness of resource examination. In addition, the embodiments of the present invention also provide a scheme for carrying out resource examination according to an examination order, and the efficiency of resource examination can be improved by setting a reasonable examination order.
Embodiment three
Corresponding to the resource extraction method described in embodiment one or embodiment two, Fig. 3 shows a schematic diagram of the running environment of a resource extraction program related to the above resource extraction method provided by an embodiment of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown.
In the embodiments of the present invention, the above resource extraction program is installed and run in an electronic device. The electronic device may include, but is not limited to, one or more memories 31 (only one is shown in the figure) and one or more processors 32 (only one is shown in the figure); the memory 31 and the processor 32 are connected via a bus 33. Fig. 3 only shows an electronic device with components 31-33; it should be understood that not all of the illustrated components are required, and more components (such as a display) or fewer components may be implemented instead.
In some embodiments, the memory 31 may be an internal storage unit of the electronic device, such as a hard disk or internal memory of the electronic device. In other embodiments, the memory 31 may be an external storage device of the electronic device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card equipped on the electronic device. Further, the memory 31 may include both an internal storage unit of the electronic device and an external storage device. The memory 31 is used to store the application software installed on the electronic device and various kinds of data, such as the program code of the above resource extraction program. The memory 31 may also be used to temporarily store data that has been output or is to be output.
In some embodiments, the processor 32 may be a central processing unit (CPU), a microprocessor or another data processing chip, used to run the program code stored in the memory 31 or to process data, for example to execute the above resource extraction program.
Further, on the basis of Fig. 3, refer to Fig. 4, which is a functional block diagram of the resource extraction program corresponding to the resource extraction method of embodiment one or embodiment two provided by an embodiment of the present invention. In the embodiments of the present invention, the above resource extraction program may be divided into one or more modules, which are stored in the memory 31 and executed by one or more processors (the processor 32 in this embodiment) to complete the present invention. For example, in Fig. 4, the above resource extraction program may be divided into a determining module 41, an acquisition module 42, a resource examination module 43 and an extraction module 44. A module as referred to in the present invention is a series of computer program instruction segments capable of completing a specific function, and is more suitable than a program for describing the execution process of the resource extraction program in the above electronic device. The following description specifically introduces the functions of the determining module 41, the acquisition module 42, the resource examination module 43 and the extraction module 44.
The determining module 41 is used to determine the type of the current resource to be extracted;
The acquisition module 42 is used to obtain dimension examination information corresponding to the type from a preset characteristic set, wherein the dimension examination information indicates two or more dimensions to be used for resource examination, and the characteristic set includes: preset dimension examination information corresponding to each type of resource;
The resource examination module 43 is used to carry out resource examination in the resource set, according to the dimension examination information currently obtained by the acquisition module 42, based on the characteristic information of the corresponding dimensions of the current resource to be extracted, wherein each resource has characteristic information of two or more different dimensions, and the resource set includes: resources that have been extracted;
The extraction module 44 is used to extract the current resource to be extracted into the resource set when the result of the resource examination obtained by the resource examination module 43 is empty.
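How the four modules could cooperate is sketched below. This is a simplified illustration under stated assumptions, not the program of Fig. 4: the function names, the `type` field used to determine the resource type, and the contents of `CHARACTERISTIC_SET` are all hypothetical, and the examination here checks all indicated dimensions at once rather than in a configured order.

```python
from typing import Dict, List

Resource = Dict[str, str]  # hypothetical representation: dimension name -> value

# Hypothetical preset characteristic set: resource type -> dimensions to examine.
CHARACTERISTIC_SET: Dict[str, List[str]] = {
    "video": ["name", "video_size"],
    "document": ["title", "author"],
}


def determine_type(resource: Resource) -> str:
    """Role of determining module 41 (assumes the type is stored in a field)."""
    return resource["type"]


def acquire_dimensions(characteristic_set: Dict[str, List[str]],
                       resource_type: str) -> List[str]:
    """Role of acquisition module 42."""
    return characteristic_set[resource_type]


def examine_resources(resource_set: List[Resource], candidate: Resource,
                      dimensions: List[str]) -> List[Resource]:
    """Role of resource examination module 43: match on every dimension."""
    return [r for r in resource_set
            if all(r.get(d) == candidate.get(d) for d in dimensions)]


def process(resource_set: List[Resource], candidate: Resource) -> bool:
    """Full flow; the final step plays the role of extraction module 44."""
    dims = acquire_dimensions(CHARACTERISTIC_SET, determine_type(candidate))
    if not examine_resources(resource_set, candidate, dims):
        resource_set.append(candidate)
        return True
    return False
```

Calling `process` twice with the same video resource would add it to the resource set only the first time, while a video with the same name but a different size would still be extracted.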
Optionally, the dimension examination information also indicates the examination order of the two or more dimensions to be used for resource examination. The resource examination module 43 is specifically used to: according to the dimension examination information currently obtained by the acquisition module 42, carry out resource examination in the resource set based on the characteristic information of the corresponding dimensions of the current resource to be extracted, one dimension after another. Further, the resource examination module 43 is specifically used to: during the resource examination, store in the cache the resources filtered out each time based on the characteristic information of one dimension; and, according to the dimension examination information, carry out resource examination in the most recently cached resources based on the characteristic information of the next dimension of the current resource to be extracted, until the result of the resource examination is empty, or until the examination of all dimensions indicated by the dimension examination information has been completed.
Optionally, the above resource extraction program may further be divided into: a first recording module, used to obtain the duration spent on the current resource examination when the resource examination module 43 completes the current resource examination; and a first update module, used to update, in the characteristic set, the examination order in the dimension examination information corresponding to the type of the current resource to be extracted when the duration spent on the current resource examination is not less than a preset time threshold.
Optionally, the above resource extraction program may further be divided into: a second recording module, used to obtain the system loss spent on the current resource examination when the resource examination module 43 completes the current resource examination; and a second update module, used to update, in the characteristic set, the examination order in the dimension examination information corresponding to the type of the current resource to be extracted when the system loss spent on the current resource examination is not less than a preset loss threshold.
It should be understood that the electronic device in the embodiments of the present invention can be used to implement all the technical solutions in the above method embodiments. For parts not described or recorded in detail in this embodiment, reference may be made to the description of the above method embodiments, which is not repeated here.
As can be seen from the above, in the embodiments of the present invention, corresponding dimension examination information is constructed in advance for different types of resources to form a characteristic set. For a resource to be extracted, the dimension examination information corresponding to the type of the resource is obtained from the characteristic set, and, based on this dimension examination information, resource examination is carried out in the resource set based on the characteristic information of the corresponding dimensions of the current resource to be extracted. On the one hand, since the dimension examination information indicates two or more dimensions to be used for resource examination, the accuracy of distinguishing different resources is improved and the probability of repeatedly extracting the same resource is reduced, compared with the traditional scheme that distinguishes different resources by name only. On the other hand, since the dimension examination information is related to the type of the resource, the dimensions used for resource examination are more targeted, which improves the effectiveness of resource examination.
For convenience and brevity of description, only the division of the above functional units and modules is used as an example. In practical applications, the above functions may be assigned to different functional units and modules as needed, i.e., the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated in one processing unit, or each unit may exist physically on its own, or two or more units may be integrated in one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing them from each other and are not intended to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the system embodiment described above is only schematic; the division of the modules or units is only a division by logical function, and there may be other ways of division in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated in one processing module, or each module may exist physically on its own, or two or more modules may be integrated in one module. The above integrated module may be implemented in the form of hardware or in the form of a software functional module.
If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to perform all or part of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The embodiments described above are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention.

Claims (10)

  1. A resource extraction method, characterized by comprising:
    determining the type of a current resource to be extracted;
    obtaining dimension examination information corresponding to the type from a preset characteristic set, wherein the dimension examination information indicates two or more dimensions to be used for resource examination, and the characteristic set includes: preset dimension examination information corresponding to each type of resource;
    according to the currently obtained dimension examination information, carrying out resource examination in a resource set based on the characteristic information of the corresponding dimensions of the current resource to be extracted, wherein each resource has characteristic information of two or more different dimensions, and the resource set includes: resources that have been extracted;
    if the result of the resource examination is empty, extracting the current resource to be extracted into the resource set.
  2. The resource extraction method according to claim 1, characterized in that the dimension examination information also indicates: the examination order of the two or more dimensions to be used for resource examination;
    the carrying out, according to the currently obtained dimension examination information, resource examination in the resource set based on the characteristic information of the corresponding dimensions of the current resource to be extracted is: according to the currently obtained dimension examination information, carrying out resource examination in the resource set based on the characteristic information of the corresponding dimensions of the current resource to be extracted, one dimension after another.
  3. The resource extraction method according to claim 1, characterized in that the carrying out, according to the currently obtained dimension examination information, resource examination in the resource set based on the characteristic information of the corresponding dimensions of the current resource to be extracted, one dimension after another, comprises:
    during the resource examination, storing in a cache the resources filtered out each time based on the characteristic information of one dimension;
    according to the dimension examination information, carrying out resource examination in the most recently cached resources based on the characteristic information of the next dimension of the current resource to be extracted, until the result of the resource examination is empty, or until the examination of all dimensions indicated by the dimension examination information has been completed.
  4. The resource extraction method according to claim 2 or 3, characterized in that the resource extraction method further comprises:
    when the current resource examination is completed, obtaining the duration spent on the current resource examination;
    if the duration spent on the current resource examination is not less than a preset time threshold, updating, in the characteristic set, the examination order in the dimension examination information corresponding to the type of the current resource to be extracted.
  5. The resource extraction method according to claim 2 or 3, characterized in that the resource extraction method further comprises:
    when the current resource examination is completed, obtaining the system loss spent on the current resource examination;
    if the system loss spent on the current resource examination is not less than a preset loss threshold, updating, in the characteristic set, the examination order in the dimension examination information corresponding to the type of the current resource to be extracted.
  6. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by at least one processor, the steps of the method according to any one of claims 1-5 are implemented.
  7. An electronic device, characterized in that the electronic device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer program:
    determining the type of a current resource to be extracted;
    obtaining dimension examination information corresponding to the type from a preset characteristic set, wherein the dimension examination information indicates two or more dimensions to be used for resource examination, and the characteristic set includes: preset dimension examination information corresponding to each type of resource;
    according to the currently obtained dimension examination information, carrying out resource examination in a resource set based on the characteristic information of the corresponding dimensions of the current resource to be extracted, wherein each resource has characteristic information of two or more different dimensions, and the resource set includes: resources that have been extracted;
    if the result of the resource examination is empty, extracting the current resource to be extracted into the resource set.
  8. The electronic device according to claim 7, characterized in that the dimension examination information also indicates: the examination order of the two or more dimensions to be used for resource examination;
    the carrying out, according to the currently obtained dimension examination information, resource examination in the resource set based on the characteristic information of the corresponding dimensions of the current resource to be extracted is: according to the currently obtained dimension examination information, carrying out resource examination in the resource set based on the characteristic information of the corresponding dimensions of the current resource to be extracted, one dimension after another.
  9. The electronic device according to claim 8, characterized in that the carrying out, according to the currently obtained dimension examination information, resource examination in the resource set based on the characteristic information of the corresponding dimensions of the current resource to be extracted, one dimension after another, comprises:
    during the resource examination, storing in a cache the resources filtered out each time based on the characteristic information of one dimension;
    according to the dimension examination information, carrying out resource examination in the most recently cached resources based on the characteristic information of the next dimension of the current resource to be extracted, until the result of the resource examination is empty, or until the examination of all dimensions indicated by the dimension examination information has been completed.
  10. The electronic device according to claim 8 or 9, characterized in that the processor further implements the following steps when executing the computer program:
    when the current resource examination is completed, obtaining the duration spent on the current resource examination;
    if the duration spent on the current resource examination is not less than a preset time threshold, updating, in the characteristic set, the examination order in the dimension examination information corresponding to the type of the current resource to be extracted.
CN201710605505.4A 2017-07-24 2017-07-24 Resource Access method, computer-readable recording medium and electronic equipment Pending CN107729343A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710605505.4A CN107729343A (en) 2017-07-24 2017-07-24 Resource Access method, computer-readable recording medium and electronic equipment
PCT/CN2018/076524 WO2019019619A1 (en) 2017-07-24 2018-02-12 Resource extraction method, computer readable storage medium, and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710605505.4A CN107729343A (en) 2017-07-24 2017-07-24 Resource Access method, computer-readable recording medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN107729343A true CN107729343A (en) 2018-02-23

Family

ID=61201711

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710605505.4A Pending CN107729343A (en) 2017-07-24 2017-07-24 Resource Access method, computer-readable recording medium and electronic equipment

Country Status (2)

Country Link
CN (1) CN107729343A (en)
WO (1) WO2019019619A1 (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100235333A1 (en) * 2009-03-16 2010-09-16 International Business Machines Corporation Apparatus and method to sequentially deduplicate data
CN102023977A (en) * 2009-09-21 2011-04-20 陈俊 Data filtering method and data filtering system and application thereof
CN103024078A (en) * 2012-12-31 2013-04-03 无锡城市云计算中心有限公司 Resource allocation method and device in cloud computing environment
CN103036697A (en) * 2011-10-08 2013-04-10 阿里巴巴集团控股有限公司 Multi-dimensional data duplicate removal method and system
CN104750724A (en) * 2013-12-30 2015-07-01 亿阳信通股份有限公司 Information filtering method and information filtering device
CN105574004A (en) * 2014-10-10 2016-05-11 阿里巴巴集团控股有限公司 Webpage deduplication method and device
CN105630802A (en) * 2014-10-30 2016-06-01 阿里巴巴集团控股有限公司 Webpage duplication removal method and apparatus
CN105989022A (en) * 2015-01-30 2016-10-05 北京陌陌信息技术有限公司 Method and system for eliminating repetition of data
CN106126634A (en) * 2016-06-22 2016-11-16 武汉斗鱼网络科技有限公司 A kind of master data duplicate removal treatment method based on live industry and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102567313B (en) * 2010-12-07 2016-08-24 盛乐信息技术(上海)有限公司 Progressive webpage library deduplication system and its implementation

Also Published As

Publication number Publication date
WO2019019619A1 (en) 2019-01-31

Similar Documents

Publication Publication Date Title
CN108537292A (en) Semantic segmentation network training method, image, semantic dividing method and device
CN111190939B (en) User portrait construction method and device
CN107423613B (en) Method and device for determining device fingerprint according to similarity and server
CN108170750A (en) A kind of face database update method, system and terminal device
CN109118296A (en) Movable method for pushing, device and electronic equipment
CN110992167A (en) Bank client business intention identification method and device
CN109242553A (en) A kind of user behavior data recommended method, server and computer-readable medium
CN103605714B (en) The recognition methods of website abnormal data and device
CN106980497A (en) Webpage and website performance optimization method and device
CN109740019A (en) A kind of method, apparatus to label to short-sighted frequency and electronic equipment
CN107688591A (en) A kind of actuarial treating method and apparatus
CN107808306A (en) Cutting method, electronic installation and the storage medium of business object based on tag library
CN106844685A (en) Method, device and server for recognizing website
CN107491674A (en) Feature based information carries out the method and device of user's checking
CN110609908A (en) Case serial-parallel method and device
CN109766470A (en) Image search method, device and processing equipment
CN107368526A (en) A kind of data processing method and device
CN109784394A (en) A kind of recognition methods, system and the terminal device of reproduction image
CN107657030A (en) Collect method, apparatus, terminal device and storage medium that user reads data
CN113392303A (en) Background blasting method, device, equipment and computer readable storage medium
CN108197795A (en) The account recognition methods of malice group, device, terminal and storage medium
CN108182595A (en) A kind of formulation migration efficiency method and device
CN107193870A (en) The extracting method and system of web page contents
CN103336800A (en) Fingerprint storage and comparison method based on behavior analysis
CN108959289A (en) Categories of websites acquisition methods and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20180607

Address after: 518000 Room 201, building A, 1 front Bay Road, Shenzhen Qianhai cooperation zone, Shenzhen, Guangdong

Applicant after: Shenzhen one ledger Intelligent Technology Co., Ltd.

Address before: 200000 Xuhui District, Shanghai Kai Bin Road 166, 9, 10 level.

Applicant before: Shanghai Financial Technologies Ltd

REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1250067

Country of ref document: HK

RJ01 Rejection of invention patent application after publication

Application publication date: 20180223

RJ01 Rejection of invention patent application after publication