CN106326445B - Webpage content evaluation method based on sensing information quantity - Google Patents

Webpage content evaluation method based on sensing information quantity

Info

Publication number
CN106326445B
Authority
CN
China
Prior art keywords
information
webpage
sensing information
block
sensing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610737560.4A
Other languages
Chinese (zh)
Other versions
CN106326445A (en)
Inventor
李德识
刘鸣柳
陈健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN201610737560.4A
Publication of CN106326445A
Application granted
Publication of CN106326445B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/12 Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The purpose of the present invention is to provide a webpage content evaluation method based on sensing information quantity. First, the webpage content is analyzed, and each webpage data block is judged to be a sensing information block or not according to its content and to whether it has spatial description characteristics and time variation characteristics. Then, the visual information quantity of every sensing information block is calculated in turn from the block's size and position. Using difference images of the webpage, the update time interval of each sensing information block is calculated, and the time information quantity of the block is obtained from its update frequency. The visual information quantity of each sensing information block is multiplied by its time information quantity, and the products are summed to obtain the sensing information quantity of the webpage content. The invention makes it possible to evaluate webpages with respect to their sensing information content, and the evaluation results can support research on device and data search in the Internet of Things.

Description

Webpage content evaluation method based on sensing information quantity
Technical Field
The invention belongs to the field of Internet of Things information search, and particularly relates to a webpage content evaluation method based on sensing information quantity.
Background
With the spread and development of Internet of Things technology, the number of sensors grows day by day. Faced with the mass of data they continuously generate, Internet of Things search technology has become a pressing research problem. Compared with data in the traditional internet, data generated by sensors has three dimensions: content, time and space. Meanwhile, because of privacy, security, transmission load and similar concerns, a great number of sensors currently transmit their data to the internet and present it in webpage form for users to access freely.
A webpage shows the data collected by sensors to the user in various forms such as videos, pictures, tables and curves, so evaluating how much sensing data a webpage contains is highly valuable for research on webpage sensing information search. At present, most work on webpage evaluation uses visual characteristics, link contents and the like as reference conditions and examines indexes such as the complexity and reliability of a webpage. Few results analyze the sensing information contained in a webpage, and analyzing webpage content only by visual characteristics, links and the like is neither representative nor targeted for this purpose.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a webpage evaluation method based on sensing information quantity, and aims to provide data support for a webpage sensing information search technology in the search of the Internet of things according to the sensing information quantity in a webpage.
The purpose of the invention is realized as follows:
(1) Compared with the contents of other data blocks, sensing data has additional attributes in the temporal and spatial dimensions. The judgment of webpage sensing information blocks starts from these two conditions: webpage content that has both time variation characteristics and space description characteristics is called sensing information.
(2) Since the size of the spatial information quantity of data cannot be expressed, the analysis of the space description characteristics is used only to assist in judging whether a webpage data block is a sensing information block. From the analysis of the time variation characteristics, the concept of time information quantity is defined for a webpage data block; generally speaking, the time information quantity of a sensing information block is greater than 0. Meanwhile, from the perspective of user experience, the concept of visual information quantity is defined for each webpage data block. For each webpage data block, the sensing information quantity is characterized by the product of the time information quantity and the visual information quantity. For each webpage, the sensing information quantity is characterized by the sum of the sensing information quantities of all its sensing information blocks.
(3) According to research findings, users pay different amounts of visual attention to contents in different areas of a webpage. Based on this difference, a visual evaluation method using a position weight is proposed for the distribution of different data contents in the webpage. The line connecting the upper left corner to the lower right corner is taken as a boundary: the closer a webpage data block lies to this boundary, the higher the user's attention and the higher the position weight of the data block. Meanwhile, the user's point of attention gradually spreads from the upper left corner toward the middle as time passes, so the position weight is inversely proportional to the distance from the data block to the upper left vertex of the webpage.
(4) To account for the sizes of the visual areas of different data blocks, an effective area is defined, in combination with the position weight, to represent the visual information quantity of each webpage data block. The visual information quantity is directly proportional both to the area of the webpage data block and to its position weight.
(5) Sensor data has a temporal attribute, and data that is closer to real time clearly has higher query value. Considering the update frequency of webpage data blocks, sensing data blocks that are updated more frequently have a larger time information quantity.
The invention is realized by adopting the following technical scheme:
A webpage content evaluation method based on sensing information quantity comprises the following steps:
Step 1: count all sensing information blocks in the webpage;
Step 2: calculate the visual information quantity of each sensing information block in turn;
Step 3: obtain the update frequency of each sensing information block by combining semantic discrimination with difference-image analysis, and calculate the time information quantity of the block;
Step 4: calculate the sensing information quantity of each sensing information block in turn, and accumulate the results to obtain the overall sensing information quantity of the webpage.
The specific process of counting all sensing information blocks in the webpage in step 1 is as follows:
The webpage is segmented into data blocks. By analyzing the semantics and the update behavior of the webpage content, each data block is judged as to whether it contains sensing information. Data blocks that contain sensing information are defined as sensing information blocks, and their number is counted. Let Φ denote the set of all sensing information blocks in the webpage.
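The block judgment in step 1 can be sketched as a simple filter. This is only an illustration under stated assumptions: the `DataBlock` type and its two boolean fields are hypothetical stand-ins for the results of the semantic and update analysis, which the text does not specify in detail.

```python
from dataclasses import dataclass


@dataclass
class DataBlock:
    """One segmented region of a webpage (hypothetical structure)."""
    name: str
    has_time_variation: bool       # content changes between observations
    has_spatial_description: bool  # content describes a place or location


def sensing_blocks(blocks):
    """Return the set Phi: blocks with both temporal and spatial character."""
    return [b for b in blocks
            if b.has_time_variation and b.has_spatial_description]


# Made-up example page with four segmented blocks.
page = [
    DataBlock("camera feed", True, True),
    DataBlock("temperature reading", True, True),
    DataBlock("navigation bar", False, False),
    DataBlock("rotating advert", True, False),
]
phi = sensing_blocks(page)
print(len(phi))  # number of sensing information blocks in the page
```

Requiring both flags mirrors the statement that only content with both time variation and space description characteristics counts as sensing information.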
In step 2, the visual information quantity of a sensing information block is calculated as follows:
ABCD represents the whole webpage, and EFGH represents a sensing information block $b_i$ in the webpage. O and O′ represent the centroids of the webpage and of the information block $b_i$, respectively. r represents the length of AO; the other two parameters are the length of AO′ and the included angle between AO and AO′. The position weight of the webpage information block $b_i$ is then:
The effective area is used to represent the visual information quantity, so the effective area of the information block is:
where the formula uses the area of information block $b_i$ and the overall area of the webpage, $S_p$.
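The geometric parameters named above can be computed directly from block coordinates, as in the sketch below. The position-weight and effective-area formulas themselves are not preserved in this text, so `example_weight` and `effective_area` are hypothetical placeholders: they only follow the stated tendencies (weight falls as the block moves away from the diagonal and away from vertex A; visual information grows with area and weight) and should not be read as the patented formulas.

```python
import math


def block_geometry(page_w, page_h, x, y, w, h):
    """Return (|AO|, |AO'|, angle) for a block with top-left corner (x, y).

    A, the page's top-left vertex, is the origin; y grows downward.
    """
    ox, oy = page_w / 2, page_h / 2   # O: centroid of the page ABCD
    bx, by = x + w / 2, y + h / 2     # O': centroid of the block EFGH
    r = math.hypot(ox, oy)            # length of AO
    r_block = math.hypot(bx, by)      # length of AO'
    # Angle between AO and AO' from the dot product, clamped against rounding.
    cos_theta = (ox * bx + oy * by) / (r * r_block)
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    return r, r_block, theta


def example_weight(r, r_block, theta):
    """HYPOTHETICAL weight: smaller for blocks far from A or off the diagonal."""
    return math.cos(theta) * r / (r + r_block)


def effective_area(weight, block_area, page_area):
    """ASSUMED form: position weight times the block's share of the page area."""
    return weight * block_area / page_area


# A 200x200 block centered on a 1000x800 page, and one in the lower-left corner.
r, rb, th = block_geometry(1000, 800, 400, 300, 200, 200)
w_center = example_weight(r, rb, th)
r2, rb2, th2 = block_geometry(1000, 800, 0, 600, 200, 200)
w_corner = example_weight(r2, rb2, th2)
print(w_center, w_corner, effective_area(w_center, 200 * 200, 1000 * 800))
```

Passing the weight in as a plain number keeps the geometry reusable if a different weight function is substituted.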
In step 3, the time information quantity of a sensing information block is calculated as follows:
First, the acquired webpage source code of the sensing information block is preprocessed to obtain its text content. The update information expressed in the webpage text is extracted by matching the text content against update templates. A difference image is then acquired according to the extracted update information and checked by analyzing its pixel values: if the difference image contains pixels that are not 0, the update information is matched successfully. Otherwise, if text extraction finds no update information, or the difference-image check disagrees with the update information, the update frequency is determined from image differences.
Let the sensing information block $b_i$ have an update time interval, and take the appearance times $t_1$ and $t_2$ of the first two non-zero difference images; then
Thus, the time information quantity of the sensing information block $b_i$ is calculated:
In step 3, T is set to 86400 s, the length of one day, so that the ratio of T to the update interval represents the number of updates in a day. An update interval of the sensing information blocks is defined; therefore:
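The difference-image part of step 3 can be sketched as follows. The sketch assumes that the update interval is the gap $t_2 - t_1$ between the first two non-zero difference images, which the surrounding text suggests but whose formula is not preserved here. Frames are plain nested lists of grayscale values, and all function names are illustrative.

```python
def difference_image(prev, curr):
    """Pixel-wise absolute difference of two equally sized grayscale frames."""
    return [[abs(a - b) for a, b in zip(row_p, row_c)]
            for row_p, row_c in zip(prev, curr)]


def has_change(diff):
    """True if the difference image contains any non-zero pixel."""
    return any(pixel != 0 for row in diff for pixel in row)


def update_interval(snapshots):
    """Estimate the update interval from time-ordered (timestamp_s, frame) pairs.

    Returns t2 - t1 for the first two non-zero difference images (an assumed
    reading of the text), or None if fewer than two changes are observed.
    """
    change_times = []
    for (_, prev), (t, curr) in zip(snapshots, snapshots[1:]):
        if has_change(difference_image(prev, curr)):
            change_times.append(t)
            if len(change_times) == 2:
                return change_times[1] - change_times[0]
    return None


# Made-up capture: one frame every 10 s; the block content changes at 20 s and 50 s.
A, B, C = [[0, 0], [0, 0]], [[0, 5], [0, 0]], [[9, 5], [0, 0]]
shots = [(0, A), (10, A), (20, B), (30, B), (40, B), (50, C), (60, C)]
interval = update_interval(shots)
print(interval)
```

Using two consecutive changes rather than the time to the first change avoids depending on when the capture happened to start.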
In step 4, the overall sensing information quantity of the webpage is calculated as follows:
For each sensing information block $b_i$ in the webpage, the sensing information quantity is defined as:
Therefore, for a complete webpage, the sensing information quantity of the webpage is:
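The aggregation in step 4 follows the earlier statement that a block's sensing information quantity is the product of its visual and time information quantities, and that a webpage's quantity is the sum over all sensing information blocks. A minimal sketch, with hypothetical function names and made-up input values:

```python
def block_sensing_information(visual, temporal):
    """Sensing information of one block: visual quantity times time quantity."""
    return visual * temporal


def page_sensing_information(blocks):
    """Sum over all sensing information blocks in Phi.

    blocks: iterable of (visual_quantity, time_quantity) pairs.
    """
    return sum(block_sensing_information(v, t) for v, t in blocks)


# Two sensing blocks with illustrative values.
total = page_sensing_information([(0.025, 4.0), (0.1, 2.0)])
print(total)
```

A page with no sensing information blocks scores 0 under this scheme.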
Compared with the prior art, the invention has the following advantages:
For the first time, a webpage evaluation method is proposed that targets the sensing information contained in a webpage. Starting from the dimensional characteristics of sensing information, and taking the user's visual perception into account, the sensing information quantity is proposed as a measure of the sensing information presented in a webpage. Traditional webpage evaluation usually relies on manual scoring or star ratings, whereas this method quantifies each index of webpage information and produces the evaluation result by calculation. With the evaluation index designed by the invention, webpages containing sensing information of different quantities, types, visual intensities and real-time degrees can be effectively distinguished, which can provide a reference for ranking webpages in sensing information search results in the Internet of Things.
Drawings
Fig. 1 is a schematic illustration of two webpage sensing information blocks, wherein fig. 1(a) is a schematic illustration of a mixed sensing information webpage information block, and fig. 1(b) is a schematic illustration of a single-chart sensing information webpage information block;
FIG. 2 is a schematic spatial diagram of parameters of visual information in the present invention;
FIG. 3 is a flowchart of the calculation of the update frequency of web pages in the present invention;
FIG. 4 is a timing diagram for calculating the update frequency from difference images in the present invention.
Detailed Description
In order to facilitate the understanding and implementation of the present invention for those of ordinary skill in the art, the present invention is further described in detail below with reference to the accompanying drawings and examples.
In FIG. 2, rectangle ABCD represents the entire webpage and EFGH represents an identified sensing information block. The evaluation method of the invention mainly comprises the following steps:
Step 1: count all sensing information blocks in the webpage.
Step 2: calculate the visual information quantity of each sensing information block in turn.
Step 3: obtain the update frequency of each sensing information block by semantic discrimination or difference-image analysis, and calculate the time information quantity of the block.
Step 4: calculate the sensing information quantity of each sensing information block in turn, and accumulate the results to obtain the overall sensing information quantity of the webpage.
The specific embodiment comprises the following steps:
1. Counting all sensing information blocks in the webpage
As shown in FIG. 1a and FIG. 1b, each webpage is segmented into the data blocks outlined by thick rectangular frames. By analyzing the semantics and the update behavior of the webpage content, it can be determined that both webpages shown contain sensing information; the sensing information blocks are marked in the figures. Two of the four data blocks in FIG. 1a are sensing information blocks, showing respectively the video and the temperature of a certain area of Toronto as collected by a camera and a temperature sensor. The webpage in FIG. 1b (from a site whose address survives in this text only as "com") contains four sensing information blocks, which show the soil monitoring conditions of a certain place uploaded by an independently registered user, including the thickness of gypsum contained in the soil, the system voltage and the temperature.
Let Φ denote the set of all sensing information blocks in the webpage.
2. Calculating the visual information quantity of an information block
As shown in FIG. 2, ABCD represents the whole webpage, and EFGH represents a certain block $b_i$ in the webpage. O and O′ represent the centroids of the webpage and of the information block $b_i$, respectively. r represents the length of AO; the other two parameters are the length of AO′ and the angle between AO and AO′. The position weight of the webpage information block $b_i$ is then:
The effective area is used to represent the visual information quantity, so the effective area of the information block is:
where the formula uses the area of information block $b_i$ and the overall area of the webpage, $S_p$.
3. Obtaining the time information quantity of a sensing information block
As shown in FIG. 3, the extraction of the update frequency combines semantic analysis with difference-image analysis to improve the accuracy and efficiency of frequency extraction. First, the acquired webpage source code is preprocessed to obtain its text content. The update information expressed in the webpage text is extracted by matching the text content against update templates. A difference image is then obtained according to the extracted update information and checked by analyzing its pixel values. If the difference image contains pixels that are not 0, the update information is matched successfully. If text extraction finds no update information, or the difference-image check is inconsistent with the update information, the update frequency is determined from image differences.
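The flow just described can be sketched as text-template matching with an image-based fallback. Everything concrete here is an assumption for illustration: the single English update template, the unit table, and the two callbacks (`changed_after`, which reports whether a difference image taken after the claimed interval is non-zero, and `measure_by_images`, which estimates the interval from image differences alone) are hypothetical and not taken from the source.

```python
import re

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}

# Hypothetical update template, e.g. "updated every 5 minutes".
UPDATE_TEMPLATE = re.compile(
    r"(?:updated?|refresh(?:ed|es)?)\s+every\s+(\d+)?\s*(second|minute|hour)s?",
    re.IGNORECASE,
)


def interval_from_text(text):
    """Return the update interval in seconds stated in the text, or None."""
    match = UPDATE_TEMPLATE.search(text)
    if match is None:
        return None
    count = int(match.group(1)) if match.group(1) else 1
    return count * UNIT_SECONDS[match.group(2).lower()]


def resolve_update_interval(text, changed_after, measure_by_images):
    """Use the text claim if a difference image confirms it, else fall back."""
    claimed = interval_from_text(text)
    if claimed is not None and changed_after(claimed):
        return claimed            # non-zero difference image: claim confirmed
    return measure_by_images()    # no claim, or claim not confirmed


print(interval_from_text("Data updated every 5 minutes"))
print(resolve_update_interval("Contact us", lambda s: True, lambda: 42))
```

Trying the text first is cheap; the image-based measurement runs only when the text gives no usable or confirmed answer.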
As shown in FIG. 4, let the data block $b_i$ have an update time interval. Since webpage capture may start at any point in time, the occurrence times $t_1$ and $t_2$ of the first two non-zero difference images are taken; then
Thus, the time information quantity of the sensing information block $b_i$ can be calculated:
Since most information blocks in a webpage are updated many times a day, and most sensor data is more useful when it was collected within the same day, the timeliness of the data is defined to be at most one day. By default T = 86400 s, the length of one day, so the ratio of T to the update interval indicates the number of updates in a day. For video sensor data, the frame rate used by commonly used streaming media transmission protocols is usually in the range of 20 to 30 fps. Therefore, a fixed update time interval is defined for video-type sensing information blocks, which gives:
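As a worked illustration of the daily update count, the arithmetic below uses T = 86400 s from the text. The symbols $\Delta t$ (update interval), $N$ (updates per day) and $f$ (frame rate) are assumed notation, and taking the video update interval as one frame period, $1/f$, is an assumption; the specific interval the source fixes for video blocks is not preserved in this text.

```latex
% Assumed notation: \Delta t = update interval, N = updates per day, f = frame rate.
\[
N = \frac{T}{\Delta t}, \qquad T = 86400\ \text{s}
\]
% A block refreshed once a minute:
\[
\Delta t = 60\ \text{s} \;\Rightarrow\; N = \frac{86400}{60} = 1440
\]
% Video at the two ends of the quoted frame-rate range, with \Delta t = 1/f:
\[
f = 20\ \text{fps} \;\Rightarrow\; N = 86400 \times 20 = 1\,728\,000
\]
\[
f = 30\ \text{fps} \;\Rightarrow\; N = 86400 \times 30 = 2\,592\,000
\]
```

The three orders of magnitude between a once-a-minute block and a video block show why video-type blocks are given their own interval definition.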
4. Calculating the sensing information quantity of the webpage
For each sensing information block $b_i$ in the webpage, the sensing information quantity is defined as:
Therefore, for a complete webpage, the sensing information quantity of the webpage is:

Claims (2)

1. A webpage content evaluation method based on sensing information quantity, characterized in that the method comprises the following steps:
Step 1: counting all sensing information blocks in the webpage;
Step 2: calculating the visual information quantity of each sensing information block in turn;
Step 3: obtaining the update frequency of each sensing information block by combining semantic discrimination with difference-image analysis, and calculating the time information quantity of the block;
Step 4: calculating the sensing information quantity of each sensing information block in turn, and accumulating the results to obtain the overall sensing information quantity of the webpage;
the specific process of counting all sensing information blocks in the webpage in step 1 is as follows:
the webpage is segmented into data blocks; by analyzing the semantics and the update behavior of the webpage content, each data block is judged as to whether it contains sensing information; data blocks that contain sensing information are defined as sensing information blocks, and their number is counted; let Φ denote the set of all sensing information blocks in the webpage;
in step 2, the visual information quantity of a sensing information block is calculated as follows:
ABCD represents the whole webpage, and EFGH represents a sensing information block $b_i$ in the webpage; O and O′ represent the centroids of the webpage and of the information block $b_i$, respectively; r represents the length of AO; the other two parameters are the length of AO′ and the included angle between AO and AO′; the position weight of the webpage information block $b_i$ is then:
the effective area is used to represent the visual information quantity, so the effective area of the information block is:
where the formula uses the area of information block $b_i$ and the overall area of the webpage, $S_p$;
in step 3, the time information quantity of a sensing information block is calculated as follows:
first, the acquired webpage source code of the sensing information block is preprocessed to obtain its text content; the update information expressed in the webpage text is extracted by matching the text content against update templates; a difference image is acquired according to the extracted update information and checked by analyzing its pixel values; if the difference image contains pixels that are not 0, the update information is matched successfully; otherwise, if text extraction finds no update information, or the difference-image check disagrees with the update information, the update frequency is determined from image differences;
let the sensing information block $b_i$ have an update time interval, and take the appearance times $t_1$ and $t_2$ of the first two non-zero difference images; then
thus, the time information quantity of the sensing information block $b_i$ is calculated:
in step 4, the overall sensing information quantity of the webpage is calculated as follows:
for each sensing information block $b_i$ in the webpage, the sensing information quantity is defined as:
therefore, for a complete webpage, the sensing information quantity of the webpage is:
2. The webpage content evaluation method based on sensing information quantity according to claim 1, characterized in that in step 3, T is set to 86400 s, the length of one day, so that the ratio of T to the update interval represents the number of updates in a day; an update interval of the sensing information blocks is defined; therefore:
CN201610737560.4A 2016-08-26 2016-08-26 Webpage content evaluation method based on sensing information quantity Active CN106326445B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610737560.4A CN106326445B (en) 2016-08-26 2016-08-26 Webpage content evaluation method based on sensing information quantity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610737560.4A CN106326445B (en) 2016-08-26 2016-08-26 Webpage content evaluation method based on sensing information quantity

Publications (2)

Publication Number Publication Date
CN106326445A CN106326445A (en) 2017-01-11
CN106326445B true CN106326445B (en) 2019-09-17

Family

ID=57790935

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610737560.4A Active CN106326445B (en) 2016-08-26 2016-08-26 Webpage content evaluation method based on sensing information quantity

Country Status (1)

Country Link
CN (1) CN106326445B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101944104A (en) * 2010-08-19 2011-01-12 百度在线网络技术(北京)有限公司 Evaluation method and equipment for importance of webpage sub-blocks
CN103514234A (en) * 2012-06-30 2014-01-15 北京百度网讯科技有限公司 Method and device for extracting page information
CN103020129A (en) * 2012-11-20 2013-04-03 中兴通讯股份有限公司 Text content extraction method and text content extraction device
CN103927365A (en) * 2014-04-21 2014-07-16 武汉大学 Web page time sensibility measurement method based on energy function

Also Published As

Publication number Publication date
CN106326445A (en) 2017-01-11

Similar Documents

Publication Publication Date Title
US10540804B2 (en) Selecting time-distributed panoramic images for display
US9286624B2 (en) System and method of displaying annotations on geographic object surfaces
US9436886B2 (en) System and method of determining building numbers
EP3161726A1 (en) Using image features to extract viewports from images
JP2012014544A (en) Coordinate recommendation apparatus, coordinate recommendation method and program therefor
US9396584B2 (en) Obtaining geographic-location related information based on shadow characteristics
CN103530649A (en) Visual searching method applicable mobile terminal
US11232149B2 (en) Establishment anchoring with geolocated imagery
CN111699478B (en) Image retrieval device, image retrieval method, electronic apparatus, and control method thereof
US9437004B2 (en) Surfacing notable changes occurring at locations over time
Zhuang et al. Anaba: An obscure sightseeing spots discovering system
WO2021164131A1 (en) Map display method and system, computer device and storage medium
CN104063444A (en) Method and device for generating thumbnail
US9473745B2 (en) System and method for providing live imagery associated with map locations
Han et al. Video data model and retrieval service framework using geographic information
CN106326445B (en) A kind of web page contents evaluation method based on heat transfer agent amount
EP2727073A1 (en) Spatially organized image collections on mobile devices
Seo et al. Sensor-rich video exploration on a map interface
He et al. Salient region detection combining spatial distribution and global contrast
CN104850600A (en) Method and device for searching images containing faces
Yang et al. User models of subjective image quality assessment on virtual viewpoint in free-viewpoint video system
CN109657143B (en) Method, device and equipment for pushing exhibit information and storage medium
GB2552969A (en) Image processing system
Alrababah Classification of White Blood Cells Empowered with Auto Encoder and CNN
CN116992120A (en) Account recommendation method, device, electronic equipment, storage medium and program product

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant