CN106874311B - Method and device for determining page content attribution column - Google Patents

Publication number
CN106874311B
Authority
CN
China
Prior art keywords
column
page content
target
columns
page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510927617.2A
Other languages
Chinese (zh)
Other versions
CN106874311A (en)
Inventor
唐喆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd filed Critical Beijing Gridsum Technology Co Ltd
Priority to CN201510927617.2A priority Critical patent/CN106874311B/en
Publication of CN106874311A publication Critical patent/CN106874311A/en
Application granted granted Critical
Publication of CN106874311B publication Critical patent/CN106874311B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9574 Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a method and a device for determining a page content attribution column. The method comprises: acquiring access data of target page content, wherein the target page content is the same page content placed in a plurality of columns in a website; counting, from the access data, the number of visitors who access the target page content through each of the plurality of columns to obtain the number of visitors corresponding to each column; and determining the column to which the target page content belongs from the plurality of columns according to the number of visitors corresponding to each column. The method and the device solve the technical problem in the prior art that a user easily gets lost when visiting a website because a piece of page content in the website is placed in a plurality of columns at the same time.

Description

Method and device for determining page content attribution column
Technical Field
The application relates to the field of computers, in particular to a method and a device for determining a page content attribution column.
Background
Currently, some websites (such as government websites) unreasonably place the same page content in a plurality of columns (excluding the case where the same page content is placed in a "topic" and another column at the same time). For example, an article is placed in both column 1 and column 2 of a website. When a user browses the website, the user sees the article while browsing column 1 and then sees it again while browsing column 2. Because the article is placed in column 1 and column 2 at the same time, the user cannot easily tell where he or she is in the website and is prone to getting lost. Therefore, for page content placed in multiple columns of a website at the same time, it is necessary to determine the most reasonable home column of the page content from the multiple columns in which it is located.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the application provides a method and a device for determining a page content attribution column, so as to at least solve the technical problem that in the prior art, a user is easy to get lost when visiting a website because a certain page content in the website is simultaneously placed in a plurality of columns.
According to an aspect of the embodiments of the present application, there is provided a method for determining a page content attribution column, including: acquiring access data of target page content, wherein the target page content is the same page content placed in a plurality of columns in a website; counting, from the access data, the number of visitors who access the target page content through each of the plurality of columns to obtain the number of visitors corresponding to each column; and determining the column to which the target page content belongs from the plurality of columns according to the number of visitors corresponding to each column.
Further, counting, from the access data, the number of visitors who access the target page content through each of the plurality of columns to obtain the number of visitors corresponding to each column includes: identifying a target session from the access data, wherein the target session is a session in which the target page content is accessed through a column entry of each of the plurality of columns; and counting, from the target sessions, the number of visitors who access the target page content through each of the plurality of columns to obtain the number of visitors corresponding to each column.
Further, determining the column to which the target page content belongs from the plurality of columns according to the number of visitors corresponding to each column includes: comparing the numbers of visitors corresponding to the columns, and determining the column with the largest number of visitors as the column to which the target page content belongs.
Further, before obtaining the access data of the target page content, the method further comprises: scanning the page content under each column in the website to obtain a plurality of scanned page contents; comparing any two page contents in the scanned page contents to obtain a comparison result; and obtaining the content of the target page based on the comparison result.
Further, comparing any two page contents of the scanned page contents, and obtaining a comparison result includes: judging whether the titles of any two page contents are the same; under the condition that the titles of any two page contents are judged to be the same, obtaining a comparison result that the any two page contents are the same page content; and under the condition that the titles of any two page contents are judged to be different, obtaining a comparison result that the any two page contents are different page contents.
According to another aspect of the embodiments of the present application, there is also provided an apparatus for determining a page content attribution column, including: an obtaining unit, configured to acquire access data of target page content, wherein the target page content is the same page content placed in a plurality of columns in a website; a counting unit, configured to count, from the access data, the number of visitors who access the target page content through each of the plurality of columns to obtain the number of visitors corresponding to each column; and a first determining unit, configured to determine the column to which the target page content belongs from the plurality of columns according to the number of visitors corresponding to each column.
Further, the counting unit includes: an identifying module, configured to identify a target session from the access data, wherein the target session is a session in which the target page content is accessed through a column entry of each of the plurality of columns; and a counting module, configured to count, from the target sessions, the number of visitors who access the target page content through each of the plurality of columns to obtain the number of visitors corresponding to each column.
Further, the first determining unit includes: a first determining module, configured to compare the numbers of visitors corresponding to the columns and to determine the column with the largest number of visitors as the column to which the target page content belongs.
Further, the apparatus further comprises: the scanning unit is used for scanning the page content under each column in the website before acquiring the access data of the target page content to obtain a plurality of scanned page contents; the comparison unit is used for comparing any two page contents in the scanned page contents to obtain a comparison result; and the second determining unit is used for obtaining the target page content based on the comparison result.
Further, the comparison unit includes: the judging module is used for judging whether the titles of any two page contents are the same; the second determining module is used for obtaining a comparison result that any two page contents are the same page content under the condition that the titles of the any two page contents are judged to be the same; and the third determining module is used for obtaining a comparison result that any two page contents are different page contents under the condition that the titles of any two page contents are judged to be different.
In the embodiments of the application, access data of target page content is acquired, wherein the target page content is the same page content placed in a plurality of columns in a website; the number of visitors who access the target page content through each of the plurality of columns is counted from the access data to obtain the number of visitors corresponding to each column; and the column to which the target page content belongs is determined from the plurality of columns according to the number of visitors corresponding to each column. In this way, the number of visitors to the same page content is counted for each of the columns in which it is placed, and the column to which the page content should most reasonably belong is determined from those counts. Because the number of visitors indicates the preference of users when they access the page content, the column to which the page content should belong can be reasonably determined from it, and the page content can be attributed to the most reasonable column. This solves the technical problem in the prior art that a user easily gets lost when visiting a website because a piece of page content in the website is placed in a plurality of columns at the same time, and achieves the technical effect of improving the user's comfort when visiting the website.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
Fig. 1 is a flowchart of a method for determining a page content attribution column according to an embodiment of the present application; and
Fig. 2 is a schematic diagram of an apparatus for determining a page content attribution column according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In accordance with an embodiment of the present application, a method embodiment of a method for determining a page content attribution column is provided. It should be noted that the steps illustrated in the flowchart of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be executed in an order different from the one given here.
Fig. 1 is a flowchart of a method for determining a page content attribution column according to an embodiment of the present application. As shown in Fig. 1, the method includes the following steps:
Step S102: obtain access data of target page content, wherein the target page content is the same page content placed in a plurality of columns in the website.
For example, if a specific article about sports star a is simultaneously placed in column 1 and column 2 of website B, the specific article is the target page content.
Specifically, the access data of the target page content may be obtained from the log of the website.
Step S104: count, from the access data, the number of visitors who access the target page content through each of the plurality of columns to obtain the number of visitors corresponding to each column.
Specifically, the number of visitors corresponding to a column is the number of visitors who access the target page content through that column.
Step S106: determine the column to which the target page content belongs from the plurality of columns according to the number of visitors corresponding to each column.
Specifically, the column to which the target page content belongs is the column in which it is most reasonably placed. The target page content can then be placed only in that column, so that users are less likely to get lost when visiting the website in which the target page content is located.
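Steps S102 to S106 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the access data has already been reduced to one record per visit to the target page content, and the `entry_column` field, the record layout, and the function name are all hypothetical. It also assumes that the column with the largest count is chosen.

```python
def determine_home_column(access_records, candidate_columns):
    """Count visits per column (S104) and pick the most-visited column (S106)."""
    counts = {column: 0 for column in candidate_columns}
    for record in access_records:
        column = record["entry_column"]  # hypothetical field: column the visit came through
        if column in counts:
            counts[column] += 1
    # max() returns the first column with the largest count if several tie.
    home_column = max(candidate_columns, key=lambda column: counts[column])
    return home_column, counts


# S102: access data for one target page content (made-up records for illustration).
records = [
    {"visitor": "u1", "entry_column": "column 1"},
    {"visitor": "u2", "entry_column": "column 2"},
    {"visitor": "u3", "entry_column": "column 1"},
]
home_column, counts = determine_home_column(records, ["column 1", "column 2"])
print(home_column, counts)
```

Visits that arrive through none of the candidate columns are ignored here, which is a design choice of the sketch rather than something the source specifies.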
In the embodiments of the application, the number of visitors to the same page content is counted for each of the columns in which it is placed in a website, and the column to which the page content should most reasonably belong is determined from those counts. Because the number of visitors indicates the preference of users when they access the page content, the column to which the page content should belong can be reasonably determined from it, and the page content can be attributed to the most reasonable column. This solves the technical problem in the prior art that a user easily gets lost when visiting a website because a piece of page content in the website is placed in a plurality of columns at the same time, and achieves the technical effect of improving the user's comfort when visiting the website.
It should be noted that, for each target page content in the website, the column to which it belongs may be determined by executing steps S102 to S106.
It should be noted that if a certain page content is placed in a certain "topic" and a certain column in the website at the same time, the page content is not in the scope of the solution of the present application.
Optionally, counting, from the access data, the number of visitors who access the target page content through each of the plurality of columns to obtain the number of visitors corresponding to each column includes: identifying a target session from the access data, wherein the target session is a session in which the target page content is accessed through a column entry of each of the plurality of columns; and counting, from the target sessions, the number of visitors who access the target page content through each of the plurality of columns to obtain the number of visitors corresponding to each column.
Specifically, all sessions in which the target page content is accessed are first identified from the access data. From these, the sessions in which the target page content is accessed through the column entry of one of the plurality of columns (that is, the target sessions) are identified. The number of visitors who access the target page content through the column entry of each column is then counted from all the identified target sessions, giving the number of visitors corresponding to each column.
A session is one kind of access data. One session records the complete access path of a user visiting the website, so whether the user accessed the target page content through a column entry can be determined from that path, and target sessions can be identified from the access data in this way.
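The session-based counting can be sketched as below. The source does not define what "through a column entry" means at the level of an access path, so this sketch assumes it means that the page viewed immediately before the target page is a column's entry page. The session layout, the URLs, and the function name are hypothetical, and counting each visitor at most once per column is also an assumption.

```python
def count_visitors_by_column(sessions, target_url, column_entries):
    """Count distinct visitors who reach target_url directly from each column entry.

    column_entries maps a column name to its (assumed) entry page URL.
    """
    entry_to_column = {url: name for name, url in column_entries.items()}
    visitors = {name: set() for name in column_entries}
    for session in sessions:
        path = session["path"]  # complete access path recorded by the session
        # Walk consecutive page pairs: (page viewed before, page viewed next).
        for previous, current in zip(path, path[1:]):
            if current == target_url and previous in entry_to_column:
                visitors[entry_to_column[previous]].add(session["visitor"])
    return {name: len(ids) for name, ids in visitors.items()}


sessions = [
    {"visitor": "u1", "path": ["/", "/col1", "/article"]},
    {"visitor": "u2", "path": ["/col2", "/article", "/col1", "/article"]},
    {"visitor": "u1", "path": ["/col1", "/article"]},
    {"visitor": "u3", "path": ["/", "/article"]},  # not a target session
]
column_entries = {"column 1": "/col1", "column 2": "/col2"}
visitor_counts = count_visitors_by_column(sessions, "/article", column_entries)
print(visitor_counts)
```

If every visit should count rather than every distinct visitor, the sets can be replaced with integer counters.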
Optionally, in this embodiment of the application, determining, from the plurality of columns, the column to which the target page content belongs according to the number of visitors corresponding to each column includes: comparing the numbers of visitors corresponding to the columns, and determining the column with the largest number of visitors as the column to which the target page content belongs.
Specifically, the larger the number of visitors corresponding to a column, the more often the target page content is accessed through that column. The column with the largest number of visitors is the column to which users most strongly feel the target page content should belong.
In the embodiments of the application, because the number of visitors indicates the preference of users when they access the page content, the column to which the page content belongs can be reasonably determined from it. This gives a website manager a scientific way to decide which of the columns containing the same page content is the most reasonable home for it, and improves the website manager's satisfaction.
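The comparison step can be sketched as below. The source does not say what happens when two columns share the largest number of visitors, so this sketch returns every column that shares the maximum and leaves the final choice to the caller. The function name and the sample numbers are hypothetical.

```python
def columns_with_most_visitors(visitor_counts):
    """Return all columns whose visitor count equals the maximum."""
    if not visitor_counts:
        return []
    highest = max(visitor_counts.values())
    return [column for column, count in visitor_counts.items() if count == highest]


clear_winner = columns_with_most_visitors({"column 1": 120, "column 2": 45})
tied = columns_with_most_visitors({"column 1": 80, "column 2": 80, "column 3": 10})
print(clear_winner, tied)
```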
Optionally, in this embodiment of the present application, before obtaining the access data of the target page content, the method further includes: scanning page contents under each column in a website to obtain a plurality of scanned page contents; comparing any two page contents in the scanned page contents to obtain a comparison result; and obtaining the content of the target page based on the comparison result.
Specifically, there are two possible comparison results: either the two page contents are the same page content, or they are different page contents. For example, if a website has 10 columns containing 60 page contents in total, the 60 page contents are scanned and any two of the scanned page contents are compared to obtain a comparison result. If the comparison result for two scanned page contents is that they are the same page content, the two page contents are target page content.
Specifically, the titles, web addresses, and other contents of any two page contents may be compared. If titles are compared, the titles of the two page contents are compared word by word; web addresses are compared in the same way, which is not described again here.
In the embodiment of the application, the target page content can be determined more quickly by scanning the page content under each column in the website and comparing the scanned page content.
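The scan-and-compare step can be sketched as below, as an illustration only. For the 60 page contents in the example, comparing every pair means 60 × 59 / 2 = 1770 comparisons. The record layout (`column`, `title`, `url`), the sample data, and the function names are hypothetical, and skipping pairs that sit in the same column is an assumption based on the definition of target page content.

```python
from itertools import combinations


def is_same_content(page_a, page_b, field="title"):
    """Compare two scanned page contents by one field (title or web address)."""
    return page_a[field] == page_b[field]


def find_same_content_pairs(pages, field="title"):
    """Compare any two scanned page contents and keep pairs judged to be the same."""
    same_pairs = []
    for page_a, page_b in combinations(pages, 2):
        # Assumption: only duplicates placed in different columns are of interest.
        if page_a["column"] != page_b["column"] and is_same_content(page_a, page_b, field):
            same_pairs.append((page_a, page_b))
    return same_pairs


pages = [
    {"column": "column 1", "title": "Star A wins final", "url": "/c1/101"},
    {"column": "column 2", "title": "Star A wins final", "url": "/c2/205"},
    {"column": "column 2", "title": "Budget report", "url": "/c2/206"},
]
pairs = find_same_content_pairs(pages)
print(len(pairs))
```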
Optionally, comparing any two page contents of the scanned page contents, and obtaining a comparison result includes: judging whether the titles of any two page contents are the same; under the condition that the titles of any two page contents are judged to be the same, a comparison result that any two page contents are the same page content is obtained; and under the condition that the titles of any two page contents are judged to be different, obtaining a comparison result that any two page contents are different page contents.
If the titles of two page contents are the same (that is, the comparison result is that the two are the same page content), the two page contents are determined to be target page content.
For example, when scanned page content 11 is compared with scanned page content 12: if the title of page content 11 is the same as the title of page content 12, page content 11 is the same as page content 12, and the comparison result is that they are the same page content; if the titles differ, page content 11 is different from page content 12, and the comparison result is that they are different page contents.
In the embodiments of the application, because the title of a page content does not change no matter how many columns the same page content is placed in at the same time, comparing the titles of any two page contents yields the result of whether they are the same more quickly.
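Because sameness is decided by title equality, the same target page contents can also be found by grouping the scanned pages by title instead of comparing every pair. This grouping is a substitute technique chosen for illustration and is not something the source describes; the record layout and names are hypothetical.

```python
from collections import defaultdict


def find_target_pages(pages):
    """Map each title that appears in more than one column to those columns."""
    columns_by_title = defaultdict(set)
    for page in pages:
        columns_by_title[page["title"]].add(page["column"])
    return {
        title: sorted(columns)
        for title, columns in columns_by_title.items()
        if len(columns) > 1  # same title placed in a plurality of columns
    }


scanned_pages = [
    {"column": "column 1", "title": "Star A wins final"},
    {"column": "column 2", "title": "Star A wins final"},
    {"column": "column 2", "title": "Budget report"},
]
targets = find_target_pages(scanned_pages)
print(targets)
```

Grouping makes one pass over the scanned pages, whereas pairwise comparison grows with the square of the number of pages.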
According to an embodiment of the present application, a device for determining a page content attribution column is further provided. The device is used to execute the method for determining a page content attribution column provided in the embodiments of the present application, and is described in detail below.
Fig. 2 is a schematic diagram of a device for determining a page content attribution column according to an embodiment of the present application. As shown in Fig. 2, the device mainly includes an obtaining unit 21, a counting unit 23, and a first determining unit 25, where:
the obtaining unit 21 is configured to obtain access data of target page content, where the target page content is the same page content placed in multiple columns in a website.
Specifically, the access data of the target page content may be obtained from the log of the website.
The counting unit 23 is configured to count, from the access data, the number of visitors who access the target page content through each of the plurality of columns, and obtain the number of visitors corresponding to each column.
Specifically, the number of visitors corresponding to a column is the number of visitors who access the target page content through that column.
The first determining unit 25 is configured to determine, according to the number of visitors corresponding to each column, the column to which the target page content belongs from the multiple columns.
Specifically, the column to which the target page content belongs is the column in which it is most reasonably placed. The target page content can then be placed only in that column, so that users are less likely to get lost when visiting the website in which the target page content is located.
In the embodiments of the application, the number of visitors to the same page content is counted for each of the columns in which it is placed in a website, and the column to which the page content should most reasonably belong is determined from those counts. Because the number of visitors indicates the preference of users when they access the page content, the column to which the page content should belong can be reasonably determined from it, and the page content can be attributed to the most reasonable column. This solves the technical problem in the prior art that a user easily gets lost when visiting a website because a piece of page content in the website is placed in a plurality of columns at the same time, and achieves the technical effect of improving the user's comfort when visiting the website.
It should be noted that, for each target page content in the website, the column to which the target page content belongs may be determined by invoking the obtaining unit, the counting unit, and the first determining unit.
It should be noted that if a certain page content is placed in a certain "topic" and a certain column in the website at the same time, the page content is not in the scope of the solution of the present application.
Optionally, in an embodiment of the present application, the counting unit includes an identifying module and a counting module. The identifying module is configured to identify a target session from the access data, wherein the target session is a session in which the target page content is accessed through a column entry of each of the plurality of columns; the counting module is configured to count, from the target sessions, the number of visitors who access the target page content through each of the plurality of columns to obtain the number of visitors corresponding to each column.
Specifically, all sessions in which the target page content is accessed are first identified from the access data. From these, the sessions in which the target page content is accessed through the column entry of one of the plurality of columns (that is, the target sessions) are identified. The number of visitors who access the target page content through the column entry of each column is then counted from all the identified target sessions, giving the number of visitors corresponding to each column.
A session is one kind of access data. One session records the complete access path of a user visiting the website, so whether the user accessed the target page content through a column entry can be determined from that path, and the target session can be identified.
Optionally, in an embodiment of the present application, the first determining unit includes a first determining module. The first determining module is configured to compare the numbers of visitors corresponding to the columns and to determine the column with the largest number of visitors as the column to which the target page content belongs.
Specifically, the larger the number of visitors corresponding to a column, the more often the target page content is accessed through that column. The column with the largest number of visitors is the column to which users most strongly feel the target page content should belong.
In the embodiments of the application, because the number of visitors indicates the preference of users when they access the page content, the column to which the page content belongs can be reasonably determined from it. This gives a website manager a scientific way to decide which of the columns containing the same page content is the most reasonable home for it, and improves the website manager's satisfaction.
Optionally, in an embodiment of the present application, the determining apparatus further includes a scanning unit, a comparing unit, and a second determining unit. The scanning unit is used for scanning the page content under each column in the website before acquiring the access data of the target page content to obtain a plurality of scanned page contents; the comparison unit is used for comparing any two page contents in the scanned page contents to obtain a comparison result; the second determining unit is used for obtaining the target page content based on the comparison result.
Specifically, there are two possible comparison results: either the two page contents are the same page content, or they are different page contents. For example, if a website has 10 columns containing 60 page contents in total, the 60 page contents are scanned and any two of the scanned page contents are compared to obtain a comparison result. If the comparison result for two scanned page contents is that they are the same page content, the two page contents are target page content.
Specifically, the titles, web addresses, and other contents of any two page contents may be compared. If titles are compared, the titles of the two page contents are compared word by word; web addresses are compared in the same way, which is not described again here.
In the embodiment of the application, the target page content can be determined more quickly by scanning the page content under each column in the website and comparing the scanned page content.
Optionally, in an embodiment of the present application, the comparing unit includes a judging module, a second determining module, and a third determining module. The judging module is configured to judge whether the titles of any two page contents are the same; the second determining module is configured to obtain a comparison result that the two page contents are the same page content when their titles are judged to be the same; the third determining module is configured to obtain a comparison result that the two page contents are different page contents when their titles are judged to be different.
In the embodiments of the application, because the title of a page content does not change no matter how many columns the same page content is placed in at the same time, comparing the titles of any two page contents yields the result of whether they are the same more quickly.
The device for determining the page content attribution column comprises a processor and a memory, wherein the acquiring unit, the counting unit, the first determining unit and the like are stored in the memory as program units, and the processor executes the program units stored in the memory.
The processor comprises a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided. By adjusting the kernel parameters, the technical problem in the prior art that a user easily gets lost when accessing a website because certain page content is placed in multiple columns at the same time is solved, and the user's comfort when accessing the website is improved.
The memory may include forms of computer-readable media such as volatile memory, Random Access Memory (RAM), and/or non-volatile memory such as Read Only Memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
The present application further provides an embodiment of a computer program product which, when executed on a data processing device, is adapted to execute program code initialized with the following method steps: acquiring access data of target page content, wherein the target page content is the same page content placed in a plurality of columns in a website; counting, from the access data, the number of visitors who access the target page content through each of the plurality of columns to obtain the number of visitors corresponding to each column; and determining the column to which the target page content belongs from the plurality of columns according to the number of visitors corresponding to each column.
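The three method steps above can be sketched roughly as follows. The record layout (a dictionary with an `entry_column` key), the function name, and the tie-breaking rule are assumptions for illustration. Each record is counted as one visit, since the text does not state whether repeat visits by the same person are deduplicated.

```python
from collections import Counter


def attribution_column(access_data: list[dict]) -> tuple[str, Counter]:
    """Return the column with the most visits to the target page, plus all counts.

    Each record in `access_data` is a hypothetical visit to the target page
    content, carrying the column through which the visit arrived.
    """
    counts = Counter(record["entry_column"] for record in access_data)
    if not counts:
        raise ValueError("no access data for the target page content")
    # max() keeps the first maximal key, so a tie goes to the column seen first
    # (an assumption; the application does not specify tie handling).
    best = max(counts, key=counts.get)
    return best, counts


# Hypothetical access data for one target page content.
access_data = [
    {"visitor": "v1", "entry_column": "News"},
    {"visitor": "v2", "entry_column": "Products"},
    {"visitor": "v3", "entry_column": "News"},
    {"visitor": "v4", "entry_column": "News"},
]
column, counts = attribution_column(access_data)
```

In this sample the target page content would be attributed to the "News" column, which accounts for three of the four visits.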
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present application, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program code.
The foregoing is only a preferred embodiment of the present application. It should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be considered to fall within the protection scope of the present application.

Claims (6)

1. A method for determining a page content attribution column is characterized by comprising the following steps:
acquiring access data of target page content, wherein the target page content is the same page content placed in a plurality of columns in a website;
counting the number of visitors who visit the target page content through each of the plurality of columns from the visit data to obtain the number of visitors corresponding to each column;
determining the column to which the target page content belongs from the plurality of columns according to the visitor number corresponding to each column, wherein determining the column to which the target page content belongs from the plurality of columns according to the visitor number corresponding to each column comprises: comparing the number of visitors corresponding to each column, and determining the column with the largest number of visitors as the column to which the target page content belongs;
wherein, counting the number of visitors who visit the target page content through each of the plurality of columns from the visit data, and obtaining the number of visitors corresponding to each column includes:
identifying a target session from the access data, wherein the target session is a session for accessing the target page content through a column entry of each of the plurality of columns;
and counting, from the target session, the number of visitors who access the target page content through each of the plurality of columns to obtain the number of visitors corresponding to each column.
2. The method of claim 1, wherein prior to obtaining access data for the target page content, the method further comprises:
scanning the page content under each column in the website to obtain a plurality of scanned page contents;
comparing any two page contents in the scanned page contents to obtain a comparison result;
and obtaining the content of the target page based on the comparison result.
3. The method of claim 2, wherein comparing any two of the plurality of scanned page contents to obtain a comparison result comprises:
judging whether the titles of any two page contents are the same;
under the condition that the titles of any two page contents are judged to be the same, obtaining a comparison result that the any two page contents are the same page content;
and under the condition that the titles of any two page contents are judged to be different, obtaining a comparison result that the any two page contents are different page contents.
4. An apparatus for determining a page content attribution column, comprising:
an acquisition unit, used for acquiring access data of target page content, wherein the target page content is the same page content placed in a plurality of columns in a website;
the statistical unit is used for counting the number of visitors who visit the target page content through each column in the plurality of columns from the visit data to obtain the number of visitors corresponding to each column;
the first determining unit is used for determining the column to which the target page content belongs from the plurality of columns according to the number of visitors corresponding to each column; the first determining unit includes: the first determining module, used for comparing the number of visitors corresponding to each column and determining the column with the largest number of visitors as the column to which the target page content belongs;
wherein, counting the number of visitors who visit the target page content through each of the plurality of columns from the visit data, and obtaining the number of visitors corresponding to each column includes:
identifying a target session from the access data, wherein the target session is a session for accessing the target page content through a column entry of each of the plurality of columns;
and counting, from the target session, the number of visitors who access the target page content through each of the plurality of columns to obtain the number of visitors corresponding to each column.
5. The apparatus of claim 4, further comprising:
the scanning unit is used for scanning the page content under each column in the website before acquiring the access data of the target page content to obtain a plurality of scanned page contents;
the comparison unit is used for comparing any two page contents in the scanned page contents to obtain a comparison result;
and the second determining unit is used for obtaining the target page content based on the comparison result.
6. The apparatus of claim 5, wherein the comparison unit comprises:
the judging module is used for judging whether the titles of any two page contents are the same;
the second determining module is used for obtaining a comparison result that any two page contents are the same page content under the condition that the titles of the any two page contents are judged to be the same;
and the third determining module is used for obtaining a comparison result that any two page contents are different page contents under the condition that the titles of any two page contents are judged to be different.
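
The session-based counting recited in claims 1 and 4 might be sketched as follows. This is a minimal illustration under stated assumptions: a session is modeled as an ordered list of URLs, a target session is taken to be one in which a target page URL is requested immediately after a column entry URL, and the `Session` class, the `count_by_column` function, and all URLs are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Session:
    visitor_id: str
    path: list[str]  # URLs in the order they were visited


def count_by_column(
    sessions: list[Session],
    target_urls: set[str],
    column_entries: dict[str, str],
) -> Counter:
    """Count target sessions per column.

    `target_urls` holds the URLs of the target page content;
    `column_entries` maps a column entry URL to its column name.
    """
    counts: Counter = Counter()
    for session in sessions:
        # Look at each consecutive (previous, current) pair of page views.
        for prev, cur in zip(session.path, session.path[1:]):
            if cur in target_urls and prev in column_entries:
                counts[column_entries[prev]] += 1
                break  # count each target session once
    return counts


sessions = [
    Session("v1", ["/news", "/news/launch"]),
    Session("v2", ["/news", "/news/report", "/news/launch"]),  # not via a column entry
    Session("v3", ["/products", "/products/launch"]),
    Session("v4", ["/news", "/news/launch"]),
]
target_urls = {"/news/launch", "/products/launch"}
column_entries = {"/news": "News", "/products": "Products"}
counts = count_by_column(sessions, target_urls, column_entries)
best_column = counts.most_common(1)[0][0]
```

Under these assumptions the session of visitor v2 is not a target session, because the target page was reached from another page rather than from a column entry.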
CN201510927617.2A 2015-12-14 2015-12-14 Method and device for determining page content attribution column Active CN106874311B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510927617.2A CN106874311B (en) 2015-12-14 2015-12-14 Method and device for determining page content attribution column

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510927617.2A CN106874311B (en) 2015-12-14 2015-12-14 Method and device for determining page content attribution column

Publications (2)

Publication Number Publication Date
CN106874311A CN106874311A (en) 2017-06-20
CN106874311B true CN106874311B (en) 2020-09-15

Family

ID=59178584

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510927617.2A Active CN106874311B (en) 2015-12-14 2015-12-14 Method and device for determining page content attribution column

Country Status (1)

Country Link
CN (1) CN106874311B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1945575A (en) * 2006-10-27 2007-04-11 北京金山软件有限公司 Method and system for regulating colume structure
CN101188521A (en) * 2007-12-05 2008-05-28 北京金山软件有限公司 A method for digging user behavior data and website server
CN101261716A (en) * 2008-04-23 2008-09-10 深圳市迅雷网络技术有限公司 Method and device for identifying mapping relation of advertisement and its distribution site
CN101782909A (en) * 2009-01-19 2010-07-21 杨云国 Search engine based on operation intention of user
CN101826104A (en) * 2010-04-02 2010-09-08 南京邮电大学 Method for realizing website navigability based on continuous time Markov chain

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103001796A (en) * 2012-11-13 2013-03-27 北界创想(北京)软件有限公司 Method and device for processing weblog data by server
US10423719B2 (en) * 2013-02-19 2019-09-24 International Business Machines Corporation Dynamic loading of tabular data

Also Published As

Publication number Publication date
CN106874311A (en) 2017-06-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Applicant after: Beijing Guoshuang Technology Co.,Ltd.

Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing

Applicant before: Beijing Guoshuang Technology Co.,Ltd.

GR01 Patent grant