CN111400575B - User identification generation method, user identification method and device - Google Patents

Publication number
CN111400575B
CN111400575B CN202010190953.4A CN202010190953A CN111400575B CN 111400575 B CN111400575 B CN 111400575B CN 202010190953 A CN202010190953 A CN 202010190953A CN 111400575 B CN111400575 B CN 111400575B
Authority
CN
China
Prior art keywords
page
position data
cursor
user
user identification
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010190953.4A
Other languages
Chinese (zh)
Other versions
CN111400575A (en
Inventor
付星昱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202010190953.4A
Publication of CN111400575A
Application granted
Publication of CN111400575B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/972 Access to data in other repository systems, e.g. legacy data or dynamic Web page generation

Abstract

The user identification generation method comprises: judging whether a currently displayed page is a content type typesetting page; when the page is a content type typesetting page, acquiring stay position data of a cursor in the page and obtaining target position data from it; and generating a user identification according to coordinates obtained from the target position data.

Description

User identification generation method, user identification method and device
Technical Field
The embodiments of the present application relate to the technical field of the Internet, and in particular to a user identification generation method, a user identification method, a device, a terminal, a server and a computer readable storage medium.
Background
At present, a website or an advertisement alliance identifies a terminal by using a terminal identification technology, and the terminal identification technology can correlate all the behaviors of a user for web browsing through a browser, so that each individual can be accurately positioned on a network, the data of the individuals can be collected, and personalized service or other targeted activities can be realized through data analysis.
However, the existing terminal identification technology can only locate the terminal through static information of the browser, and cannot accurately identify the user.
Disclosure of Invention
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.
The application provides a user identification generation method, a user identification method, a device, a terminal, a server and a computer readable storage medium thereof, which can improve the accuracy of user identification.
According to a first aspect of the present application, there is provided a user identifier generating method, including:
judging whether the currently displayed page is a content typesetting page or not;
when the page is a content typesetting page, acquiring stay position data of a cursor in the page, wherein the stay position data is position data of which the stay time of the cursor in the page is longer than a first threshold value;
obtaining target position data according to the stay position data of the cursor in the page;
and generating a user identification for identifying the user according to the coordinates obtained by the target position data.
According to a second aspect of the present application, there is provided a user identification method, comprising:
acquiring a user identifier from a terminal, wherein the user identifier is generated according to a target stay point of the terminal, and the target stay point is obtained by screening a stay point set generated when a cursor stays in a webpage and is used for representing the behavior habit of a user for browsing the webpage;
matching the user identification with a preset user identification feature library;
and when the matching is successful, the identification of the terminal user is completed.
According to a third aspect of the present application, there is provided a user identification generating apparatus, comprising:
the judging module is used for judging whether the currently displayed page is a content type typesetting page or not, wherein a content display area of the content type typesetting page is positioned in the middle of the content type typesetting page along the width direction, and a white area exists between each of the two sides of the content display area along the width direction and the edge of the content type typesetting page;
the position data acquisition module is used for acquiring first position data of a cursor in the content type typesetting page when the page is the content type typesetting page, wherein the first position data is position data at which the cursor stays in the content type typesetting page for longer than a first threshold value;
the target position data generation module is used for obtaining target position data according to the stay position data of the cursor in the page;
and the identification generation module is used for generating a user identification for identifying the user according to the coordinates obtained by the target position data.
According to a fourth aspect of the present application, there is provided a user identification device comprising:
the user identification receiving module is used for obtaining the user identification generated by the user identification generating apparatus of the third aspect of the application;
the matching module is used for matching the user identification in a preset user identification feature library;
the confirmation module is used for completing the identification of the terminal user after the successful matching;
and the creation module is used for storing the user identification as a new user identification feature into the user identification feature library after the matching fails.
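The matching, confirmation and creation modules above can be illustrated with a minimal sketch. It assumes that the user identification is a character string and that matching means exact equality; the class name `UserIdentificationLibrary` and the method `identify` are hypothetical and do not come from the application, which leaves the matching rule open at this point.

```python
class UserIdentificationLibrary:
    """Hypothetical in-memory stand-in for the user identification feature library."""

    def __init__(self):
        self._known = set()

    def identify(self, user_identification: str) -> bool:
        """Return True when the identification matches a stored feature.

        On a failed match the identification is stored as a new feature,
        mirroring the creation module described above.
        """
        if user_identification in self._known:
            return True  # matching succeeded: identification of the user is complete
        self._known.add(user_identification)  # matching failed: store as a new feature
        return False


library = UserIdentificationLibrary()
first_visit = library.identify("0171-0031")   # unknown, so it is stored
second_visit = library.identify("0171-0031")  # now matches the stored feature
```

A real feature library would more likely tolerate small differences between identifications of the same user; exact equality is used here only to keep the sketch short.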
According to a fifth aspect of the present application, there is provided a user identification generating apparatus, including:
at least one memory;
at least one processor;
at least one program;
the program is stored in a memory, and the processor executes the at least one program to implement the user identification generation method of the first aspect.
According to a sixth aspect of the present application, there is provided a user identification device comprising:
at least one memory;
at least one processor;
at least one program;
the program is stored in a memory, and the processor executes the at least one program to implement the user identification method of the second aspect.
According to a seventh aspect of the present application, there is provided a terminal, including a user identifier generating device according to the fifth aspect of the present application.
According to an eighth aspect of the present application, there is provided a server comprising the user identification device of the sixth aspect of the present application.
According to a ninth aspect of the present application, there is provided a computer readable storage medium storing computer executable instructions for performing the user identification generation method of the first aspect of the present application, or the user identification method of the second aspect.
According to the technical scheme, it is judged whether the currently displayed page is a content type typesetting page; when it is, the stay position data of the cursor in the page is acquired, target position data is obtained from it, and a user identification is generated according to the coordinates obtained from the target position data. Because the typesetting of a content type typesetting page is designed to present text content, a user has specific behavior habits when browsing such a page, and the target position data can accurately reflect those habits. The user identification generated from the target position data is therefore directly related to the behavior habits of the user and does not depend on static information of the browser, so the accuracy of user identification can be improved.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application. The objectives and other advantages of the application will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the technical solutions of the present application and constitute a part of this specification. Together with the embodiments of the present application, they serve to explain the technical solutions of the present application and do not constitute a limitation thereof.
FIG. 1 is a schematic diagram of a prior art browser fingerprint generation process;
FIG. 2 is a comparative schematic diagram of a prior art browser fingerprint generation process;
FIG. 3 is a system architecture diagram of an implementation environment for a user identification generation method provided by an exemplary embodiment of the present application;
FIG. 4 is a flowchart of a user identification generation method provided by an exemplary embodiment of the present application;
FIG. 5 is a schematic diagram of a content typeset page;
FIG. 6 is a flowchart of a specific method of step 401 of FIG. 4;
FIG. 7 is a flowchart of an exemplary method of another embodiment of step 401 of FIG. 4;
FIG. 8 is a flowchart of an exemplary method of another embodiment of step 401 of FIG. 4;
FIG. 9 is a schematic diagram of a track formed when a cursor moves in a method for generating a user identifier according to an exemplary embodiment of the present application;
FIG. 10 is a flowchart of a specific method of step 402 of FIG. 4;
FIG. 11 is a flowchart of a specific method for acquiring the dwell position data of the cursor in the page in step 402 of FIG. 4;
FIG. 12 is a method flow diagram of one embodiment of step 403 of FIG. 4;
FIG. 13 is a method flow diagram of another embodiment of step 403 of FIG. 4;
FIG. 14 is a schematic diagram of another embodiment of a content typeset page;
FIG. 15 is a schematic diagram of another embodiment of a content typeset page;
FIG. 16 is a flowchart of a specific method of step 404 of FIG. 4;
FIG. 17 is a schematic diagram of a centroid calculation method provided in an exemplary embodiment of the present application;
FIG. 18 is a flowchart of a user identification generation method provided by an exemplary embodiment of the present application;
FIG. 19 is a flowchart of a user identification generation method provided by an exemplary embodiment of the present application;
FIG. 20 is a detailed method flow diagram of step 1902 in FIG. 19;
FIG. 21 is a flowchart of a user identification method provided in an exemplary embodiment of the present application;
FIG. 22 is a flowchart of a method for generating a user identifier according to an exemplary embodiment of the present application;
FIG. 23 is an interface schematic diagram of an application scenario provided in an exemplary embodiment of the present application;
FIG. 24 is a block diagram of a user identification generating device according to an exemplary embodiment of the present application;
FIG. 25 is a block diagram of a user identification device according to an exemplary embodiment of the present application;
FIG. 26 is a block diagram of another embodiment of a user identification generating device of the present application;
FIG. 27 is a block diagram of another embodiment of a user identification device of the present application;
FIG. 28 is a block diagram of a terminal according to an exemplary embodiment of the present application;
FIG. 29 is a block diagram of a server according to an exemplary embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
It should be noted that although functional block division is performed in a device diagram and a logic sequence is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the block division in the device, or in the flowchart. The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
First, several terms used in this application are explained:
User: a natural person who operates the terminal and who operates the computer device according to his or her own operation habits. Unless otherwise specified, the user in the embodiments of the present application denotes one and the same natural person.
Browser: a browsing program (App) that can display the content of hypertext markup language (Hyper Text Markup Language, HTML) files provided by a web server or a file system and lets a user interact with those files. The browser can be an independently running App or a built-in browser embedded in another App.
Browser fingerprint: a character string obtained by applying a certain algorithm to the visible configuration information and setting information of the browser, for example the kernel information, time zone, language and screen resolution of the browser.
Static information of browser: browser configuration information that does not change when the user changes. It includes characteristic identifiers of the browser, such as the hardware type, operating system, user agent, system fonts, language, screen resolution, browser plug-ins, browser extensions, browser settings and time zone difference, and also includes the results of data processing performed when the browser displays a page, such as the results of processing the data stream when displaying pictures or playing audio and video.
Cursor: a visual graphic that helps the user interact with the content displayed on the screen. The cursor may be controlled by a peripheral device such as a mouse or a touchpad. Through the peripheral device, the user can move the cursor so as to change the position of the displayed content, or send an operation instruction at the cursor so that the displayed content performs the corresponding action.
Typesetting: arranging elements such as text, pictures and frames at certain positions to form an overall layout.
Page: a carrier that presents information to a user and can interact with the user, such as a web page or a document.
Content display area: an area in which text, pictures and the like are densely distributed.
White area: a blank area containing no additional elements; the term does not specifically mean an area that is white in color. A white area may surround elements in the page or lie between elements in the page. For example, a white area of a given size can be formed by setting the margin of an element in a web page.
Content type typesetting page: a page whose typesetting is mainly aimed at displaying text content. The content display area of the content type typesetting page is located in the middle of the page and there are few other navigation or guidance elements, so a large white area exists between each side of the content display area along the width direction and the edge of the page, and the user's attention is mainly focused on browsing the content display area of the current page.
Distribution density: the number of target objects per unit area.
User identification: during interaction between the server side and the user side, the user identification can be used to identify the user so that actions performed by the user are correctly attributed to that user. The user identification can be stored, for example, in the form of a character string.
At present, a website or an advertisement alliance identifies a terminal by using a terminal identification technology: it obtains the settings, kernel information, time zone, language, resolution information and the like of the browser used by the user, and applies a certain algorithm to this information to obtain a value. This value is the browser fingerprint. The website or advertisement alliance can locate individual users through the browser fingerprint, collect data about those individuals, and provide personalized services or other targeted activities through data analysis.
Browser fingerprints can be implemented in different ways. For example, a fingerprint can be based on characteristic identifiers of the browser, such as the hardware type, operating system, user agent, system fonts, language, screen resolution, browser plug-ins, browser extensions, browser settings and time zone difference. Such fingerprint information has low accuracy and a high collision probability, much like identifying a person by height or age.
More accurate browser fingerprints are also available at present, for example the canvas fingerprint (Canvas fingerprint). The same drawing operation on an HTML5 (the fifth version of the HyperText Markup Language) canvas element produces picture content that is not exactly the same on different operating systems and different browsers. In terms of picture format, different browsers use different graphics processing engines, different picture export options, different default compression levels, and so on. At the pixel level, operating systems use different settings and algorithms for antialiasing and subpixel rendering. Even when the same drawing operation is performed, the CRC check value of the generated picture data differs. Because the canvas fingerprint is generated from the CRC check information of the picture displayed by the browser, it can locate the browser used by the user more accurately.
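The effect of pixel-level differences on a CRC-based canvas fingerprint can be shown with a minimal sketch. It assumes the rendered picture is available as raw bytes and uses CRC-32 from the Python standard library; the function name `canvas_fingerprint` and the pixel values are hypothetical, and a real browser would obtain the picture data from the canvas element rather than from hand-written bytes.

```python
import zlib


def canvas_fingerprint(picture_bytes: bytes) -> str:
    """Return the CRC-32 of rendered picture data as an 8-digit hex string."""
    return format(zlib.crc32(picture_bytes) & 0xFFFFFFFF, "08x")


# Hypothetical RGBA pixels produced by the same drawing operation in two
# browsers: one antialiased edge pixel differs by a single alpha level.
render_browser_a = bytes([255, 0, 0, 255, 255, 0, 0, 128])
render_browser_b = bytes([255, 0, 0, 255, 255, 0, 0, 127])

fingerprint_a = canvas_fingerprint(render_browser_a)
fingerprint_b = canvas_fingerprint(render_browser_b)
# The two fingerprints differ although the drawing operation was the same.
```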
However, these fingerprints are all derived from the browser, and different browsers on the same device generate different fingerprint information. As a result, when the same user uses different browsers on the same computer, the browser fingerprint information collected by the server differs, the user cannot be uniquely identified, and the behavior of the user cannot be effectively analyzed. For example, as shown in FIG. 1, when the user operates in browser A, the User Agent (UA), time zone information, browser system platform information, screen resolution and canvas fingerprint (Canvas) information of browser A are processed by a feature value algorithm to obtain browser fingerprint A: 2ds234vdg345. As shown in FIG. 2, when the same user operates another browser B, the user agent, time zone information, browser system platform information, screen resolution and canvas fingerprint information have changed, and the feature value algorithm yields browser fingerprint B: 53g721635q. Browser fingerprint B cannot be matched to browser fingerprint A. Because the browser fingerprint information collected by the server differs, the user cannot be uniquely identified and the behavior of the user cannot be effectively analyzed. Similarly, the user cannot be accurately identified after replacing the computer device.
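The dependence of a browser fingerprint on static information can be sketched as follows. The application does not specify the feature value algorithm, so a truncated SHA-256 hash stands in for it here as an assumption; the function name `browser_fingerprint` and all field values are hypothetical.

```python
import hashlib
from typing import Dict


def browser_fingerprint(static_info: Dict[str, str]) -> str:
    """Hash the browser's static information into a short hex string."""
    # Sort the keys so that the same information always yields the same string.
    canonical = "|".join(f"{key}={static_info[key]}" for key in sorted(static_info))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


# The same user on the same computer, using two different browsers
# (all values below are made up for illustration).
browser_a = {
    "user_agent": "BrowserA/1.0",
    "time_zone": "UTC+8",
    "platform": "Win32",
    "screen_resolution": "1920x1080",
    "canvas": "3f2a9c10",
}
browser_b = dict(browser_a, user_agent="BrowserB/7.2", canvas="a81e04d7")

fingerprint_a = browser_fingerprint(browser_a)
fingerprint_b = browser_fingerprint(browser_b)
# Two unrelated fingerprints are produced for one and the same user.
```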
In addition, for the same browser, the user can deliberately modify the device information of the browser and randomly generate new device information every time of access, so that a new browser fingerprint is generated every time of access, and accurate identification of the user is difficult for a website or an advertising alliance.
In summary, the existing browser fingerprint technology can only locate the terminal through static information of the browser and does not establish an association with the user, so the user cannot be accurately identified.
Therefore, the embodiments of the present application provide a user identification generation method, a user identification method, a device, a terminal, a server and a computer readable storage medium, which can generate a user identification that makes it easy to identify a user accurately. The user identification does not depend on static information of the terminal device, so the user can still be accurately identified through the user identification even after changing the browsing program or the terminal.
The user identifier generating method provided by the embodiment of the application can be applied to an application environment shown in fig. 3, and the application environment comprises: the terminal 11, the server 12 and the communication network 13, wherein the terminal 11 and the server 12 are connected with each other through the communication network 13, and the connection can be a wired connection or a wireless connection.
The terminal 11 may be an electronic terminal device such as a desktop computer, a notebook computer, a smart phone, a tablet computer or an electronic book reader. The terminal 11 includes a display device. The display device may be a separate display, for example an external display connected to the desktop computer, notebook computer, smart phone, tablet computer or other electronic terminal device in order to display or extend the display data of the terminal 11; it may also be a display integrated with the terminal 11, for example the built-in screen of a notebook computer, smart phone or tablet computer. The terminal 11 further includes a pointing device, which may be built in, such as the touchpad of a notebook computer, or may be external to the terminal 11 and communicatively connected to it, such as a mouse, an external touchpad or a keyboard with a touchpad. An Operating System (OS), at least one browsing program (App) and at least one execution program are installed in the terminal 11. The operating system may be the Microsoft Windows operating system (Microsoft Windows OS), the Apple operating system (Mac OS), or the like. The operating system contains a display program that displays a cursor on the display device, and the user changes the position of the cursor on the display device through the pointing device.
The browsing program may be a browser, instant messaging software, shopping software, news software, document processing software, etc., wherein the browsing program has a browsing window, and the user reads or views page content in the window through the browsing window, where the page content may be web page content provided by the server 12, or local page content opened by the terminal 11, such as a local document, a cached web page local document, etc.
The execution program may be a built-in function module of the browsing program, for example a built-in software function module of a browser or of document processing software. It may also be a third-party plug-in of the browsing program, for example a browser plug-in that the user actively installs in the browser or a plug-in that is automatically installed in the browser when the user browses a web page. It may also be software that is independent of the browsing program.
The server 12 may be a single server or a cluster, that is, the number of servers may be one or more. The server 12 may be set up, for example, by a web page provider, a browser provider or an advertisement provider.
The terminal 11 is connected to the server 12 through a communication network 13, and the communication network 13 may be a wired network or a wireless network, and the wired network may be a metropolitan area network, a local area network, an optical fiber network, or the like; the wireless network may be a mobile communication network or a wireless fidelity network, etc.
Illustratively, the user opens a browsing window of the browsing program in the terminal 11. Taking the browsing of a web page as an example, the operating system in the terminal 11 sends an access request to the server 12 through the communication network 13; the server 12 responds by returning web page data to the terminal 11 through the communication network 13, and the web page data is displayed in the browsing window of the browsing program. The user browses the page and operates on it through the pointing device, for example by moving the cursor, switching pages, dragging, clicking, double clicking or scrolling. When the page currently browsed by the user is a content type typesetting page, the execution program obtains the stay position data of the cursor in the page, processes the stay position data to obtain target position data, and generates a user identification for identifying the user according to the coordinates of the target position data. The execution program of the terminal 11 sends the user identification to the server 12 via the communication network 13, and the server 12 identifies the user based on the user identification.
When the browsing program is a browser or an embedded browsing window and the page is a local document or a cached web page local document, the terminal 11 reads the document data in the internal storage device or the external storage device and displays the page corresponding to the document data in the browsing window.
Fig. 4 is a flowchart of a method for generating a user identifier according to an exemplary embodiment of the present application, where the method is applied to a terminal installed with a browsing program, and the method specifically includes steps 401, 402, 403 and 404.
Step 401, determining whether the currently displayed page is a content type typesetting page.
While the user browses the page, the execution program determines the typesetting type of the current page in order to judge whether the currently displayed page is a content type typesetting page. In an embodiment, referring to FIG. 5, the content display area 501 of the content type typesetting page is located in the middle of the page, and a large white area 502 exists between each side of the content display area 501 along the width direction and the edge of the page. In this case the user's attention is mainly on browsing the content display area of the current page, and the stay positions of the cursor on the page follow a habitual distribution. When browsing a page of this typesetting type, the user tends to leave the cursor in the white area, either to make it easier to drag the scroll bar or to keep the cursor from interfering with reading the content display area, so the distribution of cursor stay positions within the white area is fairly regular.
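The layout characteristic described above (a centered content display area with a large white area on each side) could be checked along the following lines. This is only a sketch under stated assumptions: the class `PageLayout`, the function `is_content_type_page` and both ratio thresholds are hypothetical and are not taken from the application, and the layout values are assumed to have been measured elsewhere.

```python
from dataclasses import dataclass


@dataclass
class PageLayout:
    page_width: float     # full page width, in pixels
    content_left: float   # left edge of the content display area
    content_width: float  # width of the content display area


def is_content_type_page(layout: PageLayout,
                         min_margin_ratio: float = 0.15,
                         max_offset_ratio: float = 0.05) -> bool:
    """Check for a centered content area flanked by large white areas.

    Both ratios are illustrative assumptions, expressed relative to page width.
    """
    left_margin = layout.content_left
    right_margin = layout.page_width - (layout.content_left + layout.content_width)
    if left_margin < 0 or right_margin < 0:
        return False  # content area does not fit inside the page
    min_margin = min_margin_ratio * layout.page_width
    centered = abs(left_margin - right_margin) <= max_offset_ratio * layout.page_width
    return centered and left_margin >= min_margin and right_margin >= min_margin


article_page = PageLayout(page_width=1920, content_left=460, content_width=1000)
full_width_page = PageLayout(page_width=1920, content_left=0, content_width=1920)
```

With these example values, `article_page` has a 460-pixel white area on each side and passes the check, while `full_width_page` has none and fails it.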
And step 402, when the page is a content typesetting page, acquiring stay position data of a cursor in the page, wherein the stay position data is position data of which stay time of the cursor in the page is longer than a first threshold value.
When the execution program judges that the page browsed by the user is a content type typesetting page, it obtains the position of the cursor in the page. When the cursor remains at the same position for longer than a first threshold value, the cursor is considered to stay at that position, and the execution program records these stay positions to obtain the stay position data. For example, if the cursor is an arrow, the point at the tip of the arrow is the current position of the cursor; when the cursor stays at that position for longer than the first threshold value, the execution program records the position data of the point. The point can be regarded intuitively as a stay point on the page. While the user browses the page, the execution program keeps collecting stay points and forms the stay position data from the collected cursor position data. The stay position data is thus the set of positions at which the cursor stayed in the content type typesetting page for longer than the first threshold value, and can also be regarded as a set of stay points. Because the user has fixed behavior habits when browsing a content type typesetting page, the distribution of the cursor stay positions on such a page objectively reflects those habits.
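The collection of stay positions in step 402 can be sketched as follows, assuming the cursor position is sampled periodically as `(timestamp, x, y)` tuples. The function name `extract_stay_positions`, the sampling scheme and the example values are assumptions made for illustration; the application does not state how the cursor position is sampled.

```python
from typing import List, Tuple

Sample = Tuple[float, int, int]  # (timestamp in seconds, x, y)
Point = Tuple[int, int]


def extract_stay_positions(samples: List[Sample], first_threshold: float) -> List[Point]:
    """Return the positions where the cursor stayed longer than first_threshold."""
    stay_points: List[Point] = []
    if not samples:
        return stay_points
    start_t, cur_x, cur_y = samples[0]
    last_t = start_t
    for t, x, y in samples[1:]:
        if (x, y) == (cur_x, cur_y):
            last_t = t  # still at the same position: extend the current stay
        else:
            if last_t - start_t > first_threshold:
                stay_points.append((cur_x, cur_y))
            start_t, last_t, cur_x, cur_y = t, t, x, y  # start a new run
    if last_t - start_t > first_threshold:  # close the final run
        stay_points.append((cur_x, cur_y))
    return stay_points


samples = [
    (0.0, 100, 200), (0.5, 100, 200), (1.0, 100, 200), (1.5, 100, 200),
    (2.0, 300, 400), (2.5, 320, 410),
    (3.0, 50, 600), (3.5, 50, 600), (4.0, 50, 600), (4.5, 50, 600), (5.0, 50, 600),
]
stay_positions = extract_stay_positions(samples, first_threshold=1.0)
```

In this example the cursor rests at (100, 200) for 1.5 seconds and at (50, 600) for 2.0 seconds, so those two points are returned, while the positions it merely passes through are discarded.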
And step 403, obtaining target position data according to the stay position data of the cursor in the page.
By identifying and acquiring the stay position data of the cursor in the page, the execution program in effect screens the stay position data out of the movement data of the cursor. The execution program then further processes the stay position data to obtain the target position data. In other words, the target position data is obtained by extracting the position data corresponding to the stay points of the cursor and processing that data. The target position data includes at least the coordinates of the stay positions of the cursor in the page.
And step 404, generating a user identification for identifying the user according to the coordinates obtained by the target position data.
Because the target position data at least comprises the coordinates of the stay positions of the cursor in the page, the execution program can obtain the distribution situation of the stay positions of the cursor by analyzing the coordinates of the stay positions, namely the coordinate distribution of each stay point of the cursor in the page, and the user identification is obtained by converting the coordinate distribution.
According to the user identifier generation method provided by the embodiment of the application, whether the currently displayed page is the content type typesetting page is judged, when the page is the content type typesetting page, the stay position data of the cursor in the page is obtained, the target position data is obtained, and the user identifier is generated according to the coordinates obtained by the target position data.
In one embodiment, referring to fig. 6, step 401 further comprises:
step 601, judging whether the access address of the page is matched with a preset page address;
step 602, if the page is matched, the page is a content typesetting page;
and step 603, if the page is not matched, the page is a non-content typesetting page.
In an embodiment, the executing program judges whether the access address of the page is matched with a preset page address to judge whether the page is a content type typesetting page, if so, the page is a content type typesetting page, and if not, the page is a non-content type typesetting page. Specifically, a preset list library may be established, where page addresses of known content typesetting pages are stored in the list library. The preset page address may be stored on the server or may be stored locally. By the method, whether the page is a content typesetting page can be quickly and intuitively judged.
In an embodiment, a maintainer of the executing program or of its background server enters content typesetting pages into the list library based on existing experience. Manual judgment gives high entry accuracy, and the approach suits registering a number of commonly visited pages, so it is efficient to implement.
In another embodiment, the list library may be updated by a system automatic identification manner, for example, a background server of an application program automatically captures pages and analyzes typesetting of the pages, and updates an access address corresponding to the content type typesetting pages into the list library.
In an embodiment, besides the page address, it may also be determined whether the current page is a content typeset page by combining a title, a brief introduction, or a keyword.
In another embodiment, referring to fig. 7, step 401 further comprises:
step 701, identifying a content display area in the page;
step 702, if the content display area is located in the middle of the page and, on each of its two sides along the width direction, a blank area exists between the content display area and the edge of the page whose width as a proportion of the page width is greater than a second threshold, the page is a content typesetting page.
Taking a web page as an example and referring to fig. 5, the content display area 501 in the page is identified first: elements of the page such as text, pictures or frames and their positions are identified, and the area where content such as text or pictures is densely distributed is identified as the content display area 501. The position and width of the blank area 502 are then obtained by combining the outer margin and the inner margin, so as to judge whether the page is a content typesetting page.
In an embodiment, to determine whether the page is a content typesetting page, the content display area in the page may be identified. If the content display area is located in the middle of the page and, on each of its two sides along the width direction, there is a blank area between it and the edge of the page whose width exceeds a second threshold as a proportion of the page width, the page is a content typesetting page. Illustratively, the second threshold is 20% to 30%; this value is merely exemplary, and other suitable ranges may be used in practice. In this way, whether the page is a content typesetting page can be judged accurately and stably.
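The blank-area test described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Box` type, the function name `is_content_layout` and the default threshold of 0.2 (the lower end of the exemplary 20% to 30% range) are assumptions introduced here, and identifying the content display area itself is taken as already done.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Horizontal extent of the content display area, in pixels (hypothetical type)."""
    left: float
    right: float


def is_content_layout(page_width: float, content: Box,
                      second_threshold: float = 0.2) -> bool:
    """Return True if blank areas wider than the threshold flank the content on both sides."""
    if page_width <= 0:
        return False
    left_blank = content.left                 # blank area between left page edge and content
    right_blank = page_width - content.right  # blank area between content and right page edge
    # Each blank area's share of the page width must exceed the second threshold.
    return (left_blank / page_width > second_threshold
            and right_blank / page_width > second_threshold)
```

For a 1000-pixel-wide page whose content spans pixels 250 to 750, both blank areas take 25% of the width, so the page would be treated as a content typesetting page under this sketch.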
In addition, in an embodiment, it may be further determined whether the page is a content typeset page by combining the above two determination methods, referring to fig. 8, step 401 further includes:
step 801, judging whether the access address of the page is matched with a preset page address;
step 802, if the page is matched, the page is a content typesetting page;
step 803, if not, identifying a content display area in the page;
step 804, if the content display area is located in the middle of the page and, on each of its two sides along the width direction, a blank area exists between the content display area and the edge of the page whose width as a proportion of the page width is greater than a second threshold, the page is a content typesetting page.
When judging whether a page is a content typesetting page, the preset page addresses are checked first; if the access address of the page matches none of them, the content display area and the blank area are then used for the judgment. In this way the judgment can be made quickly and intuitively, and its accuracy and stability are improved.
In an embodiment, each time a page is determined to be a content typesetting page in the above manner, its address is added to the list library. This enlarges the set of preset addresses in the list library, avoids repeating the same determination for that page, and improves the efficiency of determination.
For example, when the content typesetting page is a web page, the URL of the page may be stored in the list library. When judging whether a page is a content typesetting page, the URL of the page is obtained first, and the list library is then searched for a preset address that matches it; if one is found, the page is a content typesetting page.
When a user browses a page by using a browsing program, a series of position data is generated as the cursor moves, and this series of position data forms the track of the cursor's movement. Positions at which the cursor stays for longer than the first threshold are the dwell position data of the cursor. For example, fig. 9 is a schematic diagram of the track formed when the cursor moves, in which points A, B, C, etc. are dwell position data of the cursor. The coordinates of the corresponding dwell position data may be obtained by establishing a coordinate system with, for example, the upper left corner of the page window as the origin. Illustratively, the longer the dwell time of a dwell position, the larger the radius with which the corresponding point may be drawn, making the properties of the dwell position data more apparent.
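The combined judgment of steps 801 to 804, together with adding newly recognized pages to the list library, might look like the sketch below. The function name `is_content_page`, the use of a set with exact string matching, and the `layout_check` callback (standing in for the content display area and blank area analysis) are all assumptions made for illustration.

```python
from typing import Callable, Set


def is_content_page(url: str, list_library: Set[str],
                    layout_check: Callable[[str], bool]) -> bool:
    """Check the preset addresses first, then fall back to layout analysis."""
    # Steps 801-802: a match against a preset page address is decisive.
    if url in list_library:
        return True
    # Steps 803-804: otherwise analyse the content display area and blank areas.
    if layout_check(url):
        # Remember the page so later visits are resolved by the address lookup.
        list_library.add(url)
        return True
    return False
```

Because the address lookup runs first, the more expensive layout analysis is only reached for pages that are not yet in the list library.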
Based on this, and as shown with reference to fig. 10, in one embodiment, step 402 includes step 1001 and step 1002.
Step 1001: collecting coordinates of a cursor in a page;
Illustratively, the page whose coordinates are collected may be a web page. The page may be opened with a browser, in which case the coordinates may be collected by the executing program of the browser; or the page may be opened in a browsing window embedded in other software, such as instant messaging software or document processing software with a browsing window, in which case the coordinates may be collected by the executing program of that software. Alternatively, the page may be a document processing page, such as a Word document or a PDF document, opened by document processing software, in which case the coordinates may be collected by the executing program of the document processing software.
In one embodiment, the collected coordinates may be coordinates of a cursor in the same page. In another embodiment, the acquired coordinates may be coordinates of a cursor in a different page, i.e. the user does not stop or reset the acquisition process when switching the browsing page.
Step 1002: and if the stay time of the cursor at the coordinate is longer than a first threshold value, taking the position data corresponding to the coordinate as stay position data of the cursor.
The value of the first threshold may be determined according to actual requirements, and for example, the first threshold may take 1 second, that is, the time of the cursor staying at a certain coordinate is longer than 1 second, and the position data corresponding to the coordinate is the stay position data of the cursor.
The stay position data of the cursor in the page can be simply and conveniently obtained by collecting the coordinate of the cursor in the page and judging the stay time length of the coordinate.
In an embodiment, the dwell time may be determined by the time difference between two adjacent coordinates, so, referring to fig. 11, the acquiring dwell position data of the cursor in the page in step 402 includes, but is not limited to, the following steps:
step 1101: responding to the movement of the cursor, and collecting a plurality of continuous coordinates and corresponding time points in the moving process of the cursor in the page;
For example, when a user browses a web page, the native event mechanism of the browser, such as the mousemove event, may be used. Whether the cursor has moved is judged by whether the mousemove event is triggered, and after the cursor moves, the current coordinates and time point of the cursor are recorded. The movement of the cursor is measured in pixels, and the mousemove event is triggered once for every pixel the cursor moves; this native event mechanism can be executed by the executing program of the browser. When the user uses another application, such as document processing software, the continuous collection of cursor coordinates may likewise be performed by the executing program of that software. The collection of cursor coordinates and time points may be implemented, for example, with a JS (JavaScript) script, or by other means such as the VB (Visual Basic) language.
Step 1102: if the time difference between two adjacent coordinates is greater than a first threshold value, the position data corresponding to the previous coordinate in the two adjacent coordinates is used as the stay position data of the cursor.
Specifically, the time difference between two adjacent coordinates is the residence time of the preceding coordinate, which may be obtained by subtracting the time point corresponding to the preceding coordinate from the time point corresponding to the following coordinate, for example, the time point of the preceding coordinate is 12:00:00, and the time point of the following coordinate is 12:00:30, and then the time difference between the two adjacent coordinates is 30 seconds. In addition, the value of the first threshold may be determined according to actual requirements, and for example, the first threshold may take 1 second, that is, the time that the cursor stays at a certain coordinate is longer than 1 second, and the position data corresponding to the certain coordinate is the stay position data of the cursor.
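Steps 1101 and 1102 can be sketched as below, under the assumption that each collected sample is an `(x, y, timestamp)` tuple with the timestamp in seconds. The names `Sample` and `dwell_points` are hypothetical, and the default threshold of 1 second follows the example value in the text.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (x, y, timestamp in seconds)


def dwell_points(samples: List[Sample],
                 first_threshold: float = 1.0) -> List[Tuple[float, float, float]]:
    """Return (x, y, dwell_seconds) for each coordinate held longer than the threshold."""
    result = []
    # Pair every sample with the one collected immediately after it.
    for (x, y, t), (_, _, t_next) in zip(samples, samples[1:]):
        dwell = t_next - t  # residence time of the earlier coordinate
        if dwell > first_threshold:
            result.append((x, y, dwell))
    return result
```

The last sample never produces a stay point in this sketch, since there is no later coordinate from which to compute its residence time.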
In addition to subtracting the time point corresponding to the previous coordinate from the time point corresponding to the subsequent coordinate, so as to obtain the residence time of the previous coordinate, in an embodiment, the acquisition period of the coordinate may also be set, and the residence time of the coordinate may be obtained by accumulating the times of continuously acquiring the same coordinate and multiplying the times by the acquisition period.
In an embodiment, a plurality of continuous coordinates and their corresponding time points are collected while the cursor moves in the page, and the time difference between two adjacent coordinates, computed from the time points, gives the residence time of the position data corresponding to the earlier coordinate. By checking that residence time, the stay position data of the cursor in the page can be obtained simply and conveniently.
In addition, as the cursor continuously moves, the number of coordinates and time points finally acquired may be too large, and the requirement on the computing capacity of the terminal may be increased. Therefore, in an embodiment, in response to movement of the cursor, a plurality of coordinates and corresponding time points in the process of moving the cursor in the page may be collected according to a preset collection rate.
Specifically, the collection rate is the number of times the cursor coordinates and time points are collected per unit time. A unit time is preset, and once the number of collections within the unit time reaches an upper limit, collection of the cursor coordinates stops for the rest of that unit time. For example, this may be implemented by function throttling. In a JS script, with the unit time set to 1 second (i.e., 1000 milliseconds) and the wait parameter of the throttle function set to 50 milliseconds, the fn function is executed at most 20 times (1000/50 = 20) within 1000 milliseconds, which limits the number of times the fn function is executed per unit time; the fn function may be the function that collects the cursor coordinates. It can be understood that the unit time and the parameters of the throttle function can be adjusted according to the actual situation. In addition, the coordinate collection rate may be limited by presetting the length of the coordinate collection period. Collecting the cursor coordinates continuously at the preset collection rate optimizes collection performance and keeps the collection of cursor coordinates stable.
In addition, pages may have different window sizes, for example because of different screen specifications, so that different coordinates are collected for the same position of the same page. Thus, in one embodiment, the collected coordinates are proportional values of the cursor position relative to the window size of the page.
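The throttling idea can be illustrated with the sketch below. It is written in Python rather than JS purely for illustration; the parameter names `fn` and `wait` follow the text, while the injectable `clock` argument is an assumption added so that the behaviour can be checked without real delays.

```python
import time
from typing import Callable, Optional


def throttle(fn: Callable[..., None], wait: float,
             clock: Callable[[], float] = time.monotonic) -> Callable[..., None]:
    """Wrap fn so that it runs at most once per `wait` (same unit as `clock`)."""
    last_run: Optional[float] = None

    def wrapper(*args, **kwargs) -> None:
        nonlocal last_run
        now = clock()
        # Run only if fn has never run, or at least `wait` has elapsed since it last ran.
        if last_run is None or now - last_run >= wait:
            last_run = now
            fn(*args, **kwargs)

    return wrapper
```

With a clock counting milliseconds and `wait=50`, a wrapped collector called once per millisecond over 1000 milliseconds runs at 0, 50, ..., 950, that is, 20 times, matching the 1000/50 = 20 figure above.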
Specifically, the acquired coordinates are normalized, and the coordinates are converted according to the window width and the window height of the page. For example, a certain coordinate acquired is (a, b), the window width of the page is ScreenWidth, the window height of the page is ScreenHeight, and the coordinate obtained by converting the coordinate is (a/ScreenWidth, b/ScreenHeight). The acquired coordinates are converted according to the window width and the window height of the page, so that the influence of the window specification of the page on the acquired coordinates can be reduced, and the accuracy of the acquired coordinates is improved.
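The conversion (a, b) to (a/ScreenWidth, b/ScreenHeight) is simple enough to state directly in code. In this sketch the function name `normalize` and the check for non-positive window sizes are assumptions added for illustration.

```python
from typing import Tuple


def normalize(a: float, b: float,
              screen_width: float, screen_height: float) -> Tuple[float, float]:
    """Convert a pixel coordinate into proportions of the page window size."""
    if screen_width <= 0 or screen_height <= 0:
        raise ValueError("window width and height must be positive")
    return (a / screen_width, b / screen_height)
```

The same relative position then yields the same coordinate regardless of window size: (960, 270) in a 1920x1080 window and (480, 135) in a 960x540 window both become (0.5, 0.25).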
In one embodiment, referring to fig. 12, step 403 further comprises:
step 1201, taking the acquired dwell position data of the cursor as first position data.
The executing program identifies and acquires the stay position data of the cursor in the page, which in effect filters the stay position data out of the cursor's movement data and takes it as the first position data. The first position data includes the position data corresponding to all stay positions of the cursor in the page; that is, it includes all stay points of the cursor in the content typesetting page.
Step 1202, obtaining target position data according to the first position data.
The executing program further needs to process the first position data in the page to obtain target position data, which can be understood as extracting position data corresponding to all cursor stay points in the page and performing data processing to obtain target position data, where the target position data at least includes coordinates of stay positions of the cursor in the page.
In another embodiment, referring to fig. 13, step 403 further comprises:
step 1301, taking the acquired stay position data of the cursor as first position data.
The executing program identifies and acquires the stay position data of the cursor in the page, which in effect filters the stay position data out of the cursor's movement data and takes it as the first position data. The first position data includes the position data corresponding to all stay positions of the cursor in the page; that is, it includes all stay points of the cursor in the content typesetting page.
Step 1302, removing the second position data from the first position data to obtain target position data.
The executing program further processes the first position data to obtain the target position data. The second position data is stay position data of the cursor that does not reflect the behavior habits of the user. Removing the second position data from the first position data makes the coordinate distribution in the target position data conform better to the user's behavior habits, which improves the identification capability of the user identification.
In one embodiment, the second position data is at least one of: (1) the stay position data of the cursor within the content display area of the page; (2) the stay position data of the cursor within a content display partition in the content display area of the page; (3) the stay position data of the cursor within the area where an interactive element of the page is located; (4) the stay position data of the cursor within an interference area whose distribution density is smaller than a third threshold; (5) the stay position data of the cursor with the longest stay time; (6) the stay position data ranked after a fourth threshold when the stay position data is sorted by stay time on the page from longest to shortest.
For the stay position data of the cursor within the content display area of the page: referring to fig. 5, when browsing a content typesetting page the user may rest the cursor in the content display area 501, for example to select or copy content of interest. In this case the stay position of the cursor is guided by the content in the content display area, so, to improve the identification accuracy of the user identification, the stay position data of the cursor within the content display area 501 is removed from the first position data as second position data. A content area function of the current page may be constructed to determine whether a stay position of the cursor is guided. For example, for a circular content area bounded by $(x-a)^2 + (y-b)^2 = r^2$, a stay point $(x, y)$ satisfying $(x-a)^2 + (y-b)^2 \le r^2$ is a guided stay point, where $(a, b)$ is the center of the content area and $r$ is its radius. A rectangular content area may be constructed likewise.
For the stay position data of the cursor within a content display partition in the content display area of the page: the content display area may include a plurality of content display partitions. For example, as shown in fig. 14, the content display area 501 includes a left column 1401 and a right column 1402, both of which are content display partitions in the content display area 501; the left column 1401 displays text content and the right column 1402 displays navigation links. Under the influence of the text content, the user may rest the cursor in the left column 1401, for example to select and copy with the cursor; under the influence of the right column 1402, the user may click a navigation link. Therefore, in this embodiment, the stay position data of the cursor within the content display partitions in the content display area 501 of the page is removed from the first position data as second position data. For example, in fig. 14, content area functions of the left and right columns are constructed, and the stay position data of the cursor falling within the areas they describe is removed from the first position data as second position data.
For the stay position data of the cursor within the area where an interactive element of the page is located: as shown in fig. 15, besides the text 1501, the content display area 501 includes multiple interactive elements, such as a picture 1502, a button 1503 and a form 1504, which strongly guide the cursor. So that the target position data reflects more accurately the user's habits when browsing content typesetting pages, content area functions of the interactive elements are constructed, and the stay position data of the cursor falling within the areas they describe is removed from the first position data as second position data.
For the stay position data of the cursor within an interference area whose distribution density is smaller than the third threshold: the user's cursor movements and stays in the non-content display area are unconscious while browsing, so the cursor position data in this part expresses the user's subconscious browsing habits, and habitual actions are reproducible and regular under the same conditions. Stay positions in areas of low distribution density do not show this regularity and are therefore treated as interference. Accordingly, the stay positions in the first position data are processed to mark out the areas of the page whose density is smaller than the third threshold, a content area function corresponding to each such area is determined, and the stay position data of the cursor falling within the area it describes is removed from the first position data as second position data. For example, in the schematic diagram of the cursor track in fig. 9, the cursor stay positions in the left area are sparsely distributed, and these stay positions are removed.
For the stay position data of the cursor with the longest stay time: the point with the longest stay time may be caused by an error, for example the user leaving the terminal so that the cursor stays too long, and the position of such a point does not reflect the user's operating habits. The stay position data of the cursor with the longest stay time is therefore removed from the first position data as second position data.
For the stay position data ranked after the fourth threshold when sorted by stay time on the page from longest to shortest: the stay position data is sorted by the cursor's stay time from longest to shortest, and the entries ranked after the fourth threshold are removed from the first position data as second position data. The fourth threshold may be set according to the actual situation; in this embodiment it is 200.
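As a rough sketch of such content area functions, the circular and rectangular cases can be written as predicates and used to filter the first position data. The names `in_circle`, `in_rect` and `remove_guided` are hypothetical, and stay points are assumed to be `(x, y)` tuples in the same coordinate system as the area parameters.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def in_circle(p: Point, a: float, b: float, r: float) -> bool:
    """True if p lies inside or on the circle with centre (a, b) and radius r."""
    x, y = p
    return (x - a) ** 2 + (y - b) ** 2 <= r ** 2


def in_rect(p: Point, left: float, top: float, right: float, bottom: float) -> bool:
    """True if p lies inside or on the axis-aligned rectangle."""
    x, y = p
    return left <= x <= right and top <= y <= bottom


def remove_guided(points: List[Point],
                  is_guided: Callable[[Point], bool]) -> List[Point]:
    """Drop the stay points that fall within a guided (content) area."""
    return [p for p in points if not is_guided(p)]
```

For instance, `remove_guided(points, lambda p: in_circle(p, 0.5, 0.5, 0.2))` keeps only the stay points outside a circular content area centred on the page.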
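The text does not say how distribution density is computed. One simple possibility, shown here purely as an assumption, is to divide the page into square grid cells and count the stay points per cell. The function name `drop_sparse`, the cell size and the default threshold are all illustrative choices, and coordinates are assumed to be normalized to the window size.

```python
from collections import Counter
from typing import List, Tuple

Point = Tuple[float, float]


def drop_sparse(points: List[Point], cell: float = 0.25,
                third_threshold: int = 2) -> List[Point]:
    """Remove stay points lying in grid cells holding fewer than `third_threshold` points."""

    def cell_of(p: Point) -> Tuple[int, int]:
        # Index of the grid cell containing p.
        return (int(p[0] // cell), int(p[1] // cell))

    counts = Counter(cell_of(p) for p in points)
    # Keep only points whose cell is dense enough.
    return [p for p in points if counts[cell_of(p)] >= third_threshold]
```

A kernel density estimate or a clustering method could serve the same purpose; the grid is used here only because it is short and easy to check.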
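The two duration-based rules, (5) and (6), can be sketched as separate filters. Stay points are assumed here to be `(x, y, dwell_seconds)` tuples, and the function names `drop_longest` and `keep_top_ranked` are hypothetical. The default of 200 follows the fourth threshold given for this embodiment.

```python
from typing import List, Tuple

Stay = Tuple[float, float, float]  # (x, y, dwell time in seconds)


def drop_longest(points: List[Stay]) -> List[Stay]:
    """Rule (5): remove the single stay point with the longest dwell time."""
    if not points:
        return []
    idx = max(range(len(points)), key=lambda i: points[i][2])
    return points[:idx] + points[idx + 1:]


def keep_top_ranked(points: List[Stay], fourth_threshold: int = 200) -> List[Stay]:
    """Rule (6): sort by dwell time, longest first, and drop entries ranked after the threshold."""
    ranked = sorted(points, key=lambda p: p[2], reverse=True)
    return ranked[:fourth_threshold]
```

Keeping the rules as separate functions leaves the choice of which ones to apply, and in what order, to the caller.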
It should be noted that, in the embodiment of the present application, the second position data may be one or more combinations of the above-mentioned stay position data that may be removed, for example, the stay position data of the cursor corresponding to the area where the interactive element of the page is located and the stay position data of the cursor corresponding to the interference area with the distribution density smaller than the third threshold may be all removed from the first position data as the second position data.
In one embodiment, referring to FIG. 16, step 404 further comprises:
step 1601, obtaining the centroid of the plurality of target position data according to the coordinates of the target position data.
The target position data can be distributed in a cluster, in the cluster formed by the target position data, the position with the maximum density is the centroid, and the execution program can obtain the coordinate corresponding to the centroid according to the coordinate corresponding to the acquired target position data.
In one embodiment, the centroid of the plurality of target position data may be obtained by a mean shift algorithm. Specifically, a reference position is randomly selected in the cluster formed by the target position data, and a reference circle is drawn with the reference position as its center so that a plurality of target position data fall within it. Then, taking the reference position as the start point and the position of each target position data falling within the reference circle as an end point, a plurality of reference vectors are obtained; the reference vectors are added to obtain a mean shift vector, and the end point of the mean shift vector is taken as the next reference position. This process is iterated until the magnitude of the mean shift vector converges, and the reference position at that time is the centroid of the plurality of target position data.
In one embodiment, the number of clusters formed by the target position data may be plural, each cluster may have a centroid, the resulting centroids may be plural, and the centroids of the clusters having a greater density may be selected as the final centroids. In the process of iteratively obtaining the corresponding centroid of each cluster, the sum of the numbers of the reference vectors involved in obtaining the mean shift vector can be recorded, and on the premise that the reference position of each cluster and the selection standard of the reference circle are the same, the larger the sum of the numbers of the reference vectors is, the larger the density of the cluster is proved, and finally, the centroid of the cluster with the larger sum of the numbers of the reference vectors is taken as the final centroid.
In an embodiment, when the mean shift vector is obtained from the reference vectors, a weight coefficient may further be applied to each reference vector, so as to improve the accuracy and rationality of the centroid finally obtained. For example, the stay duration of the target position data may serve as the basis for the weight coefficient: the longer the stay duration, the larger the weight of the corresponding reference vector when the mean shift vector is obtained.
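A compact sketch of the weighted mean shift iteration is given below. It makes several assumptions: the reference vectors are averaged by weight rather than simply summed (the usual form of mean shift, which keeps the step inside the reference circle), the starting reference position is passed in rather than chosen at random, and the names `mean_shift_centroid`, `radius`, `tol` and `max_iter` are introduced here for illustration.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def mean_shift_centroid(points: Sequence[Point], start: Point, radius: float,
                        weights: Optional[Sequence[float]] = None,
                        tol: float = 1e-6, max_iter: int = 100) -> Point:
    """Iterate the reference position until the mean shift vector converges."""
    if weights is None:
        weights = [1.0] * len(points)
    cx, cy = start
    for _ in range(max_iter):
        sx = sy = total = 0.0
        for (px, py), w in zip(points, weights):
            # Only points inside the reference circle contribute a reference vector.
            if math.hypot(px - cx, py - cy) <= radius:
                sx += w * (px - cx)
                sy += w * (py - cy)
                total += w
        if total == 0:
            break  # no target position data within the reference circle
        shift_x, shift_y = sx / total, sy / total  # weighted mean shift vector
        cx, cy = cx + shift_x, cy + shift_y        # next reference position
        if math.hypot(shift_x, shift_y) < tol:
            break
    return (cx, cy)
```

Passing stay durations, or grouped coefficients such as 1, 2 and 3, as `weights` pulls the centroid towards positions where the cursor stayed longer.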
Step 1602, generating a user identification for user identification based on the coordinates of the centroid.
The user identification is generated from the coordinates of the centroid. It may be calculated from those coordinates by some algorithm, for example a hash algorithm, and the calculated user identification may be one or a combination of numbers and characters. In an embodiment, the coordinates of the centroid may also be used directly as the user identification. By obtaining the centroid of the target stay points and generating the user identification from its coordinates, the resulting user identification reflects the behavior habits of the corresponding user more reasonably and accurately.
To illustrate the centroid calculation intuitively, reference is made to fig. 17. Note that the points in fig. 17 represent coordinates in the target position data and are not content actually displayed in the page.
Referring to fig. 17, point X1 is a reference position and R1 is the reference circle around it. Taking the reference position X1 as the start point and the positions of the target position data falling within the reference circle R1 as end points gives a plurality of reference vectors B1, which are added to obtain a mean shift vector M1. The end point of M1 is the next reference position X2, from which a reference circle R2 is obtained. The process is repeated until the magnitude of the n-th mean shift vector Mn converges; at this time the end point of Mn (i.e., the next reference position Xn+1) is the centroid. For example, when M1 is obtained from B1, each B1 may be given a weight coefficient according to the stay duration of its target position data: the longer the stay duration, the larger the weight coefficient. For example, the reference vectors B1 are divided into three groups in order of stay duration, and the weight coefficient applied when adding each B1 may be 1, 2 or 3.
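One way to derive an identifier from the centroid with a hash algorithm is sketched below. The text only says "a hash algorithm"; the choice of SHA-256, the rounding of the coordinates to two decimal places before hashing, and the function name `user_id_from_centroid` are assumptions made for this illustration.

```python
import hashlib


def user_id_from_centroid(x: float, y: float, digits: int = 2) -> str:
    """Hash the rounded centroid coordinates into a hexadecimal identifier."""
    # Rounding first lets nearby centroids map to the same identifier.
    canonical = f"{x:.{digits}f},{y:.{digits}f}"
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A one-way hash such as this cannot be inverted, so any comparison that needs the coordinates themselves would have to rely on the raw centroid or on a reversible encoding instead.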
In one embodiment, referring to fig. 18, on the basis of steps 401 to 404, the method further includes the following steps:
step 1801, sending authentication information containing the user identifier to a server.
After collecting enough stay position data samples, the executing program can calculate the user identification corresponding to the current samples. The executing program then sends verification information to the server; the verification information includes the user identification generated by any one of the above embodiments. Sending the verification information to the server verifies whether the current user identification matches a specific user, and the result serves as the basis for deciding whether to continue obtaining stay position data of the cursor in the page.
Step 1802, obtaining verification feedback information from the server, where the verification feedback information is used to feed back whether the user identifier has a matching user in the server.
The executing program obtains verification feedback information from the server, which indicates whether the user identifier has a matching user in the server. The server compares the received user identifier against its user identifier library. In one embodiment, the user identifier is the coordinates of the centroid of the target position data, or the user identifier is restored to the coordinates of the centroid according to the algorithm (such as a hash algorithm) used to generate it. During matching, the user identifier is given a certain tolerance: for example, if the coordinates of the centroid are (Xp, Yp), an offset is added to give (Xp±0.03, Yp±0.03), which converts the user identifier into a range. Coordinate data falling within that range is then searched for in the user identifier library, so as to identify the current user.
When there is a matching user, the current user identification is valid. When there is no matching user, the current user identification is invalid; possibly the current user has not been identified before, or the collected stay position data samples are insufficient to reflect the behavior habits of the user.
Based on this, as shown in fig. 19, in one embodiment, on the basis of steps 1801 to 1802, the method further includes at least one of the following steps:
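The range-based comparison can be sketched as follows, assuming the user identifier library is a mapping from a user key to stored centroid coordinates. The function name `find_matching_users` and the dictionary layout are hypothetical; the offset of 0.03 is the example value from the text.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def find_matching_users(centroid: Point, library: Dict[str, Point],
                        offset: float = 0.03) -> List[str]:
    """Return the users whose stored centroid lies within (Xp±offset, Yp±offset)."""
    xp, yp = centroid
    return [user for user, (x, y) in library.items()
            if abs(x - xp) <= offset and abs(y - yp) <= offset]
```

An empty result corresponds to verification feedback indicating that no matching user exists.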
Step 1901, when the verification feedback information indicates that there is a matching user, sending the user identification to the server.
If the verification feedback information indicates that a matching user exists, the current user identification is valid, and the user identification is sent to a server. The server that performs the verification and the server to which the user identification is sent may be different servers. For example, the servers include a verification server and a target server, where the target server belongs to the developer of the current execution program, and the verification server is a user identification resource library shared by different developers. Execution programs on different terminals are verified through the verification server, and after the verification succeeds, the generated user identification is sent to the target server. In another embodiment, the verification server and the target server are the same server; when the verification feedback information indicates that there is a matching user, the execution program stops collecting the stay position data of the cursor, and the user identification does not need to be sent again.
Step 1902, when the verification feedback information indicates that there is no matching user, continuing to acquire the stay position data and regenerating the user identification until a user is successfully matched or a preset matching condition is exceeded.
When the verification feedback information indicates that no matching user exists, the acquired target position data probably does not yet reflect the behavior habits of the user, so the stay position data of the cursor continues to be acquired and the user identification is regenerated. Another way is to enlarge the sample by combining newly collected stay position data of the cursor with the stay position data already collected. For example, in one embodiment, referring to fig. 20, step 1902 includes the following steps:
Step 2001, retaining the stay position data already acquired as a first stay position data set.
The execution program saves the stay position data of the cursor acquired so far as a first stay position data set, which may be the target position data already collected.
Step 2002, continuing to acquire new stay position data to generate a second stay position data set.
The execution program executes steps 101 to 103 of the above embodiment again to generate a second stay position data set, which comprises the newly acquired stay position data of the cursor.
Step 2003, regenerating the user identification according to the first stay position data set and the second stay position data set.
The execution program generates new target position data from the first stay position data set and the second stay position data set, where the filtering of step 1302 of the above embodiment may be executed again to obtain the new target position data. A new user identification is then generated, and verification information including the new user identification is sent to the server again. If the received verification feedback information still indicates that there is no matching user, steps 2001 to 2003 are repeated until a user is successfully matched or a preset matching condition is exceeded.
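Steps 2001 to 2003 can be illustrated with a minimal sketch. It assumes that stay position samples are (x, y) tuples in normalized page coordinates; the function name is hypothetical, and a plain arithmetic mean stands in for the centroid calculation purely to keep the example short.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def regenerate_identifier(first_set: List[Point], second_set: List[Point]) -> Point:
    """Combine retained samples (step 2001) with newly acquired samples
    (step 2002) and recompute the identifier coordinates (step 2003)."""
    combined = first_set + second_set
    if not combined:
        raise ValueError("no stay position samples available")
    # Assumption: a plain mean stands in for the centroid calculation.
    mean_x = sum(p[0] for p in combined) / len(combined)
    mean_y = sum(p[1] for p in combined) / len(combined)
    return (mean_x, mean_y)
```

Because the first set is kept rather than discarded, each retry works on a larger sample than the previous one.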
In an embodiment, the preset matching condition includes at least one of:
First preset matching condition: a preset number of matches. Each time verification feedback information is received from the server, one match is considered complete; if the preset number of matches is exceeded, acquisition of the stay position data of the cursor in the page stops. In this embodiment, the number of matches is set to 3, that is, when verification feedback information has been received from the server 3 times and no user has been matched, acquisition of the stay position data of the cursor in the page stops. The preset number of matches can be adjusted as required.
Second preset matching condition: a preset matching time, that is, the execution time of the user identifier generating method, which is equal to the running time of the execution program. If the running time of the execution program exceeds the preset matching time and no user has been matched, acquisition of the stay position data of the cursor in the page stops. In this embodiment, the preset matching time is set to 15 minutes, that is, when the execution time of the user identifier generating method exceeds 15 minutes and no user has been matched, acquisition of the stay position data of the cursor in the page stops.
Either of the two preset matching conditions may be used on its own, for example, only the first preset matching condition or only the second preset matching condition is used to judge whether to stop acquiring the stay position data of the cursor in the page.
Alternatively, both preset matching conditions may be set at the same time. For example, acquisition of the stay position data of the cursor in the page stops as soon as either the first or the second preset matching condition is met, or acquisition stops only when the first and the second preset matching conditions are both met.
By setting the preset matching condition, excessive consumption of terminal computing resources can be avoided.
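The preset matching conditions can be sketched as a small check. The class and function names are hypothetical; the defaults of 3 matches and 15 minutes come from this embodiment, and the `require_both` flag is an assumed way of switching between the "either condition" and "both conditions" variants.

```python
from dataclasses import dataclass


@dataclass
class MatchLimits:
    max_attempts: int = 3         # preset number of matches in this embodiment
    max_seconds: float = 15 * 60  # preset matching time in this embodiment
    require_both: bool = False    # False: stop when either limit is reached


def should_stop(attempts: int, elapsed_seconds: float, limits: MatchLimits) -> bool:
    """Return True when acquisition of stay position data should stop."""
    attempts_reached = attempts >= limits.max_attempts
    time_exceeded = elapsed_seconds > limits.max_seconds
    if limits.require_both:
        return attempts_reached and time_exceeded
    return attempts_reached or time_exceeded
```

The caller would increment `attempts` each time verification feedback is received without a match.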
In one embodiment, when the preset matching condition is exceeded, the currently generated user identifier is recorded as a new user identifier corresponding to the current user in the user identifier library of the server. When the current user browses the content typesetting page in any browsing program of any terminal, the user identification corresponding to the user can be generated, so that the server can identify the user through the user identification.
In any of the above embodiments, the execution program may be a built-in function module of a browsing program, for example, a built-in software function module of desktop software such as a browser, instant chat software, document processing software, etc., that is, these browsing programs already have the above execution program when installed, and when the user uses the browsing program, the execution program executes the user identifier generating method of any of the above embodiments.
The execution program may also be a browsing program plug-in, such as a plug-in that the user actively installs in the browser, or a plug-in that is automatically installed in the browser when the user browses a web page. The former is published by its author, for example on a plug-in library website, for users to download and install as they choose. The latter is provided by a web server: when the user accesses a link specified by the web server, the web server sends the browsing program plug-in to the browsing program of the terminal, which loads and installs it. The plug-in can then execute the user identifier generating method of any one of the above embodiments and feed the user identifier back to the web server, so that the web server can identify the user through the user identifier. Even if the user switches to a different browsing program or a different terminal, the user can be tracked and identified as long as the user accesses the specific web page of the web server, which realizes user identification across browsers.
The execution program may also be software independent of the browsing program, such as an application running in the background of the operating system, for example security management software or input method software. In this case, the execution program runs in the background, and after the user opens a browsing window, the execution program in the background executes the user identifier generating method of any one of the above embodiments. Even if the user changes the browsing program or the terminal, the user can be tracked and identified as long as the terminal has the execution program installed, which realizes user identification across browsers.
The user identifier generating method provided by the embodiment of the application can be applied to the application environment shown in fig. 3, which comprises a terminal 11, a server 12, and a communication network 13, where the terminal 11 and the server 12 are communicatively connected through the communication network 13. The server 12 may be a single server or a cluster, that is, there may be one or more servers, and the server 12 may be set up by, for example, a web page provider, a browser provider, or an advertisement provider.
Fig. 21 is a flowchart of a method for identifying a user, which is provided in an exemplary embodiment of the present application, and the method is applied to a server, and specifically includes step 2101, step 2102, step 2103 and step 2104.
Step 2101, obtaining a user identifier, wherein the user identifier is generated by the user identifier generation method in any one embodiment;
Step 2102, matching the user identifier in a preset user identifier feature library; and
at least one of the following steps:
Step 2103, completing the identification of the user when the matching is successful;
Step 2104, when the matching fails, storing the user identifier as a new user identifier feature in the user identifier feature library.
In step 2101, the server obtains a user identifier from the terminal through the communication network. The user identifier is generated from stay position data of a cursor in a page, acquired while the user browses a content typesetting page, where the stay position data is position data at which the cursor stays in the page for longer than a first threshold. In this embodiment, the first threshold is 1 second, that is, if the cursor stays at a certain coordinate for longer than 1 second, the position data corresponding to that coordinate is stay position data of the cursor. The terminal obtains target position data from the stay position data of the cursor in the page and generates the user identifier from the target position data. Because the coordinates corresponding to the target position data reflect the behavior habits of the user when browsing content typesetting pages, the user identifier can be used to identify the user effectively.
The user identifier is the coordinates of the centroid of the positions at which the cursor is distributed in the content typesetting page, or a character string generated from the coordinates of the centroid by a certain algorithm (such as a hash algorithm). In this embodiment, a mean shift algorithm is applied to the positions at which the cursor is distributed in the content typesetting page to obtain the centroid of the plurality of target position data. The coordinates of the centroid may be expressed as (Xp, Yp), and they are either used directly as the user identifier or used to generate the user identifier. In one embodiment, the coordinates of the centroid are expressed as absolute values, that is, Xp is the position of the centroid along the width of the page window and Yp is its position along the height of the page window. Because the page windows of different devices differ in size, in another embodiment the coordinates of the centroid are normalized, that is, converted according to the window width and window height of the page. Illustratively, if an acquired coordinate is (Xa, Ya), the window width of the page is ScreenWidth, and the window height of the page is ScreenHeight, the corresponding user identifier value after conversion is (Xa/ScreenWidth, Ya/ScreenHeight), that is, Xp = Xa/ScreenWidth and Yp = Ya/ScreenHeight. Converting the coordinates of the centroid according to the window width and window height of the page reduces the influence of the window size on the coordinates of the centroid and improves the accuracy of identification.
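The normalization and the optional string form of the identifier can be sketched as follows. The function names, the rounding to two decimal places, and the choice of SHA-256 are assumptions made for illustration; the text only says "a certain algorithm (such as a hash algorithm)".

```python
import hashlib
from typing import Tuple


def normalize_centroid(xa: float, ya: float,
                       screen_width: float, screen_height: float) -> Tuple[float, float]:
    """Convert absolute centroid coordinates (Xa, Ya) into (Xp, Yp)."""
    if screen_width <= 0 or screen_height <= 0:
        raise ValueError("window width and height must be positive")
    return (xa / screen_width, ya / screen_height)


def identifier_string(xp: float, yp: float, digits: int = 2) -> str:
    """Derive a string identifier from normalized centroid coordinates.

    Assumption: coordinates are rounded to `digits` decimals and hashed
    with SHA-256; neither choice is specified by the source text.
    """
    text = f"{xp:.{digits}f},{yp:.{digits}f}"
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

SHA-256 is one-way, so a string produced this way supports only exact comparison; range-based matching needs the coordinates themselves.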
In step 2102, the server compares the obtained user identifier against the user identifier library of the server. In an embodiment, the user identifier is the coordinate of the centroid of the target position data. During matching, an offset is added to the centroid coordinate to obtain a range of coordinates; for example, when the offset is 0.03, the range is (Xp±0.03, Yp±0.03). The coordinates corresponding to the user identifiers stored in the user identifier library are then compared with this range, and a coordinate falling within the range is the matched user identifier.
If the matching is successful, identification of the user is complete. If the matching fails, the unmatched user identifier can be added to the user identifier library, thereby creating a new user identifier entry.
According to the user identification method, the server obtains a user identifier that is generated from target position data, which is in turn obtained from the stay position data of the cursor in the page. Because the coordinates corresponding to the target position data reflect the behavior habits of the user when browsing content typesetting pages, the obtained user identifier is directly related to the behavior habits of the user and does not depend on static information of a browser. Even if the user changes the browsing program or the computer device, the operating habits of the user when browsing content typesetting pages remain consistent, so the user identifiers generated by different browsing programs, and even by different computer devices, are consistent or correlated, and the accuracy of user identification can be improved through these user identifiers.
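A minimal sketch of the range matching in step 2102 and the two outcomes in steps 2103 and 2104 follows. The in-memory dictionary standing in for the user identifier library, the function names, and the `user-N` naming scheme are all hypothetical.

```python
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]


def find_match(candidate: Point, library: Dict[str, Point],
               offset: float = 0.03) -> Optional[str]:
    """Return the first stored user whose centroid lies within
    (Xp±offset, Yp±offset) of the candidate, or None."""
    xp, yp = candidate
    for user_id, (x, y) in library.items():
        if abs(x - xp) <= offset and abs(y - yp) <= offset:
            return user_id
    return None


def identify_or_register(candidate: Point, library: Dict[str, Point],
                         offset: float = 0.03) -> Tuple[str, bool]:
    """Return (user_id, matched). On failure the candidate is stored
    as a new entry, mirroring step 2104."""
    matched = find_match(candidate, library, offset)
    if matched is not None:
        return matched, True
    new_id = f"user-{len(library) + 1}"  # hypothetical naming scheme
    library[new_id] = candidate
    return new_id, False
```

A real library would need a rule for the case where several stored coordinates fall inside the range; this sketch simply returns the first one found.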
Fig. 22 is a flowchart of a method for generating a user identifier according to another exemplary embodiment of the present application, including steps 2201 to 2211, where the method for generating a user identifier according to the present embodiment is applied to a terminal, and specific steps are as follows:
Step 2201, judging whether the current access address of the page matches a preset page address; if so, executing step 2203; if not, executing step 2202.
A browsing program is run to access the page, and the current access address of the page is compared with the preset page addresses in a list library, where the preset page addresses in the list library are addresses of known content typesetting pages and may be acquired from a server or stored locally at the terminal. For example, a web page is accessed through a browser, by searching or by following a hyperlink, and the web address of the web page is obtained for matching.
Step 2202, identifying a content display area in a page, judging whether the page is a content typesetting page, if so, executing step 2203; if not, ending.
Whether the content display area in the page is located in the middle of the page is identified. If it is, and a white area whose width as a proportion of the page width is greater than a second threshold exists between each of the two sides of the content display area along the width direction and the corresponding edge of the page, the page is a content typesetting page. In this embodiment, the value of the second threshold is 20% to 30%.
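The check in step 2202 can be sketched as below, assuming the left and right edges of the content display area are already known in pixels. The function name and the `center_tolerance` parameter are assumptions; the default second threshold of 0.2 is the lower end of the 20% to 30% range given in this embodiment.

```python
def is_content_typesetting_page(page_width: float,
                                content_left: float,
                                content_right: float,
                                second_threshold: float = 0.2,
                                center_tolerance: float = 0.05) -> bool:
    """Return True if the content area is centered and both side
    margins exceed `second_threshold` of the page width."""
    if page_width <= 0 or content_right <= content_left:
        return False
    left_margin = content_left / page_width
    right_margin = (page_width - content_right) / page_width
    margins_wide = left_margin > second_threshold and right_margin > second_threshold
    # Assumption: "in the middle" means the two margins differ by at most
    # `center_tolerance` of the page width.
    centered = abs(left_margin - right_margin) <= center_tolerance
    return margins_wide and centered
```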
In step 2203, coordinates of the cursor in the page are collected.
In step 2204, the dwell position data of the cursor in the page is obtained as the first position data.
If the stay duration of the cursor at a coordinate is greater than the first threshold, the position data corresponding to that coordinate is used as stay position data of the cursor. The value of the first threshold may be determined according to actual requirements; in an exemplary embodiment it is 1 second, that is, if the cursor stays at a certain coordinate for longer than 1 second, the position data corresponding to that coordinate is stay position data of the cursor. In response to movement of the cursor, a plurality of consecutive coordinates and their corresponding time points are acquired as the cursor moves in the page; if the time difference between two adjacent coordinates is greater than the first threshold, the position data corresponding to the earlier of the two coordinates is used as stay position data of the cursor.
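The adjacent-coordinate rule can be sketched as follows, assuming cursor samples arrive as (x, y, timestamp in seconds) tuples in chronological order. The function name and tuple layout are illustrative assumptions.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (x, y, timestamp in seconds)
Dwell = Tuple[float, float, float]   # (x, y, stay duration in seconds)


def extract_stay_positions(samples: List[Sample],
                           first_threshold: float = 1.0) -> List[Dwell]:
    """Return the earlier coordinate of every adjacent pair whose time
    difference exceeds `first_threshold`, with the stay duration."""
    stays: List[Dwell] = []
    for (x, y, t), (_, _, t_next) in zip(samples, samples[1:]):
        gap = t_next - t
        if gap > first_threshold:
            stays.append((x, y, gap))
    return stays
```

The last sample never yields a stay position here, because there is no later coordinate to measure its duration against.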
Step 2205, removing the second position data from the first position data to obtain target position data.
Wherein the second location data comprises at least one of:
the dwell position data of the corresponding cursor in the content display area of the page;
the stay position data of the cursor corresponding to the content display partition in the content display area of the page;
The stay position data of the cursor corresponding to the interference area with the distribution density smaller than a third threshold value;
the dwell position data of the cursor with the longest dwell time;
the stay position data ranked after a fourth threshold when the stay position data is ordered by the stay duration of the cursor on the page from long to short, where the fourth threshold in this embodiment is 200.
Step 2206, determining whether the number of samples collected in the target position data reaches a fifth threshold; if yes, executing step 2207; if not, returning to step 2201.
The samples collected in the target position data are the stay position data of the cursor that satisfy the screening rule of step 2205. The fifth threshold may be set as required; in this embodiment it is a dynamically changing value with an initial value of 100.
Step 2207, obtaining the centroid of the plurality of target position data from the target position data.
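The screening in step 2205 can be sketched for three of the listed rules. Applying them together and in this order is an assumption, since the text only requires at least one of them; the density-based rule is omitted, the content display area is reduced to a horizontal extent, and the function name is hypothetical.

```python
from typing import List, Tuple

Dwell = Tuple[float, float, float]  # (x, y, stay duration in seconds)


def filter_target_positions(stays: List[Dwell],
                            content_left: float,
                            content_right: float,
                            fourth_threshold: int = 200) -> List[Dwell]:
    """Remove second position data and return target position data."""
    # Rule: drop stay positions inside the content display area
    # (assumption: only the horizontal extent is checked).
    outside = [s for s in stays if not (content_left <= s[0] <= content_right)]
    # Order by stay duration from long to short.
    ranked = sorted(outside, key=lambda s: s[2], reverse=True)
    # Rules: drop the single longest stay, and drop everything ranked
    # after the fourth threshold.
    return ranked[1:fourth_threshold]
```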
Since the target position data corresponds to the distribution of cursor rest positions on the page, the centroid of these coordinate positions can be found by means of a mean shift algorithm.
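A minimal mean shift sketch for step 2207 follows. The text names the algorithm but not its kernel, bandwidth, or starting point, so the flat kernel, the bandwidth of 0.1 in normalized coordinates, and starting from the sample with the most neighbours are all assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def mean_shift_centroid(points: List[Point], bandwidth: float = 0.1,
                        tol: float = 1e-6, max_iter: int = 100) -> Point:
    """Find one density mode of `points` with a flat-kernel mean shift."""
    if not points:
        raise ValueError("no target position data")

    def neighbours(center: Point) -> List[Point]:
        return [p for p in points
                if math.hypot(p[0] - center[0], p[1] - center[1]) <= bandwidth]

    # Assumption: start from the sample with the most neighbours.
    center = max(points, key=lambda p: len(neighbours(p)))
    for _ in range(max_iter):
        window = neighbours(center)
        if not window:
            break  # no samples nearby; keep the current estimate
        new_center = (sum(p[0] for p in window) / len(window),
                      sum(p[1] for p in window) / len(window))
        shift = math.hypot(new_center[0] - center[0], new_center[1] - center[1])
        center = new_center
        if shift < tol:
            break
    return center
```

With a flat kernel, isolated stay positions far from the main cluster do not pull the centroid, which a plain average of all samples would not guarantee.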
Step 2208, generating a user identification for user identification according to the coordinates of the centroid.
The user identifier is generated from the coordinates of the centroid, or the coordinates of the centroid are used directly as the user identifier, where the coordinates of the centroid may be expressed as (Xp, Yp). In this embodiment, the coordinates of the centroid are the relative proportional position of the centroid in the page window: if the acquired coordinate of the centroid is (Xa, Ya), the window width of the page is ScreenWidth, and the window height of the page is ScreenHeight, the corresponding user identifier value after conversion is (Xa/ScreenWidth, Ya/ScreenHeight), that is, Xp = Xa/ScreenWidth and Yp = Ya/ScreenHeight.
Step 2209, sending the user identifier to a server for verification, if the verification is successful, ending, and if the verification is failed, executing step 2210.
The server compares the acquired user identification with the user identification library in the server and feeds back verification feedback information to the terminal, informing the terminal of the identification result.
Step 2210, determining whether the preset matching condition is exceeded, if yes, executing step 2211, if not, increasing the value of the fifth threshold, and returning to step 2201.
Step 2211, marking the current generated user identification as a new user identification and sending the new user identification to the server.
The preset matching conditions include a preset number of matches and a preset matching time; if either one is exceeded, the preset matching condition is judged to be exceeded. In this embodiment, the preset number of matches is 3 and the preset matching time is 15 minutes. If the preset matching condition is not exceeded, the value of the fifth threshold is increased so that more samples are collected in step 2206, which improves the accuracy of the user identification generated next time. If the preset matching condition is exceeded, the terminal sends the user identification marked as a new user identification to the server, and the server adds it to the user identification library. When the user browses the page for a long time, multiple user identifications may be generated. In an embodiment, when the user stops browsing the page (for example, by closing the browser), the user identification that has occurred most often may be taken as corresponding to the current user, or the user identification generated last may be taken as corresponding to the current user.
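The choice between the most frequent and the most recent user identification can be sketched as below. It assumes identifiers are hashable values such as rounded coordinate tuples; the function name and flag are hypothetical.

```python
from collections import Counter
from typing import Hashable, List


def pick_session_identifier(identifiers: List[Hashable],
                            use_most_frequent: bool = True) -> Hashable:
    """Choose one identifier from those generated in a browsing session."""
    if not identifiers:
        raise ValueError("no user identification was generated")
    if use_most_frequent:
        # Ties resolve to the identifier that was generated first.
        return Counter(identifiers).most_common(1)[0][0]
    return identifiers[-1]
```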
In summary, in the user identifier generating method provided in this embodiment, by determining whether the currently displayed page is a content-type typesetting page, when the page is a content-type typesetting page, the stay position data of the cursor in the page is obtained and the target position data is obtained, and the user identifier is generated according to the coordinates obtained by the target position data.
In the above embodiment, the user identifier is generated by the terminal. In an embodiment, the user identifier may also be generated by the server. For example, the terminal determines whether the currently displayed page is a content typesetting page; when it is, the terminal acquires the stay position data of the cursor in the page, obtains the target position data from the stay position data, and sends the target position data to the server, and the server generates the user identifier for identifying the user from the coordinates obtained from the target position data. In another embodiment, the terminal is responsible for judging whether the currently displayed page is a content typesetting page and for acquiring the stay position data of the cursor in the page; the stay position data is then sent to the server, and the server screens the data to obtain the target position data and generates the user identifier for identifying the user. In these variants only the executing entity differs; the specific execution details of each step are the same.
As an example, for explaining the user identifier generating method and the user identifying method according to the embodiments of the present application, the following description is made for different application scenarios:
The first scene is applied to a scene of using a browser by a user, wherein an execution program for executing the user identification generation method is arranged in the browser, and the specific process is as follows:
While the user browses a web page with the browser, the browser judges whether the web page currently browsed is a content typesetting page. If it is, the browser acquires the stay position data generated as the cursor moves, screens the target position data out of the stay position data, and calculates the centroid of the target position data. Finally, the browser generates the user identification corresponding to the user from the centroid and sends it to the server. The server uses the user identification to identify the user, that is, it searches the user identification library for the same user identification. If the identification is unsuccessful, the user is treated as a new user, and the corresponding user identification is stored in the user identification library for subsequent identification of the user.
The second scene is also a scene in which the user uses a browser. Unlike the previous scene, the browser is an ordinary browser without a built-in execution program for executing the user identifier generating method of the above embodiments. The specific process is as follows:
When the user browses with the browser and accesses a web page provided by a web server, the web page prompts installation of a browser plug-in. If the browser is set to install plug-ins automatically, the plug-in is installed automatically; if the browser is set to block automatic installation, the user is prompted to click to install. After installation, the browser plug-in judges whether the web page currently browsed is a content typesetting page. If it is, the plug-in acquires the stay position data generated as the cursor moves, screens the target position data out of the stay position data, and calculates the centroid of the target position data. Finally, the plug-in generates the user identification corresponding to the user from the centroid and sends it to the server. The server uses the user identification to identify the user, that is, it searches the user identification library for the same user identification. If the identification is unsuccessful, the user is treated as a new user, and the corresponding user identification is stored in the user identification library for subsequent identification of the user.
Scene three is applied to a user using software other than a traditional browser. Illustratively, referring to fig. 23, the user browses web pages in a browsing window embedded in chat software. The chat software judges whether the web page currently browsed is a content typesetting page. If it is, the chat software acquires the stay position data generated as the cursor moves, screens the target position data out of the stay position data, and calculates the centroid of the target position data. Finally, the chat software generates the user identification corresponding to the user from the centroid and sends it to the server. The server uses the user identification to identify the user, that is, it searches the user identification library for the same user identification. If the identification is unsuccessful, the user is treated as a new user, and the corresponding user identification is stored in the user identification library for subsequent identification of the user.
Scene four is applied to a user browsing a document with document processing software. The document processing software judges whether the document page currently browsed is a content typesetting page. If it is, the software acquires the stay position data generated as the cursor moves, screens the target position data out of the stay position data, and calculates the centroid of the target position data. Finally, the document processing software generates the user identification corresponding to the user from the centroid and sends it to the server. The server uses the user identification to identify the user, that is, it searches the user identification library for the same user identification. If the identification is unsuccessful, the user is treated as a new user, and the corresponding user identification is stored in the user identification library for subsequent identification of the user.
In scenes one to four, the actions of judging whether the page currently browsed is a content typesetting page, acquiring the stay position data generated as the cursor moves, screening the target position data out of the stay position data, calculating the centroid, generating the user identification corresponding to the user from the centroid, and sending the user identification to the server may also be executed by software running in the background of the operating system, such as security software, chat software, or an input method. For example, for scenes one to three, the data of the browser or browsing window may be acquired by software running in the background of the operating system, and for scene four, the data of the document processing software may be acquired by software running in the background of the operating system.
In the above scenes, the user identification may be generated by the terminal or by the server. For example, in the first scene, after the browser screens the target position data out of the stay position data, the target position data is sent to the server and the server calculates the centroid and generates the user identification, or the browser sends the centroid to the server and the server generates the user identification. In the second scene, after the browser plug-in screens the target position data out of the stay position data, the target position data is sent to the server and the server calculates the centroid and generates the user identification, or the browser plug-in sends the centroid to the server and the server generates the user identification. In the third scene, after the third-party software screens the target position data out of the stay position data, the target position data is sent to the server and the server calculates the centroid and generates the user identification, or the third-party software sends the centroid to the server and the server generates the user identification. In the fourth scene, after the document processing software screens the target position data out of the stay position data, the target position data is sent to the server and the server calculates the centroid and generates the user identification, or the document processing software sends the centroid to the server and the server generates the user identification. In the fifth scene, after the software running independently in the background screens the target position data out of the stay position data, the target position data is sent to the server and the server calculates the centroid and generates the user identification, or the software running independently in the background sends the centroid to the server and the server generates the user identification.
Fig. 24 is a block diagram of a user identifier generating device according to an embodiment of the present application, where the device includes:
a judging module 2401, configured to judge whether a currently displayed page is a content-type typesetting page, where a content display area of the content-type typesetting page is located in the middle of the page along the width direction, and a blank area exists between each side of the content display area along the width direction and the corresponding edge of the page;
a position data obtaining module 2402, configured to obtain, when the page is a content-type typesetting page, first position data of a cursor in the content-type typesetting page, where the first position data is position data for which the stay duration of the cursor in the content-type typesetting page is greater than a first threshold;
a target position data generating module 2403, configured to obtain target position data according to the stay position data of the cursor in the page;
an identifier generating module 2404, configured to generate a user identifier for identifying a user according to the coordinates obtained from the target position data.
In the user identifier generating device provided by this embodiment of the present application, the judging module 2401 judges whether the currently displayed page is a content-type typesetting page; when it is, the position data obtaining module 2402 obtains the stay position data of the cursor in the page, the target position data generating module 2403 obtains the target position data, and the identifier generating module 2404 generates the user identifier according to the coordinates obtained from the target position data. Because the layout of a content-type typesetting page is designed to display text content, a user exhibits specific behavior habits when browsing such a page, and the target position data can accurately reflect those habits. The user identifier generated from the target position data is therefore directly related to the user's behavior habits and does not depend on static information of the browser. Even if the user switches to a different browsing program or computer device, the user's operating habits when browsing content-type typesetting pages remain consistent or correlated, so the user identifiers generated on different browsing programs, and even on different computer devices, are consistent or correlated.
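As a non-limiting illustration, the first three modules of the generating device can be sketched in Python as follows. The sketch assumes that horizontal positions are expressed as ratios of the window width, that cursor samples are recorded only when the cursor moves (so that a long gap between two adjacent samples means the cursor rested at the earlier one), and that the only data removed is the stay position data falling inside the content display area. The function names and default thresholds are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (timestamp in seconds, x ratio, y ratio)
Point = Tuple[float, float]          # (x ratio, y ratio) relative to the window size


def is_content_layout(content_left: float, content_right: float,
                      min_margin_ratio: float = 0.1) -> bool:
    """Judging step: both side margins must exceed a threshold share of the width."""
    left_margin = content_left
    right_margin = 1.0 - content_right
    return left_margin > min_margin_ratio and right_margin > min_margin_ratio


def dwell_points(samples: List[Sample], first_threshold: float) -> List[Point]:
    """Keep the earlier of two adjacent samples when the gap exceeds the threshold."""
    dwell = []
    for (t0, x0, y0), (t1, _, _) in zip(samples, samples[1:]):
        if t1 - t0 > first_threshold:
            dwell.append((x0, y0))
    return dwell


def target_points(dwell: List[Point], content_left: float,
                  content_right: float) -> List[Point]:
    """Remove stay positions inside the content display area, keeping margin ones."""
    return [(x, y) for x, y in dwell if not (content_left <= x <= content_right)]


# Example: a page whose content display area spans 20% to 80% of the window width.
samples = [(0.0, 0.05, 0.2), (2.0, 0.5, 0.3), (2.1, 0.5, 0.4), (5.0, 0.9, 0.5)]
if is_content_layout(0.2, 0.8):
    stays = dwell_points(samples, first_threshold=1.0)
    targets = target_points(stays, 0.2, 0.8)
```

The remaining target points would then be passed to the identifier generating step; the other removal criteria listed in the embodiments (interactive elements, low-density interference areas, and so on) could be added as further filters of the same shape.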
Fig. 25 is a block diagram of a user identification device according to an embodiment of the present application, where the user identification device includes:
a user identifier receiving module 2501, configured to obtain a user identifier generated by the user identifier generating device in the above embodiment;
a matching module 2502, configured to match the user identifier in a preset user identifier feature library;
a confirmation module 2503, configured to complete identification of the end user after successful matching;
and a creation module 2504, configured to store the user identifier as a new user identifier feature in the user identifier feature library after the matching fails.
In the user identification device provided by this embodiment of the present application, the user identifier receiving module 2501 obtains the user identifier, which is generated from target position data that is in turn obtained from the stay position data of the cursor in the page. Because the coordinates corresponding to the target position data reflect the user's behavior habits when browsing content-type typesetting pages, the obtained user identifier is directly related to those habits and does not depend on static information of the browser. Even if the user changes the browsing program or the computer device, the user's operating habits when browsing content-type typesetting pages remain consistent, so the user identifiers generated on different browsing programs, and even on different computer devices, are consistent or correlated. Using this user identifier can therefore improve the accuracy of user identification.
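As a non-limiting illustration of the matching, confirmation, and creation modules, the following Python sketch keeps the user identifier feature library as an in-memory list of coordinate pairs. Treating the identifier as a pair of coordinates and declaring a match when the Euclidean distance is within a tolerance are assumptions made for this example; the embodiment does not specify the matching rule, and the class and method names are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


class FeatureLibrary:
    """Hypothetical in-memory user identifier feature library."""

    def __init__(self, tolerance: float = 0.05) -> None:
        self.tolerance = tolerance          # assumed maximum distance for a match
        self.features: List[Point] = []

    def match(self, identifier: Point) -> Optional[int]:
        """Return the index of the closest stored feature within the tolerance."""
        best_index, best_distance = None, self.tolerance
        for index, feature in enumerate(self.features):
            distance = math.dist(identifier, feature)
            if distance <= best_distance:
                best_index, best_distance = index, distance
        return best_index

    def identify(self, identifier: Point) -> Tuple[int, bool]:
        """Confirm a known user, or store the identifier as a new feature."""
        index = self.match(identifier)
        if index is not None:
            return index, True              # matching succeeded
        self.features.append(identifier)    # matching failed: create a new feature
        return len(self.features) - 1, False
```

With a tolerance of 0.05, a first identifier of (0.1, 0.5) is stored as a new feature, and a later identifier of (0.12, 0.51) is matched to it because the distance between the two is about 0.022.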
Fig. 26 is a block diagram of a user identifier generating device according to an embodiment of the present application, where the device includes: at least one processor 2601, at least one memory 2602, and at least one program stored on the memory 2602 and executable on the processor 2601 to implement the user identification generation method in the above embodiment.
The processor 2601 and the memory 2602 may be connected by a bus or in another manner; Fig. 26 takes connection by a bus as an example.
Fig. 27 is a block diagram of a user identification device according to an embodiment of the present application, where the user identification device includes: at least one processor 2701, at least one memory 2702, and at least one program stored on the memory 2702 and executable on the processor 2701 to implement the user identification method in the above-described embodiment.
The processor 2701 and the memory 2702 may be connected by a bus or in another manner; Fig. 27 takes connection by a bus as an example.
Referring to fig. 28, an embodiment of the present application further provides a terminal, where the terminal includes the user identifier generating device in the foregoing embodiment.
Referring to fig. 29, an embodiment of the present application further provides a server, which includes the user identifying apparatus in the above embodiment.
An embodiment of the present application also provides a computer-readable storage medium storing computer-executable instructions. When executed by a processor or controller, for example by the processor 2601 in fig. 26, the instructions may cause the processor 2601 to perform the user identification generation method in the embodiment of the present application, for example, to perform the above-described method steps 401 to 404 in fig. 4, method steps 601 to 603 in fig. 6, method steps 701 to 702 in fig. 7, method steps 801 to 804 in fig. 8, method steps 1001 to 1002 in fig. 10, method steps 1101 to 1102 in fig. 11, method steps 1201 to 1202 in fig. 12, method steps 1301 to 1302 in fig. 13, method steps 1602 to 1602 in fig. 16, method steps 1801 to 1802 in fig. 18, method steps 1901 to 1902 in fig. 19, method steps 2001 to 2003 in fig. 20, and method steps 2201 to 2211 in fig. 22.
As another example, when executed by the processor 2701 in fig. 27, the instructions may cause the processor 2701 to perform the user identification method in the embodiment of the present application, for example, to perform the method steps 2101 to 2104 in fig. 21 described above.
The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
While the preferred embodiments of the present application have been described in detail, the present application is not limited to the above embodiments, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present application, and these equivalent modifications and substitutions are intended to be included in the scope of the present application as defined in the appended claims.

Claims (21)

1. A user identification generation method, comprising:
judging whether the currently displayed page is a content typesetting page or not;
when the page is a content typesetting page, acquiring stay position data of a cursor in the page, wherein the stay position data is position data of which the stay time of the cursor in the page is longer than a first threshold value;
obtaining target position data according to the stay position data of the cursor in the page;
generating a user identifier for identifying a user according to the coordinates obtained by the target position data;
wherein the generating a user identifier for identifying a user according to the coordinates obtained by the target position data comprises: the target position data comprising at least coordinates of the stay positions of the cursor in the page, analyzing the coordinates of the stay positions to obtain a coordinate distribution of the cursor stay points in the page, and converting the coordinate distribution to obtain the user identifier;
the obtaining target position data according to the stay position data of the cursor in the page comprises the following steps:
taking the acquired stay position data of the cursor as first position data;
and removing second position data from the first position data to obtain target position data, wherein the second position data comprises at least one of the following:
the stay position data of the cursor corresponding to the content display area of the page;
the stay position data of the cursor corresponding to the content display partition in the content display area of the page;
the stay position data of the cursor corresponding to the area where the interactive element of the page is located;
the stay position data of the cursor corresponding to the interference area with the distribution density smaller than a third threshold value;
the stay position data of the cursor with the longest stay time;
and the stay position data ranked after a fourth threshold when the stay position data are sorted by the stay time of the cursor on the page from longest to shortest.
2. The method of claim 1, wherein the step of determining whether the currently displayed page is a content-type layout page comprises one of:
judging whether the access address of the page is matched with a preset page address, if so, the page is a content typesetting page;
identifying a content display area in the page, and if the content display area is positioned in the middle of the page and a blank area whose width as a proportion of the page width is greater than a second threshold exists between each side of the content display area along the width direction and the edge of the page, the page is a content typesetting page;
judging whether the access address of the page is matched with a preset page address, and if so, the page is a content typesetting page; if a blank area whose width as a proportion of the page width is greater than a second threshold exists between each side of the content display area along the width direction and the edge of the page, the page is a content typesetting page.
3. The method of claim 1, wherein the obtaining dwell position data of the cursor in the page comprises:
collecting coordinates of the cursor in the page;
and if the stay time of the cursor at the coordinates is longer than a first threshold value, taking the position data corresponding to the coordinates as stay position data of the cursor.
4. The method of claim 1, wherein the obtaining dwell position data of the cursor in the page comprises:
responding to the movement of a cursor, and collecting a plurality of continuous coordinates and corresponding time points in the moving process of the cursor in the page;
and if the time difference between the two adjacent coordinates is larger than the first threshold value, taking the position data corresponding to the previous coordinate in the two adjacent coordinates as the stay position data of the cursor.
5. The method of claim 4, wherein the obtaining the plurality of coordinates and the corresponding time in the moving of the cursor within the page comprises:
and acquiring a plurality of coordinates and corresponding time points continuously in the moving process of the cursor in the page according to a preset acquisition rate.
6. The method of claim 1, wherein the coordinates refer to a proportional value of a position of the cursor relative to a window size of the page.
7. The method according to any one of claims 1 to 6, wherein the generating a user identification for identifying a user from the coordinates obtained from the target location data comprises:
obtaining mass centers of a plurality of target position data according to the coordinates of the target position data;
and generating a user identification for user identification according to the coordinates of the centroid.
8. The method of claim 7, wherein the centroids of the plurality of target location data are obtained by a mean shift algorithm.
9. A method according to any one of claims 1 to 8, wherein the page is displayed in a browser.
10. The method according to any one of claims 1 to 8, further comprising:
sending verification information containing the user identification to a server;
and acquiring verification feedback information from the server, wherein the verification feedback information is used for feeding back whether the user identification is matched with the user in the server.
11. The method of claim 10, further comprising at least one of:
when the verification feedback information indicates that a matched user exists, the user identification is sent to the server;
and when the verification feedback information indicates that no matched user exists, continuing to acquire the stay position data and regenerating the user identification until a matching user is found or a preset matching condition is exceeded.
12. The method of claim 11, wherein the continuing to collect the stay location data and regenerating the user identification comprises:
reserving the existing acquired stay position data as a first stay position data set;
continuously acquiring new stay position data to generate a second stay position data set;
regenerating a user identification from the first dwell position data set and the second dwell position data set.
13. The method according to claim 11 or 12, wherein the preset matching conditions comprise at least one of:
a preset number of matching attempts;
a preset matching duration.
14. A user identification method, comprising:
acquiring a user identification generated by the method of any one of claims 1 to 13;
matching the user identification in a preset user identification feature library; and
further comprising at least one of the following steps:
when the matching is successful, the identification of the user is completed;
and when the matching is failed, the user identification is used as a new user identification characteristic and is stored in the user identification characteristic library.
15. A user identification generating device, comprising:
the judging module is used for judging whether the currently displayed page is a content type typesetting page or not, wherein a content display area of the content type typesetting page is positioned in the middle of the content type typesetting page along the width direction, and a blank area exists between each side of the content display area along the width direction and the edge of the content type typesetting page;
the position data acquisition module is used for acquiring first position data of a cursor in the content type typesetting page when the page is the content type typesetting page, wherein the first position data is position data for which the stay time of the cursor in the content type typesetting page is longer than a first threshold value;
the target position data generation module is used for obtaining target position data according to the stay position data of the cursor in the page; the obtaining target position data according to the stay position data of the cursor in the page comprises the following steps: taking the acquired stay position data of the cursor as first position data; and removing second position data from the first position data to obtain target position data, wherein the second position data comprises at least one of the following: the stay position data of the cursor corresponding to the content display area of the page; the stay position data of the cursor corresponding to the content display partition in the content display area of the page; the stay position data of the cursor corresponding to the area where the interactive element of the page is located; the stay position data of the cursor corresponding to the interference area with the distribution density smaller than a third threshold value; the stay position data of the cursor with the longest stay time; the stay position data ranked after a fourth threshold when the stay position data are sorted by the stay time of the cursor on the page from longest to shortest;
the identification generation module is used for generating a user identification for identifying a user according to the coordinates obtained by the target position data; the generating a user identification for identifying a user according to the coordinates obtained by the target position data comprises the following steps: the target position data at least comprises coordinates of the stay positions of the cursor in the page, the coordinates of the stay positions are analyzed to obtain a coordinate distribution of the cursor stay points in the page, and the coordinate distribution is converted to obtain the user identification.
16. A user identification device, comprising:
a user identifier receiving module, configured to obtain the user identifier generated by the user identifier generating device according to claim 15;
the matching module is used for matching the user identification in a preset user identification feature library;
the confirmation module is used for completing the identification of the terminal user after the successful matching;
and the creation module is used for storing the user identification as a new user identification characteristic into the user identification characteristic library after the matching fails.
17. A user identification generation apparatus, comprising:
at least one memory;
at least one processor;
at least one program;
the program is stored in a memory, and the processor executes the at least one program to implement the user identification generation method of any of claims 1-13.
18. A user identification device, comprising:
at least one memory;
at least one processor;
at least one program;
the program is stored in a memory, and the processor executes the at least one program to implement the user identification method of claim 14.
19. A terminal comprising the user identification generating device according to claim 15.
20. A server comprising the user identification device of claim 16.
21. A computer readable storage medium storing computer executable instructions for performing the user identification generation method of any one of claims 1-13, or the user identification method of claim 14.
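As a non-limiting illustration of obtaining a centroid of the target position data by a mean shift algorithm, as recited in claims 7 and 8, the following Python sketch runs a flat-kernel mean shift from every point and returns the mode supported by the most points. The flat kernel, the bandwidth value, the stopping tolerance, and the rule for choosing among several modes are assumptions made for this example only and are not taken from the claims.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def mean_shift_mode(points: List[Point], bandwidth: float = 0.1,
                    max_iter: int = 100, tol: float = 1e-6) -> Point:
    """Return the densest mode found by flat-kernel mean shift over the points."""
    if not points:
        raise ValueError("at least one target position is required")
    best_mode, best_support = points[0], 0
    for start in points:
        center = start
        for _ in range(max_iter):
            # Collect the points that fall inside the window around the center.
            window = [p for p in points if math.dist(p, center) <= bandwidth]
            if not window:
                break
            # Shift the center to the mean of the points inside the window.
            shifted = (sum(x for x, _ in window) / len(window),
                       sum(y for _, y in window) / len(window))
            moved = math.dist(shifted, center)
            center = shifted
            if moved < tol:
                break
        # Rank each converged mode by how many points it covers.
        support = sum(1 for p in points if math.dist(p, center) <= bandwidth)
        if support > best_support:
            best_mode, best_support = center, support
    return best_mode
```

Compared with a plain average over all points, this variant is less affected by isolated stay positions far from the main cluster, because such points fall outside the window of the dominant mode.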
CN202010190953.4A 2020-03-18 2020-03-18 User identification generation method, user identification method and device Active CN111400575B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010190953.4A CN111400575B (en) 2020-03-18 2020-03-18 User identification generation method, user identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010190953.4A CN111400575B (en) 2020-03-18 2020-03-18 User identification generation method, user identification method and device

Publications (2)

Publication Number Publication Date
CN111400575A CN111400575A (en) 2020-07-10
CN111400575B true CN111400575B (en) 2023-06-23

Family

ID=71428919

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010190953.4A Active CN111400575B (en) 2020-03-18 2020-03-18 User identification generation method, user identification method and device

Country Status (1)

Country Link
CN (1) CN111400575B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112697113A (en) * 2020-12-10 2021-04-23 四川长虹电器股份有限公司 Method for displaying disaster data change situation of mass sensors
CN114244826B (en) * 2022-01-18 2023-11-28 杭州盈高科技有限公司 Webpage identification information sharing method and device, storage medium and processor

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103944722A (en) * 2014-04-17 2014-07-23 华北科技学院 Identification method for user trusted behaviors under internet environment
US8914496B1 (en) * 2011-09-12 2014-12-16 Amazon Technologies, Inc. Tracking user behavior relative to a network page
CN105760516A (en) * 2016-02-25 2016-07-13 广州视源电子科技股份有限公司 Method and device for distinguishing users
CN110188275A (en) * 2019-05-30 2019-08-30 广州虎牙信息科技有限公司 A kind of browsing monitoring method, device, equipment and the storage medium of web page element

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101446979A (en) * 2008-12-26 2009-06-03 北京科尔威视网络科技有限公司 Method for dynamic hotspot tracking
CN101833619A (en) * 2010-04-29 2010-09-15 西安交通大学 Method for judging identity based on keyboard-mouse crossed certification
US20130246383A1 (en) * 2012-03-18 2013-09-19 Microsoft Corporation Cursor Activity Evaluation For Search Result Enhancement

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
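Identity authentication method based on user mouse behavior; 陈功; 朱佳俊; 施勇; 薛质; Journal of Changzhou University (Natural Science Edition) (02); full text *
User identity authentication and monitoring based on a mouse dynamics model; 房超; 蔡忠闽; 沈超; 牛非; 管晓宏; Journal of Xi'an Jiaotong University (10); full text *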

Also Published As

Publication number Publication date
CN111400575A (en) 2020-07-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40026383

Country of ref document: HK

GR01 Patent grant