CN110909245A - Multi-label webpage searching method, browser, server and storage medium - Google Patents

Multi-label webpage searching method, browser, server and storage medium

Info

Publication number
CN110909245A
Authority
CN
China
Prior art keywords
webpage
voice information
keywords
web page
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911205658.5A
Other languages
Chinese (zh)
Inventor
陈顺利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Hanzi Technology Co Ltd
Original Assignee
Beijing Hanzi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Hanzi Technology Co Ltd filed Critical Beijing Hanzi Technology Co Ltd
Priority to CN201911205658.5A priority Critical patent/CN110909245A/en
Publication of CN110909245A publication Critical patent/CN110909245A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9562 Bookmark management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The embodiment of the invention discloses a multi-label webpage searching method, a browser, a server and a storage medium. The method comprises the following steps: acquiring voice information of a user; matching the voice information against the content information of each webpage pointed to by a plurality of labels (browser tabs), and confirming the webpage with the maximum similarity to the voice information; and displaying the webpage with the maximum similarity. Because the voice information is acquired and matched against the content information of each webpage, a user who has opened a large number of webpage labels can conveniently and quickly find the desired content, which improves searching efficiency.

Description

Multi-label webpage searching method, browser, server and storage medium
Technical Field
The embodiment of the invention relates to a web browser technology, in particular to a multi-label web searching method, a browser, a server and a storage medium.
Background
Each time a user searches for information with a web browser, a large number of labels (tabs) are opened, and the user has to switch between them laboriously to find the desired content. It is therefore difficult for the user to locate the label that holds the desired content, and the searching efficiency is low.
Disclosure of Invention
The embodiment of the invention provides a multi-label webpage searching method, a browser, a server and a storage medium, so that contents which a user wants to search can be quickly found when a large number of webpage labels are opened, and the searching efficiency is improved.
In a first aspect, an embodiment of the present invention provides a multi-tag webpage searching method, including:
acquiring voice information of a user;
matching the voice information with the content information of each webpage pointed by the plurality of labels respectively, and confirming the webpage with the maximum similarity with the voice information;
and displaying the webpage with the maximum similarity.
Optionally, matching the voice information with the content information of each webpage pointed to by the plurality of labels, and confirming the webpage with the maximum similarity to the voice information, includes:
calculating the similarity between the voice information and the content information of each webpage pointed to by the plurality of labels;
and confirming the webpage with the maximum similarity to the voice information according to the similarities.
Optionally, calculating the similarity between the voice information and the content information of each webpage pointed to by the plurality of labels includes:
converting the voice information into a sentence vector [formula image BDA0002296868800000021];
converting the content information of each webpage into a vector [formula image BDA0002296868800000022];
multiplying the sentence vector [formula image BDA0002296868800000023] by the content-information vector of each webpage [formula image BDA0002296868800000024] to obtain the similarity;
and confirming the webpage with the maximum similarity to the voice information according to the similarities includes:
taking the maximum value among the similarities [formula image BDA0002296868800000025] to confirm the webpage with the maximum similarity to the voice information: [formula image BDA0002296868800000026].
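The formula images for this computation are not reproduced in this text. As a hedged reading of the prose only, the multiplication and maximum steps could be written as follows. The notation is assumed here and is not taken from the original formulas: q is the sentence vector, w_i is the content vector of webpage i, and s_i is the resulting similarity. Treating "multiplying" as an inner product is likewise an assumption.

```latex
s_i = \vec{q} \cdot \vec{w}_i = \sum_{k} q_k \, w_{i,k}, \qquad i = 1, \dots, n

i^{*} = \arg\max_{1 \le i \le n} s_i
```

Under this reading, webpage i* is the one displayed to the user.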
Optionally, the method further includes:
after displaying the webpage with the maximum similarity, acquiring description keywords input by a user in a plurality of searching processes and/or description keywords of reading webpage marks corresponding to the plurality of searching processes;
inputting the description keywords into a training model trained in advance, and outputting result keywords;
and displaying the result keywords on the webpage with the maximum similarity or a preset area of the current page.
Optionally, before the description keywords input by the user in the multiple search processes and/or the description keywords of the reading webpage marks corresponding to the multiple search processes are obtained, the method further includes training the training model, which includes:
collecting a large number of description keywords and result keywords of a specific field;
marking the description keywords by using the result keywords to generate a training sample set;
and inputting each description keyword of the training sample set into a training model for training.
Optionally, after the training model has been trained, the method further includes detecting (testing) the training model, which includes:
collecting a large number of description keywords and result keywords of a specific field;
marking the description keywords by using the result keywords to generate a detection sample set;
inputting each description keyword of the detection sample set into a training model for detection so as to output a detection result;
and confirming whether the training model needs to be trained continuously or not according to the matching degree of the detection result and the result keyword.
In a second aspect, an embodiment of the present invention further provides a browser, including:
the acquisition unit is used for acquiring voice information of a user;
the matching unit is used for matching the voice information with the content information of each webpage pointed by the labels respectively and confirming the webpage with the maximum similarity with the voice information;
and the display unit is used for displaying the webpage with the maximum similarity.
In a third aspect, an embodiment of the present invention further provides a server, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the multi-tag web page searching method described in any of the foregoing embodiments when executing the computer program.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the multi-tag web page searching method in any one of the foregoing embodiments.
According to the technical scheme of the embodiment of the invention, the voice information is acquired and matched with the content information of each webpage, so that a user can conveniently and quickly find the content which the user wants to find when opening a large number of webpage labels, and the searching efficiency is improved.
Drawings
Fig. 1 is a schematic flowchart of a multi-tag web page searching method according to a first embodiment of the present invention;
Fig. 2 is a schematic flowchart of a multi-tag web page searching method according to a second embodiment of the present invention;
Fig. 3 is a schematic diagram of the answer-generating model in the second embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a browser in a third embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a server in the fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Furthermore, the terms "first," "second," and the like may be used herein to describe various orientations, actions, steps, elements, or the like, but the orientations, actions, steps, or elements are not limited by these terms. These terms are only used to distinguish one direction, action, step or element from another direction, action, step or element. For example, a first tag may be termed a second tag, and, similarly, a second tag may be termed a first tag, without departing from the scope of the present application. The first label and the second label are both labels, but they are not the same label. The terms "first", "second", etc. are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Example one
In the technical scheme of the first embodiment of the invention, a voice vector is used to search among the open webpages instead of switching through the labels manually, which addresses the problem that the content of the webpages cannot be seen when many labels are open and provides an intelligent, question-driven webpage experience. Fig. 1 is a schematic flowchart of a multi-tag webpage searching method according to an embodiment of the present invention, which is applicable to a webpage searching situation. The method of the embodiment of the invention can be executed by a multi-label web page searching device, which can be realized by software and/or hardware, and can be generally integrated in a browser, a server or terminal equipment. Referring to fig. 1, a method for searching a multi-tag webpage according to an embodiment of the present invention specifically includes the following steps:
Step S110, acquiring voice information of the user.
Specifically, the voice information of the user can be acquired through a mobile phone APP or the microphone of a computer, and voice recognition is then performed. For example, while browsing, a user may have opened a large number of webpages, such as pages about "qin", "chess", "book" and "drawing". If the user suddenly wants the "qin" webpage but does not know where its label is, checking the labels one by one is too troublesome, so the user can instead say "qin", or voice information related to "qin", into the computer's microphone.
Step S120, matching the voice information with the content information of each webpage pointed to by the plurality of labels, and confirming the webpage with the maximum similarity to the voice information.
After the voice information of the user is acquired, it is converted into character information in a specific format, symbol information of a specific type, or the like. Similarly, the content information of each webpage pointed to by the plurality of labels is converted into character information in the specific format, symbol information of the specific type, or the like. The two are then matched to confirm the webpage with the maximum similarity to the voice information. For example, after the user says "qin" into the microphone, the voice information is recognized and converted into character information, which is matched against the content information of each webpage pointed to by the plurality of labels to find the webpage with the maximum similarity to "qin".
The matching mode of the present invention is described below, specifically as follows:
Step S1201, calculating the similarity between the voice information and the content information of each webpage pointed to by the plurality of labels.
Step A, converting the voice information into a sentence vector [formula image BDA0002296868800000061].
Specifically, after the microphone acquires the voice information, the voice signal is converted into an electric signal, and the electric signal is then converted into the sentence vector [formula image BDA0002296868800000062].
Step B, converting the content information of each webpage into a vector [formula image BDA0002296868800000063].
Specifically, the content information of each web page is extracted; it may include text information, picture information, video information, and the like, or only text information. The content information is converted into the vectors [formula image BDA0002296868800000064], corresponding to web page 1, web page 2, …, and web page n, respectively.
Step C, multiplying the sentence vector [formula image BDA0002296868800000065] by the content-information vector of each web page [formula image BDA0002296868800000066] to obtain the similarity.
Specifically, after the sentence vector [formula image BDA0002296868800000067] and the vectors [formula image BDA0002296868800000068] are obtained, the sentence vector [formula image BDA0002296868800000069] is multiplied by each vector [formula image BDA00022968688000000610] to obtain the similarities [formula image BDA00022968688000000611], corresponding to web page 1, web page 2, …, and web page n, respectively.
Step S1202, confirming the webpage with the maximum similarity to the voice information according to the similarities.
Specifically, the maximum value is taken among the similarities [formula image BDA0002296868800000071] to confirm the webpage with the maximum similarity to the voice information: [formula image BDA0002296868800000072]. The web page corresponding to [formula image BDA0002296868800000073] is the web page with the maximum similarity.
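Steps A to C and step S1202 can be sketched as follows. This is a minimal illustration rather than the method of the embodiment: the text does not say how the vectors are built, so a bag-of-words count vector over a shared vocabulary is assumed, "multiplying" is read as an inner product, and the names `to_vector`, `dot` and `most_similar_page` are hypothetical. The query is taken to be the already-recognized text of the voice information.

```python
from collections import Counter


def to_vector(text, vocabulary):
    """Bag-of-words count vector over a shared vocabulary (an assumed encoding)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def dot(u, v):
    """Inner product, used here as the 'multiplication' that yields a similarity."""
    return sum(a * b for a, b in zip(u, v))


def most_similar_page(query_text, page_texts):
    """Return (best_index, similarities) for a non-empty list of page texts."""
    # A shared vocabulary gives every vector the same dimensions.
    words = {w for text in [query_text, *page_texts] for w in text.lower().split()}
    vocabulary = sorted(words)
    query_vec = to_vector(query_text, vocabulary)                # Step A
    page_vecs = [to_vector(t, vocabulary) for t in page_texts]   # Step B
    similarities = [dot(query_vec, pv) for pv in page_vecs]      # Step C
    # Step S1202: index of the maximum similarity (first one wins on ties).
    best_index = max(range(len(similarities)), key=similarities.__getitem__)
    return best_index, similarities


pages = [
    "qin is a plucked string instrument",
    "chess opening theory and endgames",
    "book reviews and reading lists",
    "drawing with ink and brush",
]
best_index, sims = most_similar_page("qin instrument", pages)
```

With these sample pages, only the first page shares words with the query, so it is selected. A practical system would more likely use normalized or learned embeddings, since raw counts favor long pages.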
Step S130, displaying the webpage with the maximum similarity.
Specifically, once the web page with the maximum similarity is found, it can be popped up directly, or it can be prompted in a different color so that the user decides whether to display it.
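The two display options in step S130 can be sketched as a small state update. The `Tab` class, its fields and the `auto_switch` flag are hypothetical names introduced for illustration; no real browser API is implied.

```python
from dataclasses import dataclass


@dataclass
class Tab:
    title: str
    active: bool = False
    highlighted: bool = False


def show_best_tab(tabs, best_index, auto_switch):
    """Either bring the best tab to the front or only highlight it."""
    for tab in tabs:
        tab.highlighted = False
    if auto_switch:
        # Direct pop-up: the best match becomes the only active tab.
        for i, tab in enumerate(tabs):
            tab.active = (i == best_index)
    else:
        # Prompt only: mark the tab and leave the decision to the user.
        tabs[best_index].highlighted = True
    return tabs


tabs = [Tab("chess", active=True), Tab("qin"), Tab("book")]
show_best_tab(tabs, best_index=1, auto_switch=False)
```

In the highlight-only case the currently active tab is left untouched, which matches the description that the user decides whether the page is displayed.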
According to the technical scheme of the embodiment of the invention, the voice information is acquired and matched with the content information of each webpage, so that a user can conveniently and quickly find the content which the user wants to find when opening a large number of webpage labels, and the searching efficiency is improved.
Example two
Exploratory information searching requires a large amount of domain-specific terminology; a user who cannot express a question in the terms of the specific domain has difficulty finding useful information. Fig. 2 is a schematic flowchart of a multi-tag webpage searching method according to a second embodiment of the present invention. The method of the embodiment of the invention can be executed by a multi-label web page searching device, which can be realized by software and/or hardware, and can be generally integrated in a server or a terminal device. Referring to fig. 2, a method for searching a multi-tag webpage according to an embodiment of the present invention specifically includes the following steps:
Step S210, acquiring the voice information of the user.
Step S220, matching the voice information with the content information of each webpage pointed to by the plurality of labels, and determining the webpage with the maximum similarity to the voice information.
Step S230, displaying the webpage with the maximum similarity.
Step S240, after displaying the web page with the maximum similarity, obtaining the description keyword input by the user in the multiple search processes and/or the description keyword of the reading web page mark corresponding to the multiple search processes.
Specifically, a description keyword arises when, during exploratory searching, the user wants to enter a term from some professional field but does not know what the term is, and instead tries entering keywords that describe it. For example, the user wants to find out what a "blockchain" is but does not know the word, and inputs information such as "a decentralized distributed account book database" and "the concatenated text records which are cryptographically concatenated and protect the content"; these inputs are the description keywords. In the embodiment of the present invention, [formula image BDA0002296868800000081] is a vector representing the plurality of domain-specific description keywords that the user attempts to input, and [formula image BDA0002296868800000082] is a vector representing the answer keyword intended by the user. It can be understood that the time for acquiring the description keywords input by the user in the multiple search processes is not limited; they may be acquired after displaying the web page with the maximum similarity in the first implementation, or at other times.
Step S250, inputting the description keywords into a training model trained in advance, and outputting result keywords.
Specifically, the description keywords are input into a training model trained in advance, and result keywords are output. Based on the model, when a user inputs a plurality of descriptors specific to a professional field, the answer vector [formula image BDA0002296868800000083] of the user is automatically and intelligently matched and found; [formula image BDA0002296868800000084] is derived from the annotation data of the descriptors the user used to describe the search question over multiple searches, and [formula image BDA0002296868800000085] is derived from the user's labeling of the results. This can be expressed by the following formula: [formula image BDA0002296868800000086], wherein [formula image BDA0002296868800000087] represents the description keywords input by the user, and [formula image BDA0002296868800000088] represents the result keywords to be output.
Alternatively, this can be represented schematically as shown in FIG. 3, where d_1 represents "a decentralized distributed ledger database", d_2 represents another description keyword, and r represents the answer to "what is a block chain". The model is a recommendation model that automatically generates answers from multiple description inputs, and it adopts the description marks and answer mark data used by multiple persons as its training set.
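As a rough illustration of mapping several description keywords d to one result keyword r, the sketch below scores each candidate result by word overlap with labeled descriptions. This overlap heuristic is a stand-in chosen for brevity, not the trained recommendation model of the embodiment; the function names and the sample data are hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Lower-cased set of whitespace-separated words."""
    return set(text.lower().split())


def recommend(descriptions, labeled_samples):
    """Return the result keyword whose labeled descriptions overlap most with the inputs.

    labeled_samples is a list of (description keyword, result keyword) pairs.
    """
    query_tokens = set()
    for description in descriptions:
        query_tokens |= tokenize(description)
    scores = defaultdict(int)
    for description, result in labeled_samples:
        scores[result] += len(query_tokens & tokenize(description))
    if not scores:
        return None
    return max(scores, key=scores.get)


samples = [
    ("decentralized distributed ledger database", "blockchain"),
    ("records linked and protected by cryptography", "blockchain"),
    ("layered neural network trained on data", "deep learning"),
]
answer = recommend(
    [
        "a decentralized distributed account book database",
        "text records which are cryptographically linked",
    ],
    samples,
)
```

Both user descriptions share words with the samples labeled "blockchain" and none with the other sample, so that label is returned.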
Step S260, displaying the result keywords on the webpage with the maximum similarity or in a preset area of the current page.
After the result keywords are generated, they can be displayed on the webpage with the maximum similarity or in a preset area of the current page. The preset area can be a user-defined area or a system default area.
Generally, before the training model is used to output result keywords, it needs to be trained; training adjusts the calculation parameters of the model so that result keywords are output more accurately in use. While users use the model, the input description keywords and the output result keywords are recorded in the background for use in training the model. The model is trained with the domain-specific description keyword marks d and the result keyword marks r, so as to help the user quickly predict the answer keyword of a search. Training the training model comprises:
Step A, collecting a large number of description keywords and result keywords in a specific field.
Specifically, these can be collected from the user's input during use; each term searched by the user is recorded into the training sample set.
Step B, marking the description keywords by using the result keywords to generate a training sample set.
Step C, inputting each description keyword of the training sample set into the training model for training.
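Steps A to C can be sketched with a deliberately simple model. The text does not specify the model type, so a word-to-result co-occurrence count is assumed here purely for illustration; `build_sample_set`, `train` and `predict` are hypothetical names, and the sample data is invented.

```python
from collections import Counter, defaultdict


def build_sample_set(description_keywords, result_keywords):
    """Step B: mark each description keyword with its result keyword."""
    return list(zip(description_keywords, result_keywords))


def train(sample_set):
    """Step C: count how often each word co-occurs with each result keyword."""
    model = defaultdict(Counter)
    for description, result in sample_set:
        for word in description.lower().split():
            model[word][result] += 1
    return model


def predict(model, description):
    """Vote over the words of a description; return None if no word is known."""
    votes = Counter()
    for word in description.lower().split():
        votes.update(model.get(word, {}))
    return votes.most_common(1)[0][0] if votes else None


# Step A: collected domain data (hypothetical examples).
descriptions = [
    "decentralized distributed ledger",
    "chained encrypted records",
    "layered neural network",
]
results = ["blockchain", "blockchain", "deep learning"]

model = train(build_sample_set(descriptions, results))
prediction = predict(model, "distributed ledger database")
```

Here "training" only accumulates counts; a real system would fit the parameters of a statistical or neural model on the same (description, result) pairs.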
After training of the training model is completed, the model also needs to be tested. The detection of the training model comprises:
Step a, collecting a large number of description keywords and result keywords of specific fields.
Specifically, the sample data collected for detection differs from the sample data used for training, and the detection sample can be smaller. For example, 70% of the data is used for training and 30% for detection; the split can be adjusted according to actual conditions.
Step b, marking the description keywords by using the result keywords to generate a detection sample set.
Step c, inputting each description keyword of the detection sample set into the training model for detection so as to output a detection result.
Step d, confirming whether the training model needs to be trained further according to the matching degree between the detection result and the result keywords.
Specifically, if the detection result differs little from the result keywords, the training model does not need further training; if the detection result differs greatly from the result keywords, the training model needs to be trained further.
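The 70/30 split and the decision in step d can be sketched as below. The text does not define the matching degree numerically, so exact-match accuracy and a threshold of 0.8 are assumptions; the function names, the stand-in predictor and the sample data are hypothetical.

```python
import random


def split_samples(samples, train_fraction=0.7, seed=0):
    """Shuffle and split labeled samples into disjoint training and detection sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def matching_degree(predict, detection_set):
    """Fraction of detection samples whose predicted keyword equals the labeled one."""
    if not detection_set:
        return 0.0
    hits = sum(1 for description, result in detection_set if predict(description) == result)
    return hits / len(detection_set)


def needs_more_training(predict, detection_set, threshold=0.8):
    """Step d: keep training while the matching degree is below the assumed threshold."""
    return matching_degree(predict, detection_set) < threshold


samples = [
    (f"description {i}", "blockchain" if i % 2 else "deep learning")
    for i in range(10)
]
train_set, detection_set = split_samples(samples)


def always_blockchain(description):
    """Stand-in predictor; a real model would be trained on train_set."""
    return "blockchain"


degree = matching_degree(always_blockchain, detection_set)
```

Keeping the detection samples disjoint from the training samples is what makes the matching degree a meaningful check rather than a measure of memorization.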
According to the technical scheme of the embodiment of the invention, the user can be helped to quickly predict the search answer through the labeling data of the multiple descriptors and the result keywords.
Example three
The multi-tag webpage searching device provided by the embodiment of the invention can execute the webpage searching method provided by any embodiment of the invention, has corresponding functional modules and beneficial effects of the execution method, can be realized in a software and/or hardware (integrated circuit) mode, and can be generally integrated in a browser, a server or terminal equipment. Fig. 4 is a schematic structural diagram of a multi-tag web page searching device or browser according to a third embodiment of the present invention. Referring to fig. 4, a multi-tag web page searching apparatus or browser according to an embodiment of the present invention may specifically include:
an obtaining unit 410, configured to obtain voice information of a user;
a matching unit 420, configured to match the voice information with the content information of each webpage pointed to by the plurality of labels, and determine the webpage with the largest similarity to the voice information;
the display unit 430 is configured to display the web page with the largest similarity.
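The three units can be sketched as cooperating classes. This is a structural illustration only: the class and method names are hypothetical, the recognizer is a stand-in callback that returns already-recognized text, and word overlap is used in place of the vector multiplication to keep the example short.

```python
from typing import Callable, List, Sequence


class ObtainingUnit:
    """Unit 410: obtains the user's voice information (here, recognized text)."""

    def __init__(self, recognizer: Callable[[], str]):
        self._recognizer = recognizer  # hypothetical speech-to-text callback

    def obtain(self) -> str:
        return self._recognizer()


class MatchingUnit:
    """Unit 420: scores each page against the query and picks the best one."""

    def match(self, query: str, pages: Sequence[str]) -> int:
        query_words = set(query.lower().split())
        scores = [len(query_words & set(page.lower().split())) for page in pages]
        return max(range(len(scores)), key=scores.__getitem__)


class DisplayUnit:
    """Unit 430: records which page index should be shown."""

    def __init__(self) -> None:
        self.shown: List[int] = []

    def display(self, index: int) -> None:
        self.shown.append(index)


class MultiTabFinder:
    """Wires the three units together in the order obtain, match, display."""

    def __init__(self, obtaining: ObtainingUnit, matching: MatchingUnit, display: DisplayUnit):
        self.obtaining = obtaining
        self.matching = matching
        self.display = display

    def run(self, pages: Sequence[str]) -> int:
        best = self.matching.match(self.obtaining.obtain(), pages)
        self.display.display(best)
        return best


finder = MultiTabFinder(ObtainingUnit(lambda: "qin"), MatchingUnit(), DisplayUnit())
best = finder.run(["chess openings", "qin music scores", "book list"])
```

Passing the units in through the constructor keeps each one replaceable, for example swapping the stand-in recognizer for a real speech recognizer.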
Optionally, the matching unit 420 is further configured to: calculate the similarity between the voice information and the content information of each webpage pointed to by the plurality of labels; and confirm the webpage with the maximum similarity to the voice information according to the similarities.
Optionally, calculating the similarity between the voice information and the content information of each webpage pointed to by the plurality of labels includes:
converting the voice information into a sentence vector [formula image BDA0002296868800000111];
converting the content information of each webpage into a vector [formula image BDA0002296868800000112];
multiplying the sentence vector [formula image BDA0002296868800000113] by the content-information vector of each webpage [formula image BDA0002296868800000114] to obtain the similarity;
and confirming the webpage with the maximum similarity to the voice information according to the similarities includes:
taking the maximum value among the similarities [formula image BDA0002296868800000115] to confirm the webpage with the maximum similarity to the voice information: [formula image BDA0002296868800000116].
Optionally, the apparatus further comprises:
the description unit is used for acquiring the description keywords input by the user in the multiple searching processes and/or the description keywords of the reading webpage marks corresponding to the multiple searching processes after the webpage with the maximum similarity is displayed;
the result unit is used for inputting the description keywords into a training model trained in advance and outputting result keywords;
and the display unit is used for displaying the result keywords on the webpage with the maximum similarity or in a preset area of the current page.
Optionally, before the description keywords input by the user in the multiple search processes and/or the description keywords of the reading webpage marks corresponding to the multiple search processes are obtained, the training model is trained, which includes: collecting a large number of description keywords and result keywords of a specific field; marking the description keywords by using the result keywords to generate a training sample set; and inputting each description keyword of the training sample set into the training model for training.
Optionally, after the training model has been trained, the training model is detected (tested), which includes: collecting a large number of description keywords and result keywords of a specific field; marking the description keywords by using the result keywords to generate a detection sample set; inputting each description keyword of the detection sample set into the training model for detection so as to output a detection result; and confirming whether the training model needs to be trained further according to the matching degree between the detection result and the result keywords.
According to the technical scheme of the embodiment of the invention, the voice information is acquired and matched with the content information of each webpage, so that a user can conveniently and quickly find the content which the user wants to find when opening a large number of webpage labels, and the searching efficiency is improved.
Example four
Fig. 5 is a schematic structural diagram of a server according to a fourth embodiment of the present invention, as shown in fig. 5, the server includes a processor 510, a memory 520, an input device 530, and an output device 540; the number of the processors 510 in the server may be one or more, and one processor 510 is taken as an example in fig. 5; the processor 510, the memory 520, the input device 530 and the output device 540 in the server may be connected by a bus or other means, and the bus connection is exemplified in fig. 5.
The memory 520 may be used as a computer-readable storage medium for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the multi-tag web page searching method in the embodiment of the present invention (for example, the obtaining unit 410, the matching unit 420, and the display unit 430 in the multi-tag web page searching apparatus). The processor 510 executes various functional applications of the server and data processing by executing software programs, instructions and modules stored in the memory 520, that is, implements the multi-tag web page searching method described above.
Namely:
acquiring voice information of a user;
matching the voice information with the content information of each webpage pointed by the plurality of labels respectively, and confirming the webpage with the maximum similarity with the voice information;
and displaying the webpage with the maximum similarity.
Of course, the processor of the server provided in the embodiment of the present invention is not limited to execute the method operations described above, and may also execute related operations in the multi-tag web page searching method provided in any embodiment of the present invention.
The memory 520 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created according to the use of the terminal, and the like. Further, the memory 520 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 520 may further include memory located remotely from the processor 510, which may be connected to the server over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 530 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the server. The output device 540 may include a display device such as a display screen.
According to the technical solution of this embodiment of the invention, voice information is acquired and matched with the content information of each webpage, so that a user who has a large number of webpage labels open can quickly find the desired content, which improves search efficiency.
Example five
An embodiment of the present invention further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a multi-tag web page lookup method, where the method includes:
acquiring voice information of a user;
matching the voice information with the content information of each webpage to which the plurality of labels respectively point, and confirming the webpage with the maximum similarity to the voice information;
and displaying the webpage with the maximum similarity.
Of course, the computer-executable instructions contained in the storage medium provided by the embodiment of the present invention are not limited to the method operations described above, and may also perform related operations in the multi-tag web page searching method provided by any embodiment of the present invention.
The computer-readable storage medium of embodiments of the invention may employ any combination of one or more computer-readable media. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or terminal. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
According to the technical solution of this embodiment of the invention, voice information is acquired and matched with the content information of each webpage, so that a user who has a large number of webpage labels open can quickly find the desired content, which improves search efficiency.
It is to be noted that the foregoing describes only the preferred embodiments of the present invention and the technical principles employed. Those skilled in the art will understand that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements, and substitutions can be made by those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to those embodiments and may include other equivalent embodiments without departing from the spirit of the present invention; the scope of the present invention is determined by the scope of the appended claims.

Claims (9)

1. A multi-label web page searching method is characterized by comprising the following steps:
acquiring voice information of a user;
matching the voice information with the content information of each webpage to which the plurality of labels respectively point, and confirming the webpage with the maximum similarity to the voice information;
and displaying the webpage with the maximum similarity.
2. The method for searching for a multi-tag webpage according to claim 1, wherein the matching of the voice information with the content information of each webpage to which the plurality of labels respectively point, to confirm the webpage with the maximum similarity to the voice information, comprises:
calculating the similarity between the voice information and the content information of each webpage to which the plurality of labels respectively point;
and confirming the webpage with the maximum similarity with the voice information according to the similarity.
3. The method for searching for a multi-tag web page according to claim 2, wherein the calculating the similarity between the voice information and the content information of each web page to which the plurality of tags point respectively comprises:
converting the voice information into a sentence vector (formula image: Figure FDA0002296868790000011);
respectively converting the content information of each webpage into a vector (formula image: Figure FDA0002296868790000012);
multiplying the sentence vector (formula image: Figure FDA0002296868790000013) by the vector of the content information of each webpage (formula image: Figure FDA0002296868790000014) to obtain the similarity;
the confirming the webpage with the maximum similarity with the voice information according to the similarity comprises the following steps:
taking the maximum value among the similarities (formula image: Figure FDA0002296868790000015);
and confirming the webpage with the maximum similarity with the voice information (formula image: Figure FDA0002296868790000016).
4. The multi-tag web page lookup method of claim 1, further comprising:
after displaying the webpage with the maximum similarity, acquiring description keywords input by a user in a plurality of searching processes and/or description keywords of reading webpage marks corresponding to the plurality of searching processes;
inputting the description keywords into a training model trained in advance, and outputting result keywords;
and displaying the result keywords on the webpage with the maximum similarity or a preset area of the current page.
5. The method for finding the multi-label web page according to claim 4, further comprising training the training model before obtaining the description keywords input by the user in the multiple search processes and/or the description keywords of the reading web page tags corresponding to the multiple search processes, wherein the training of the training model comprises:
collecting a large number of description keywords and result keywords of a specific field;
marking the description keywords by using the result keywords to generate a training sample set;
and inputting each description keyword of the training sample set into a training model for training.
6. The method for searching for a multi-label web page according to claim 5, wherein after the training of the training model, the method further comprises detecting the training model, and the detecting the training model comprises:
collecting a large number of description keywords and result keywords of a specific field;
marking the description keywords by using the result keywords to generate a detection sample set;
inputting each description keyword of the detection sample set into a training model for detection so as to output a detection result;
and confirming whether the training model needs to be trained continuously or not according to the matching degree of the detection result and the result keyword.
7. A browser, comprising:
the acquisition unit is used for acquiring voice information of a user;
the matching unit is used for matching the voice information with the content information of each webpage pointed by the labels respectively and confirming the webpage with the maximum similarity with the voice information;
and the display unit is used for displaying the webpage with the maximum similarity.
8. A server comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the multi-tag web page lookup method according to any one of claims 1-6 when executing the computer program.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out a multi-tag web page lookup method according to any one of claims 1-6.
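The training and detection steps recited in claims 5 and 6 can be illustrated with a toy sketch. It is not the training model of this application: the model below is a hypothetical lookup that memorizes the result keyword most often used to mark each description keyword, the sample data is invented purely for illustration, and the threshold `0.8` used to decide whether training should continue is an assumption.

```python
from collections import Counter, defaultdict
from typing import Optional


class KeywordModel:
    """Toy stand-in for the training model (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._counts = defaultdict(Counter)

    def train(self, samples) -> None:
        # Each sample is (description keyword, result keyword that marks it).
        for description, result in samples:
            self._counts[description][result] += 1

    def predict(self, description: str) -> Optional[str]:
        counts = self._counts.get(description)
        if not counts:
            return None  # description keyword never seen during training
        return counts.most_common(1)[0][0]


def matching_degree(model: KeywordModel, detection_set) -> float:
    """Fraction of detection samples whose output equals the marked result keyword."""
    if not detection_set:
        return 0.0
    hits = sum(model.predict(d) == r for d, r in detection_set)
    return hits / len(detection_set)


# Invented sample sets: description keywords marked with result keywords.
training_set = [
    ("fever", "influenza"),
    ("fever", "influenza"),
    ("fever", "cold"),
    ("cough", "bronchitis"),
]
detection_set = [
    ("fever", "influenza"),
    ("cough", "bronchitis"),
    ("rash", "allergy"),
]

model = KeywordModel()
model.train(training_set)
degree = matching_degree(model, detection_set)
# Assumed rule: continue training while the matching degree is below 0.8.
needs_more_training = degree < 0.8
print(degree, needs_more_training)
```

A real system would replace `KeywordModel` with whatever trainable model is chosen and would pick the continuation threshold empirically.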
CN201911205658.5A 2019-11-29 2019-11-29 Multi-label webpage searching method, browser, server and storage medium Pending CN110909245A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911205658.5A CN110909245A (en) 2019-11-29 2019-11-29 Multi-label webpage searching method, browser, server and storage medium

Publications (1)

Publication Number Publication Date
CN110909245A true CN110909245A (en) 2020-03-24

Family

ID=69820877

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911205658.5A Pending CN110909245A (en) 2019-11-29 2019-11-29 Multi-label webpage searching method, browser, server and storage medium

Country Status (1)

Country Link
CN (1) CN110909245A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101206673A (en) * 2007-12-25 2008-06-25 北京科文书业信息技术有限公司 Intelligent error correcting system and method in network searching process
CN104484387A (en) * 2014-12-10 2015-04-01 北京奇虎科技有限公司 Method for carrying out searching in browser and browser device
CN105975639A (en) * 2016-07-04 2016-09-28 北京百度网讯科技有限公司 Search result ordering method and device
CN107784037A (en) * 2016-08-31 2018-03-09 北京搜狗科技发展有限公司 Information processing method and device, the device for information processing
CN109684445A (en) * 2018-11-13 2019-04-26 中国科学院自动化研究所 Colloquial style medical treatment answering method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination