CN115237412A - Corpus index display method and device, equipment, medium and product thereof

Info

Publication number: CN115237412A
Application number: CN202210917023.3A
Authority: CN
Inventors: 邓明宇
Original assignee: Guangzhou Huanju Shidai Information Technology Co Ltd
Current assignee: Guangzhou Huanju Shidai Information Technology Co Ltd
Priority date: 2022-08-01
Filing date: 2022-08-01
Publication date: 2022-10-25

Abstract

The application relates to a corpus index display method and a corresponding device, equipment, medium and product. The method comprises the following steps: loading a corpus, wherein the corpus comprises mapping relation data between index values and corpora; in response to a statement selection instruction, acquiring the corpora in each statement selected by a user from a script file to form a corpus set; and in response to a single corpus replacement instruction, determining the index value of the corpus in each statement of the corpus set, and replacing each corpus in the script file with the corresponding index value. The corpora in the script file can thus be replaced with their corresponding index values in a single operation (one-key replacement), which makes it convenient to convert the corpora among different language versions and can improve the implementation efficiency of software engineering.

Description

Corpus index display method and device, equipment, medium and product thereof

Technical Field

The present application relates to the field of software engineering technologies, and in particular, to a corpus index display method, and a corresponding apparatus, computer device, computer-readable storage medium, and computer program product.

Background

When front-end software is developed in software engineering, an internationalized project must be delivered in multiple language versions, so the files in the project must support multi-language conversion. That is, corpora in different languages must be provided for the files used by the project, so that developers accustomed to different languages can conveniently view the corpora in their own language and software versions for the different languages can be conveniently produced.

In practice, to associate corpora of different languages within the code of a project's script files, each corpus is usually converted into a corresponding index value that is embedded at the corresponding position in the code. For the computer this is very convenient, because the corresponding corpus can be retrieved by its index value to complete encoding or parsing. For programmers, however, index values are unfriendly to read: it is difficult to memorize a large number of index values, so each time a programmer encounters an index value in a script file, the text content it stands for can only be found through a cumbersome query and retrieval process.

In view of this, corpus processing for multi-language software projects needs to be carefully designed for the specific development scenario in order to improve the efficiency of software engineering implementation.

Disclosure of Invention

The present application aims to solve the above-mentioned problems and provides a corpus index display method and a corresponding apparatus, computer device, computer-readable storage medium, and computer program product.

The technical solutions adopted to serve the various purposes of the present application are as follows:

In one aspect, a corpus index display method serving one of the purposes of the present application is provided, including:

loading a corpus, wherein the corpus comprises mapping relation data between index values and corpora;

in response to a statement selection instruction, acquiring the corpora in each statement selected by a user from a script file to form a corpus set;

and in response to a single corpus replacement instruction, determining the index value of the corpus in each statement of the corpus set, and replacing each corpus in the script file with the corresponding index value.

Optionally, after loading the corpus, the method further includes:

acquiring a target word, wherein the target word is an index value or a corpus;

retrieving, from the corpus, the corpora and/or index values corresponding to the target word, forming a result list, and displaying the result list on a graphical user interface;

and in response to an operation event in which a user selects any corpus or index value in the result list, replacing the target word with the selected corpus and/or index value.

Optionally, retrieving the corpora and/or index values corresponding to the target word from the corpus, forming a result list, and displaying the result list on a graphical user interface includes:

determining whether the target word is an index value or a corpus;

when the target word is an index value, performing rule matching in the corpus, and retrieving the corpus in the mapping relation data whose index value is the same as the target word to form a first result list;

when the target word is a corpus, performing semantic matching in the corpus, and retrieving the mapping relation data whose corpus semantically matches the target word to form a second result list;

and displaying the result list in a floating layer, wherein the index values and/or corpora in each item of mapping relation data are configured as selectable items so that they can be selected by the operation event.

Optionally, when the target word is a corpus, performing semantic matching in the corpus and retrieving the mapping relation data whose corpus semantically matches the target word to form a second result list includes:

when the target word is a corpus, extracting a text vector of the target word with a feature extractor that has been pre-trained to a convergence state;

calculating the text similarity between the text vector of the target word and the text vector of the corpus in each item of mapping relation data in the corpus, wherein the text vectors of the corpora in the corpus are extracted in advance by the same feature extractor;

and selecting the items of mapping relation data with the higher text similarity to form the second result list.
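The semantic matching steps above can be sketched as follows. This is a minimal illustration only: the source does not specify the feature extractor or the similarity measure, so a character-bigram count vector stands in for the pre-trained feature extractor and cosine similarity is assumed as the text similarity. The names `extract_vector`, `cosine`, `second_result_list` and the sample index values are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

def extract_vector(text: str) -> Counter:
    """Stand-in for the pre-trained feature extractor (assumption):
    counts of character bigrams in the lower-cased text."""
    s = text.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def second_result_list(target: str,
                       corpus: Dict[str, str],
                       vectors: Dict[str, Counter],
                       top_k: int = 2) -> List[Tuple[str, str]]:
    """Return the top_k (index value, corpus) pairs most similar to target."""
    query = extract_vector(target)
    scored = sorted(((cosine(query, vectors[idx]), idx) for idx in corpus),
                    reverse=True)
    # Keep only entries that share at least some features with the target.
    return [(idx, corpus[idx]) for score, idx in scored[:top_k] if score > 0]

# Mapping relation data: index value -> corpus (sample values are made up).
corpus = {"1001": "Save failed", "1002": "Save succeeded", "1003": "Cancel"}
# Vectors of the stored corpora are extracted in advance, as the text describes.
vectors = {idx: extract_vector(text) for idx, text in corpus.items()}

results = second_result_list("saving failed", corpus, vectors)
```

Precomputing `vectors` once when the corpus is loaded means that only the target word has to be vectorized at query time.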

Optionally, acquiring the target word includes:

in response to a user search instruction, acquiring the text string entered by the user in a search box as the target word;

or,

in response to a mouse hovering instruction, determining the corpus pointed to by the mouse as the target word according to the position of the mouse.

Optionally, loading the corpus includes:

in response to a corpus update notice from a server, sending a corpus update request to the server, wherein the request contains the language information corresponding to the corpus;

acquiring the corpus that the server pushes in response to the corpus update request and that corresponds to the language information;

and replacing the historical corpus with the acquired corpus to complete loading of the corpus.
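The update flow above can be sketched as follows. The source does not describe the transport or message format, so the server is modelled here as a plain callable and the request as a dictionary with a single `language` field; `CorpusStore`, `on_update_notice` and `fake_server` are hypothetical names used only for illustration.

```python
from typing import Callable, Dict

class CorpusStore:
    """Holds the locally loaded corpus for one language (illustrative only)."""

    def __init__(self, language: str, corpus: Dict[str, str]) -> None:
        self.language = language
        self.corpus = dict(corpus)  # the historical corpus

    def on_update_notice(self,
                         fetch: Callable[[Dict[str, str]], Dict[str, str]]) -> None:
        """Handle a corpus update notice from the server."""
        # 1. Build the update request carrying the language information.
        request = {"language": self.language}
        # 2. Obtain the corpus the server returns for that language.
        new_corpus = fetch(request)
        # 3. Replace the historical corpus wholesale to complete loading.
        self.corpus = dict(new_corpus)

def fake_server(request: Dict[str, str]) -> Dict[str, str]:
    """Stand-in for the multi-language platform server (assumption)."""
    data = {"en": {"1001": "Confirm", "1002": "Cancel", "1003": "Retry"}}
    return data[request["language"]]

store = CorpusStore("en", {"1001": "OK"})
store.on_update_notice(fake_server)
```

Replacing the whole mapping, rather than merging into it, mirrors the text's wording that the historical corpus is replaced by the newly acquired one.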

In another aspect, a corpus index display device serving one of the purposes of the present application is provided, including: a corpus loading module, used for loading a corpus, wherein the corpus comprises mapping relation data between index values and corpora; a statement selection module, used for responding to a statement selection instruction and acquiring the corpora in each statement selected by a user from the script file to form a corpus set; and a one-key execution module, used for responding to a single corpus replacement instruction, determining the index value of the corpus in each statement of the corpus set, and replacing each corpus in the script file with the corresponding index value.

Optionally, following the corpus loading module, the device includes: a word-taking submodule, used for obtaining a target word, wherein the target word is an index value or a corpus; a retrieval submodule, used for retrieving, from the corpus, the corpora and/or index values corresponding to the target word, forming a result list, and displaying the result list on a graphical user interface; and a single-item replacement submodule, used for responding to an operation event in which a user selects any corpus or index value in the result list, and replacing the target word with the selected corpus and/or index value.

Optionally, the retrieval submodule includes: an object judgment unit, used for determining whether the target word is an index value or a corpus; a first matching unit, used for performing rule matching in the corpus when the target word is an index value, and retrieving the corpus in the mapping relation data whose index value is the same as the target word to form a first result list; a second matching unit, used for performing semantic matching in the corpus when the target word is a corpus, and retrieving the mapping relation data whose corpus semantically matches the target word to form a second result list; and a result display unit, used for displaying the result list in a floating layer, wherein the index values and/or corpora in each item of mapping relation data are configured as selectable items so that they can be selected by the operation event.

Optionally, the second matching unit includes: a vector generation subunit, used for extracting a text vector of the target word with a feature extractor pre-trained to a convergence state when the target word is a corpus; a similarity calculation subunit, used for calculating the text similarity between the text vector of the target word and the text vector of the corpus in each item of mapping relation data in the corpus, wherein the text vector of each corpus in the corpus is extracted in advance by the feature extractor; and a result construction subunit, used for selecting the items of mapping relation data with the higher text similarity to form a second result list.

Optionally, the word-taking submodule includes: a search word-taking submodule, used for responding to a user search instruction and acquiring the text string entered by the user in a search box as the target word; or a hovering word-taking submodule, used for responding to a mouse hovering instruction and determining the corpus pointed to by the mouse as the target word according to the position of the mouse.

Optionally, the corpus loading module includes: a request updating submodule, used for responding to a corpus update notice from the server and sending a corpus update request to the server, wherein the request contains the language information corresponding to the corpus; an update acquisition submodule, used for acquiring the corpus that the server pushes in response to the corpus update request and that corresponds to the language information; and a corpus upgrading submodule, used for replacing the historical corpus with the acquired corpus to complete loading of the corpus.

In another aspect, a computer device serving one of the purposes of the present application is provided, including a central processing unit and a memory, wherein the central processing unit is configured to call and run a computer program stored in the memory to perform the steps of the corpus index display method described in the present application.

In another aspect, a computer-readable storage medium serving another purpose of the present application is provided, which stores, in the form of computer-readable instructions, a computer program implementing the corpus index display method; when called and run by a computer, the computer program executes the steps included in the method.

In another aspect, a computer program product serving another purpose of the present application is provided, including a computer program/instructions which, when executed by a processor, implement the steps of the corpus index display method described in any one of the embodiments of the present application.

The present application has various advantages over the prior art, including but not limited to the following: after the corpus is loaded, statements can be selected from the script file to form a corpus set; a single corpus replacement instruction is then issued, the index value of the corpus in each statement of the corpus set is determined, and each corresponding corpus in the script file is replaced with its index value. One-key replacement of corpora is thereby achieved. The index values make it convenient to produce versions of the script file's corpora in different languages and to convert between language versions, which improves ease of operation and can speed up software development.

Drawings

The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

Fig. 1 is a flowchart illustrating an embodiment of a corpus index display method according to the present application.

Fig. 2 is a schematic flow chart illustrating implementation of human-computer interaction according to a target word display result list in the embodiment of the present application.

Fig. 3 is a schematic flow chart illustrating the determination of the second result list by semantic matching in the embodiment of the present application.

Fig. 4 is a schematic flow chart illustrating a process of implementing development branch merging by performing human-machine interaction according to verification configuration information in the embodiment of the present application.

Fig. 5 is a schematic flowchart of loading a corpus in an embodiment of the present application.

Fig. 6 is a schematic block diagram of a corpus index display device according to the present application.

Fig. 7 is a schematic structural diagram of a computer device used in the present application.

Detailed Description

Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.

Unless expressly stated otherwise, any neural network model referred to, or possibly referred to, in the present application can be deployed on a remote server and invoked remotely from a client, or can be deployed on a client with sufficient device capability and invoked directly.

Unless a mutually exclusive relationship between technical features is expressly stated, the embodiments disclosed herein can be flexibly constructed by cross-combining their technical features, as long as the combination does not depart from the inventive spirit of the present application and can meet a need of, or remedy a deficiency in, the prior art. Those skilled in the art will appreciate such variations.

The corpus index display method can be implemented as a computer program product and deployed to run on a client or a server. Referring to Fig. 1, in one embodiment, the corpus index display method of the present application includes the following steps:

Step S1100, loading a corpus, wherein the corpus comprises mapping relation data between index values and corpora;

In an exemplary application scenario, the software project of the present application may be developed in the VSCode code editor using JavaScript as the programming scripting language, and then run in the Node.js runtime environment built on the Chrome V8 engine.

The corpora to be embedded in the script can be obtained from a corpus. The corpus may be stored on a multi-language platform and downloaded via a network address provided by the platform. Each corpus has a corresponding index value, each index value is unique, and the corpus corresponding to an index value can be uniquely determined from that index value. On the multi-language platform, corpora in different languages can be configured for each index value to express the same text content.

The corpus may be the language version acquired according to the default language set at the local development end, and it includes mapping relation data between index values and corpora expressed in the default language. For example, if the default language is set to Chinese at the local development end, a request is made to download the Chinese-version corpus from the server hosting the multi-language platform, and the acquired corpus is stored in a local storage device for later use.
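As an illustration of the mapping relation data described above, the following minimal sketch loads a corpus for one language into memory. The JSON layout, the sample index values and texts, and the function names `load_corpus` and `build_reverse_index` are assumptions made for this example; the source does not specify a storage format.

```python
import json
from typing import Dict

# Hypothetical serialized corpus for one language:
# a JSON object mapping index value -> corpus text.
SAMPLE_EN = '{"1001": "Confirm", "1002": "Cancel"}'

def load_corpus(raw_json: str) -> Dict[str, str]:
    """Parse the mapping relation data (index value -> corpus)."""
    mapping = json.loads(raw_json)
    # JSON object keys are unique strings, so each index value
    # identifies exactly one corpus; only the values need checking.
    if not all(isinstance(text, str) for text in mapping.values()):
        raise ValueError("every index value must map to a text corpus")
    return mapping

def build_reverse_index(corpus: Dict[str, str]) -> Dict[str, str]:
    """Corpus text -> index value, for replacing text with index values."""
    reverse: Dict[str, str] = {}
    for index_value, text in corpus.items():
        # If two entries share the same text, keep the first index value.
        reverse.setdefault(text, index_value)
    return reverse

corpus = load_corpus(SAMPLE_EN)
reverse = build_reverse_index(corpus)
```

Keeping both directions in memory allows an index value to be resolved to its corpus for display, and a corpus to be resolved to its index value for replacement.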

Step S1200, in response to a statement selection instruction, acquiring the corpora in each statement selected by a user from a script file to form a corpus set;

In an editor used for coding script files, for example the VSCode editor, an extension plug-in may provide a switching control for entering a statement selection mode. When a user opens a script file and operates the switching control on that file, the statement selection mode is entered.

Each statement line in the script file may contain corpora expressed in a default language, which is set according to the actual situation of the developer writing the code. In one embodiment, after the statement selection mode is entered, a check box may be provided on the left or right side of each statement for selecting that statement.

A confirmation control is additionally provided in the development interface. After the development user selects one or more statements in the script file and operates the confirmation control, a confirmation instruction is triggered and the selected statements are assembled into a corpus set. It is understood that a statement in the corpus set may or may not contain a corpus expressed in the default language; statements without a corpus can be identified subsequently.

In one embodiment, when the check boxes are displayed, each statement in the script file may be checked for whether it contains a corpus, for example by checking whether a string delimiter (e.g., a double quotation mark) exists in the statement, and taking the character string enclosed by the delimiters as the corpus. For a statement that contains a corpus, the check box on its line is set to a selectable, activated state; for a statement that contains no corpus, the check box is set to a non-selectable, deactivated state.
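The delimiter-based check described above can be sketched as follows. This is a simplified illustration that treats any non-empty text between double quotation marks as a corpus; a real editor plug-in would likely rely on the language's tokenizer instead. The names `corpora_in_statement` and `checkbox_states` and the sample script lines are hypothetical.

```python
import re
from typing import List

# Assumption: a corpus is any text enclosed in double quotation marks.
# Escaped characters inside the string (such as \") are tolerated.
QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')

def corpora_in_statement(statement: str) -> List[str]:
    """Return the non-empty quoted strings found in one statement."""
    return [text for text in QUOTED.findall(statement) if text]

def checkbox_states(lines: List[str]) -> List[bool]:
    """True = check box activated (the statement contains a corpus)."""
    return [bool(corpora_in_statement(line)) for line in lines]

script = [
    'const title = "Welcome back";',
    'let count = 0;',
    'alert("Save failed");',
]
states = checkbox_states(script)
```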

Step S1300, in response to a single corpus replacement instruction, determining the index value of the corpus in each statement of the corpus set, and replacing each corpus in the script file with the corresponding index value.

After the development user has determined the corpus set, a replacement control is further provided in the development interface. Operating the replacement control triggers a single corpus replacement instruction, which replaces all corpora in the corpus set at once, so that the corresponding corpora in the script file are replaced with their index values.

When the corpus replacement instruction is triggered, the data record corresponding to each corpus in the corpus set is retrieved from the corpus, one record per corpus. Each data record stores the mapping relation data between a corpus and its index value, so the mapping relation data of every corpus in the corpus set is obtained and can be used to determine the index value from the corpus.

For each selected statement in the script file, the corpora in the statement are resolved one by one to their corresponding index values, and the corpora in the script file are replaced with those index values on the basis of all corpora in the corpus set, so that the corpora in all selected statements are replaced.

During the replacement, the corpus set is used as an intermediary instead of querying the corpus directly for every statement, mainly because the corpus set removes duplicates. When several statements use the same corpus, the corpus set treats them as a single processing object, so that one data record is determined once and then serves the replacement of that corpus in all of those statements, which improves processing efficiency.

Thus, in response to a single corpus replacement instruction, the corpora in every selected statement of the script file are replaced with their corresponding index values in one operation. Developers no longer need to set index values one at a time, which greatly improves programming efficiency.
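The one-key replacement and the de-duplicating role of the corpus set can be sketched as follows. The sketch assumes that corpora are double-quoted strings and that the index value simply takes the place of the quoted text; the source does not state what the replaced code looks like, and the function names and sample data are hypothetical.

```python
import re
from typing import Dict, List, Set

QUOTED = re.compile(r'"([^"]*)"')

def collect_corpus_set(lines: List[str], selected: List[int]) -> Set[str]:
    """Gather the corpora of the selected statements; a set removes duplicates."""
    corpus_set: Set[str] = set()
    for line_no in selected:
        corpus_set.update(QUOTED.findall(lines[line_no]))
    return corpus_set

def one_key_replace(lines: List[str], selected: List[int],
                    corpus: Dict[str, str]) -> List[str]:
    """Replace every known corpus in the selected statements with its index value."""
    reverse = {text: index for index, text in corpus.items()}
    # One lookup per distinct corpus, however many statements use it.
    resolved = {text: reverse[text]
                for text in collect_corpus_set(lines, selected)
                if text in reverse}

    def swap(match: "re.Match") -> str:
        text = match.group(1)
        if text in resolved:
            return '"{}"'.format(resolved[text])
        return match.group(0)  # unknown corpus: leave the text untouched

    result = list(lines)
    for line_no in selected:
        result[line_no] = QUOTED.sub(swap, lines[line_no])
    return result

lines = ['a = "Confirm";', 'b = "Cancel";', 'c = "Confirm";', 'd = "Untranslated";']
corpus = {"1001": "Confirm", "1002": "Cancel"}
replaced = one_key_replace(lines, [0, 2, 3], corpus)
```

In this sample, "Confirm" appears in two selected statements but is resolved only once, and the unselected second line is left unchanged.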

From the above embodiments, it will be appreciated that the present application has a number of advantages, including but not limited to the following: after the corpus is loaded, statements can be selected from the script file to form a corpus set; a single corpus replacement instruction is then issued, the index value of the corpus in each statement of the corpus set is determined, and each corresponding corpus in the script file is replaced with its index value. One-key replacement of corpora is thereby achieved. The index values make it convenient to produce versions of the script file's corpora in different languages and to convert between language versions, which improves ease of operation and can speed up software development.

A developer can also obtain the index values and/or corpora needed in a script file by providing a target word as a query object; the retrieved index values and/or corpora can serve as informational prompts or as selectable options. To this end, referring to Fig. 2, on the basis of any embodiment of the present application, after the corpus is loaded the method includes:

Step S2100, obtaining a target word, wherein the target word is an index value or a corpus;

The target word may be an index value, in which case the corresponding corpus is retrieved from the corpus according to the index value and displayed on screen as a prompt. Alternatively, the target word may be a corpus, which is used to search for multiple items of mapping relation data similar to it, either so that the development user can select an index value or corpus from them or merely as a prompt.

The target word can be obtained by word extraction from the script file, or can be entered by the developer in a search box provided for that purpose.

In one embodiment, this step includes: in response to a user search instruction, acquiring the text string entered by the user in a search box as the target word. Specifically, the extension plug-in of the code editor provides a search box in the development interface, and the development user enters into it a text string for searching the corpus, which can be regarded as the corpus to be searched. The extension plug-in segments the text string entered in the search box into tokens; any single token or any combination of tokens can be treated as a target word, and the search of the corpus is executed using the tokens individually or in combination.

In another embodiment, this step includes: in response to a mouse hovering instruction, determining the corpus pointed to by the mouse as the target word according to the mouse position. Specifically, in one mode, mouse position information is obtained by monitoring the mouse hovering instruction, the statement at the mouse position is obtained from that information, and the corpus pointed to by the mouse is then determined within the statement. This corpus is usually the one enclosed by string delimiters, and using the whole corpus as the target word facilitates subsequent exact matching of its index value. In another mode, the token pointed to by the mouse is determined according to the mouse position and used as the target word; with a single token as the target word, multiple items of mapping relation data can more conveniently be matched from the corpus by fuzzy search.
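The two hover modes can be sketched as follows, assuming the editor reports the mouse position as a zero-based column within the statement and that corpora are double-quoted strings. `corpus_at` returns the whole corpus under the mouse and `token_at` returns only the single token under it; both names, and the use of word characters as token boundaries, are assumptions for illustration.

```python
import re
from typing import Optional

QUOTED = re.compile(r'"([^"]*)"')

def corpus_at(statement: str, column: int) -> Optional[str]:
    """Mode 1: the whole quoted corpus covering the mouse column, if any."""
    for match in QUOTED.finditer(statement):
        if match.start() <= column < match.end():
            return match.group(1)
    return None

def token_at(statement: str, column: int) -> Optional[str]:
    """Mode 2: the single token (run of word characters) under the mouse."""
    for match in QUOTED.finditer(statement):
        if match.start(1) <= column < match.end(1):
            for token in re.finditer(r"\w+", match.group(1)):
                start = match.start(1) + token.start()
                if start <= column < start + len(token.group()):
                    return token.group()
    return None

statement = 'alert("Save failed");'
whole = corpus_at(statement, 13)   # mouse over the "a" in "failed"
single = token_at(statement, 13)
```

The whole corpus suits exact matching against the corpus, while the single token is the natural input for a fuzzy search.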

As an auxiliary measure, in one embodiment, after the target word is determined in response to the mouse hovering instruction, the target word may be highlighted in the script file so that the development user can quickly see which target word has been selected.

In another embodiment, the highlighted target word is provided with an adjusting control for extending the selected character string, so that the development user can freely move the left and right edges of the adjusting control to change the text range it covers and thereby adjust the specific text that forms the target word as required.

Step S2200, retrieving, from the corpus, the corpora and/or index values corresponding to the target word, forming a result list, and displaying the result list on a graphical user interface;

The target word determined in the previous step is used to search the corpus in order to obtain a result list.

In one embodiment, the target word is an index value. Because an index value and a corpus usually correspond one to one, when the development user supplies an index value as the target word, through the search box or by hovering the mouse, this is understood as a request for the corpus corresponding to that index value. Accordingly, only exact matching is needed: the mapping relation data corresponding to the target word (the index value) is queried in the corpus, and the corpus in it is obtained to form the result list. The result list is in effect used only to display the exactly matched mapping relation data, that is, the data pair of the index value and its corpus, or it may include only the corpus corresponding to the index value.

In another embodiment, the target word is a corpus, which may be a single token or the complete character string of a whole corpus. When the user determines the target word by entering a text string in the search box, or by hovering the mouse over a corpus or a token within a corpus in the script file, this can be understood in two ways: the development user wishes to use the target word to find, by fuzzy matching, related corpora close in meaning to it and/or their index values; or the development user wishes to find the index value of the corpus corresponding to the target word. In either case the development user can perform the corresponding editing operation according to the returned result. For a target word that is a corpus, fuzzy matching can likewise be performed in the corpus to retrieve the mapping relation data whose corpus is the same as or similar to the target word, forming a result list. The result list may provide only the index values of the matched corpora, or the data pairs of matched corpora and their index values.
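A minimal sketch of such fuzzy matching is shown below. The source does not name a matching algorithm; this example swaps in `difflib.get_close_matches` from the Python standard library, which ranks candidates by a character-sequence similarity ratio. The function name `fuzzy_result_list`, the cutoff of 0.5 and the sample corpus are assumptions.

```python
import difflib
from typing import Dict, List, Tuple

def fuzzy_result_list(target: str, corpus: Dict[str, str],
                      limit: int = 3, cutoff: float = 0.5) -> List[Tuple[str, str]]:
    """Return (index value, corpus) pairs whose corpus resembles the target."""
    texts = list(corpus.values())
    # Best matches first; candidates scoring below `cutoff` are dropped.
    close = difflib.get_close_matches(target, texts, n=limit, cutoff=cutoff)
    return [(index, text)
            for match in close
            for index, text in corpus.items()
            if text == match]

corpus = {"1001": "Save failed", "1002": "Save succeeded", "1003": "Cancel"}
similar = fuzzy_result_list("Save fail", corpus)
exact = fuzzy_result_list("Cancel", corpus)
```

Returning pairs rather than bare texts lets the caller show either the index value alone or the data pair, as the paragraph above describes.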

However it is obtained, the result list can be formatted and displayed in the graphical user interface in which the script file is shown, so that the development user can select an item from the result list and insert it at, or use it to replace text at, a specified position in the script file, for example inserting it at the current cursor position or replacing the highlighted target word.

In one embodiment, when the target word is a corpus, semantic matching is performed in the corpus to retrieve the mapping relation data that semantically matches the target word, and a result list is then constructed. When the result list is formatted and displayed, the index value and the corresponding corpus in each item of mapping relation data are listed, and each index value and each corpus is configured as a selectable item, so that the development user can click any of them and the selected index value or corpus is inserted at, or replaces text at, the corresponding position in the script file.

Step S2300, in response to an operation event in which the user selects any corpus or index value in the result list, replacing the target word with the selected corpus and/or index value.

As described above, whether the result list contains a single index value or mapping relation data between index values and corpora, the index values and corpora can all be configured as selectable items. The user can select any corpus or index value, and the selected item is inserted at the cursor position in the script file or replaces the highlighted target word, so that corpora or index values from the search results for the target word are inserted.

From this embodiment it is easy to understand that the present application accepts user input in several ways, including search box input and mouse hover selection, and adapts to these different inputs to interpret the user's intention flexibly, retrieving results that match different search intentions. An index value can be used to retrieve its corpus, which is shown in the result list as a prompt; a corpus can be used to retrieve, by fuzzy or exact matching, its index value or the mapping relation data of other similar corpora and their index values, which are displayed in a result list for the development user to choose from. After the development user selects an item, it can be inserted into the script file. Corpora and index values in the script file can thus be edited quickly and conveniently, which greatly improves code writing efficiency.

On the basis of any embodiment of the present application, please refer to fig. 3, where the corpus is searched for the corpus and/or the index value corresponding to the target word, and a result list is formed and displayed on the graphical user interface, including:

Step S2210, judging whether the target word is an index value or a corpus;

Index values and corpora have different characteristics: for example, an index value is usually numerical, whereas a corpus is usually text. The target word can therefore be identified as an index value or a corpus by checking its data type, and the corresponding processing is then applied in each case.

Step S2220, when the target word is an index value, executing rule matching in the corpus, and retrieving the corpus in the mapping relation data with the same index value as the target word to form a first result list;

When the target word is determined to be an index value, this usually indicates that the user wants to view the corpus corresponding to that specific index value. Rule matching, specifically exact matching, can therefore be performed in the corpus: the mapping relation data whose index value equals the target word, usually a single data record, is retrieved, and the corpus in that data record is taken as the first result list. It will be appreciated that in practice the first result list may contain a single corpus.
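Steps S2210 and S2220 can be illustrated with a minimal sketch. It assumes that the corpus is held in memory as a dictionary from integer index values to corpus text and that an index value is written as a string of decimal digits. The names `CORPUS`, `is_index_value` and `first_result_list` are hypothetical and do not come from the application.

```python
from typing import Dict, List

# Hypothetical in-memory corpus: index value -> corpus text (assumed layout).
CORPUS: Dict[int, str] = {
    1001: "Confirm",
    1002: "Cancel",
    1003: "Log in",
}


def is_index_value(target_word: str) -> bool:
    """Treat a target word made only of ASCII digits as an index value (assumption)."""
    word = target_word.strip()
    return word.isascii() and word.isdigit()


def first_result_list(target_word: str, corpus: Dict[int, str]) -> List[str]:
    """Exact (rule) matching: return the corpus whose index value equals the target word."""
    text = corpus.get(int(target_word.strip()))
    return [] if text is None else [text]
```

A target word such as "1002" would be routed to `first_result_list`, while a word such as "Cancel" would fall through to the semantic matching of step S2230.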

Step S2230, when the target word is a corpus, performing semantic matching in the corpus, and retrieving mapping relation data of the corpus and the target word which form semantic matching to form a second result list;

When the target word is determined to be a corpus, it carries semantics and may have been entered through a search box or obtained by mouse word-fetching. This usually indicates that the user wishes to obtain corpora and/or index values similar to the target word, or to obtain the index value of the corpus corresponding to the target word.

In order to be compatible with multiple query intentions of the user, in this embodiment, semantic matching is performed on the target word and corpora in each data record in the corpus one by one, and mapping relationship data that achieves semantic matching with the target word are all screened out to form a second result list. It is understood that the index value and the corresponding corpus in the mapping relationship data may be provided in the second result list at the same time, so that the user may select not only a certain index value but also a certain corpus through the second result list.

Step S2240, displaying the result list through the floating layer, where the index value and/or corpus in each mapping relationship data is configured as a selectable item, so that it is suitable for being selected by the executed operation event.

Whether the obtained result list is the first result list or the second result list, it can be displayed in the graphical user interface where the script file is located through a unified formatting and display operation. Specifically, in this embodiment, a floating layer is created, a list control is added to the floating layer, and the result list is then displayed in the list control.

To make it easy for the development user to select index values or corpora in the result list, each index value and each corpus in the list control is configured as a selectable item when the result list is formatted and displayed. A selectable item can then be selected in response to an operation event triggered by the user's touch, and is either inserted at the cursor position in the script text or used to replace the highlighted target word.
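The insertion or replacement that follows a selection can be sketched as plain string editing. This is an illustration only: it assumes the script text is a single string, the cursor is a character offset, and the highlighted target word is given as a half-open `(start, end)` range. The function name `apply_selection` is hypothetical.

```python
from typing import Optional, Tuple


def apply_selection(script: str, chosen: str, cursor: int,
                    highlighted: Optional[Tuple[int, int]] = None) -> str:
    """Insert `chosen` at the cursor, or replace the highlighted [start, end) span."""
    if highlighted is not None:
        start, end = highlighted
        return script[:start] + chosen + script[end:]
    return script[:cursor] + chosen + script[cursor:]
```

A real editor would perform the same edit through its own text buffer interface rather than by rebuilding the string.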

From this embodiment it is easy to understand that the development user only needs to provide a target word and does not need to state an operation intention explicitly. The method accommodates the user's various intentions either by determining the corpus corresponding to the target word, which serves as a prompt, or by determining the corpora and index values corresponding to the target word and offering them for the user to call, or at least to serve as a prompt. The present application therefore makes developing software code more convenient: the development user reaches the required result directly, without multi-level operations, can quickly edit corpora and their index values in code files when working on a multi-language software project, and can swap one for the other at any time to overcome the obstacles posed by multiple language versions, which improves the efficiency of software engineering.

On the basis of any embodiment of the present application, referring to fig. 4, when a target word is a corpus, performing semantic matching in the corpus, and retrieving mapping relationship data that the corpus and the target word form semantic matching to form a second result list, including:

step S2231, when the target word is a corpus, extracting the text vector of the target word by adopting a feature extractor pre-trained to a convergence state;

when the target word is a corpus, the semantic matching operation based on the target word can be realized by means of a feature extractor pre-trained to a convergence state.

The feature extractor may be a text feature extractor, including but not limited to a deep learning model such as an LSTM, which is a recurrent neural network, or BERT, which is a Transformer-based model. The text feature extractor must first be connected to a classifier and trained to a convergence state on a sufficient number of training samples, so that it is put into use only after it has learned to produce feature representations of text information.

For the target word, word embedding is first applied to obtain the corresponding word vector, which is then input into the text feature extractor to extract the deep semantic information of the target word and obtain the corresponding text vector.

Step S2232, calculating text similarity between the text vector of the target word and the text vector of the corpus in each mapping relation data in the corpus, wherein the text vector of each corpus in the corpus is extracted and obtained by the feature extractor in advance;

Similarly, for the corpora in each mapping relation data in the corpus, the feature extractor described above can extract deep semantic information one by one to obtain the corresponding text vectors, which are stored in the corpus for later use.

Accordingly, the data distance between the text vector of the target word and the text vector of the corpus in each mapping relation data in the corpus can be calculated and used as the text similarity. The algorithm used to calculate the data distance may be any one of a cosine similarity algorithm, a Euclidean distance algorithm, a Pearson correlation coefficient algorithm, a Jaccard coefficient algorithm, and the like. After the data distance is calculated, it is normalized to a specific numerical range, for example [0, 1], such that a higher value indicates that the corresponding corpus is more similar to the target word; the normalized value serves as the text similarity.
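One of the listed options, cosine similarity, can be sketched as follows. The text does not specify the normalization, so the linear mapping from [-1, 1] to [0, 1] used here is an assumption, as are the function names. The text vectors are taken to be plain sequences of floats already produced by the feature extractor.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def text_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Map cosine similarity from [-1, 1] to [0, 1]; higher means more similar (assumed scheme)."""
    return (cosine_similarity(a, b) + 1.0) / 2.0
```

With a distance measure such as Euclidean distance, where smaller means closer, the normalization would additionally have to invert the scale so that a higher value still indicates greater similarity.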

And step S2233, selecting the part of the mapping relation data with higher text similarity to form a second result list.

After the text similarity between each corpus in the corpus and the target word has been obtained, in one embodiment the data records in the corpus are sorted in descending order of text similarity, and a preset number of top-ranked data records are selected, so that the mapping relation data corresponding to those data records is constructed into the second result list.

In another embodiment, the text similarity of each corpus in the corpus may be screened according to a preset threshold, where the preset threshold may be an empirical threshold, and the mapping relationship data of each corpus with the text similarity higher than the preset threshold is screened, so as to construct the second result list.
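The two selection strategies for step S2233 can be sketched side by side. The sketch assumes each data record is an `(index value, corpus)` pair that has already been scored against the target word. The names `Record`, `top_k` and `above_threshold` are hypothetical.

```python
from typing import List, Tuple

Record = Tuple[int, str]  # (index value, corpus text); assumed record layout


def top_k(scored: List[Tuple[Record, float]], k: int) -> List[Record]:
    """Sort by text similarity in descending order and keep the first k records."""
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    return [record for record, _ in ranked[:k]]


def above_threshold(scored: List[Tuple[Record, float]], threshold: float) -> List[Record]:
    """Keep every record whose text similarity exceeds the preset threshold."""
    return [record for record, score in scored if score > threshold]
```

The top-k variant always returns a bounded list, whereas the threshold variant may return no records at all when nothing is similar enough; which behaviour is preferable depends on how the result list is meant to be used.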

From this embodiment it is easy to understand that the second result list is obtained by semantic matching based on the deep semantic information of the target word and of each corpus in the corpus. Because this deep semantic information is obtained by means of the feature extractor and the text similarity is calculated on top of it, the matching relationship between the target word and each corpus is represented more accurately. The result information in the second result list therefore better meets the needs of development users, provides them with effective index information, and lets them select index values or corpora from the second result list conveniently and quickly, which improves their coding efficiency.

Based on any embodiment of the present application, loading a corpus, please refer to fig. 5, including:

step S1110, responding to a corpus update notification from the server, and sending a corpus update request to the server, where the request includes language information corresponding to the corpus;

When the corpus in a server of the multi-language development platform is updated, an update notification can be generated and broadcast to the terminal device of each development user. After receiving the update notification, the current terminal device can send a corpus update request to the server. The update request carries the default language set on the local machine as the language information, so that the server can use it to provide the updated version of the corresponding corpus.

Specifically, the server responds to the update request, determines the latest version of the corpus of the corresponding language version according to the language information in the update request, and then pushes the latest version of the corpus to the current terminal device.

Step S1120, obtaining a corpus corresponding to the language information, pushed by the server in response to the corpus update request;

The current terminal device then downloads the latest version of the corpus pushed by the server to local storage. The download process can support resuming from a breakpoint.

Step S1130, the corpus replaces the historical corpus to complete loading of the corpus.

After the latest version of the corpus has been downloaded, it replaces the historical corpus and serves as the corpus corresponding to the default language, which completes the loading of the corpus required by the present application.

According to this embodiment, the present application ensures that the latest version of the corpus is always used to provide corpus-related query and index services while the script text is being written. This keeps collaborative development among different development users consistent and avoids incorrectly referenced index values or corpora caused by users working with different versions of the corpus.
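The client side of steps S1110 to S1130 can be sketched under several assumptions: the corpus is stored locally as a JSON file, the callable `fetch` stands in for the update request and the server's push, and the function name `update_corpus` is hypothetical. Breakpoint resuming is not modelled.

```python
import json
import os
import tempfile
from typing import Callable, Dict


def update_corpus(fetch: Callable[[str], Dict[str, str]], language: str,
                  corpus_path: str) -> Dict[str, str]:
    """Request the corpus for `language`, then swap it in for the historical corpus."""
    new_corpus = fetch(language)  # stands in for the update request and server push
    directory = os.path.dirname(os.path.abspath(corpus_path))
    # Write to a temporary file in the same directory first, so that a failed
    # download never leaves a half-written corpus in place of the old one.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(new_corpus, handle, ensure_ascii=False)
    os.replace(tmp_path, corpus_path)  # the historical corpus is replaced here
    return new_corpus
```

Writing to a temporary file and renaming it over the old one is a design choice of this sketch, not something the text prescribes; it keeps the historical corpus usable until the new one is completely written.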

Referring to fig. 6, a corpus index display apparatus adapted to one of the purposes of the present application is provided, which is a functional embodiment of the corpus index display method of the present application. The apparatus includes: a corpus loading module 1100, configured to load a corpus, where the corpus includes mapping relation data between index values and corpora; a statement selection module 1200, configured to respond to a statement selection instruction and obtain a corpus set formed by the corpora in each statement selected by the user from the script file; and a one-key execution module 1300, configured to determine, in response to a single corpus replacement instruction, the index value of each corpus in the corpus set, and replace each corpus in the script file with its corresponding index value.

On the basis of any embodiment of the present application, following the corpus loading module 1100, the apparatus includes: a word-fetching submodule, configured to obtain a target word, the target word being an index value or a corpus; a retrieval submodule, configured to retrieve the corpus and/or the index value corresponding to the target word from the corpus, form a result list, and display the result list on a graphical user interface; and a single-item replacement submodule, configured to respond to an operation event in which the user selects any one of the corpora or index values in the result list, and replace the target word with the selected corpus and/or index value.
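The cooperation of the statement selection module and the one-key execution module can be sketched as follows. The sketch assumes that corpora appear in the script file as double-quoted string literals without escapes, and that an index value is written into the script as a bare number. Neither detail is fixed by the text, and the function names are hypothetical.

```python
import re
from typing import Dict, List

# Assumed surface form of a corpus in the script: a double-quoted literal.
STRING_LITERAL = re.compile(r'"([^"\\]*)"')


def collect_corpora(statements: List[str]) -> List[str]:
    """Gather the corpora found in the selected statements into a corpus set."""
    found: List[str] = []
    for statement in statements:
        found.extend(STRING_LITERAL.findall(statement))
    return found


def replace_with_index_values(script: str, corpora: List[str],
                              corpus: Dict[int, str]) -> str:
    """Replace each collected corpus in the script with its index value in one pass."""
    index_of = {text: index for index, text in corpus.items()}
    wanted = set(corpora)

    def substitute(match: "re.Match[str]") -> str:
        text = match.group(1)
        if text in wanted and text in index_of:
            return str(index_of[text])
        return match.group(0)  # leave unknown or unselected literals untouched

    return STRING_LITERAL.sub(substitute, script)
```

String literals that have no entry in the corpus are left unchanged, so a single invocation never removes text it cannot map to an index value.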

On the basis of any embodiment of the present application, the search sub-module includes: the object judgment unit is used for judging the target word as an index value or a corpus; the first matching unit is used for executing rule matching in the corpus when the target word is an index value, and searching out the corpus in the mapping relation data with the index value being the same as that of the target word to form a first result list; the second matching unit is used for executing semantic matching in the corpus when the target word is a corpus, and retrieving mapping relation data which is formed by the corpus and the target word and is matched with the semantic to form a second result list; and the result display unit is used for displaying the result list through a suspension layer, and index values and/or corpora in each mapping relation data are configured to be selectable so as to be suitable for being selected by the executed operation event.

On the basis of any embodiment of the present application, the second matching unit includes: the vector generation subunit is used for extracting the text vector of the target word by adopting a feature extractor which is pre-trained to a convergence state when the target word is a corpus; the similarity calculation subunit is configured to calculate a text similarity between a text vector of a target word and a text vector of a corpus in each mapping relationship data in the corpus, where the text vector of each corpus in the corpus is obtained by pre-extraction by the feature extractor; and the structure construction subunit is used for selecting part of the mapping relation data with higher text similarity to form a second result list.

On the basis of any embodiment of the present application, the word fetching submodule includes: the search word-taking submodule is used for responding to a user search instruction and acquiring a text character string input by a user in a search box as a target word; or the hovering word-fetching submodule is used for responding to a mouse hovering instruction and determining the corpus pointed by the mouse as the target word according to the position of the mouse.

On the basis of any embodiment of the present application, the corpus loading module 1100 includes: a request updating submodule, configured to respond to a corpus update notification from the server and send a corpus update request to the server, where the corpus update request includes language information corresponding to the corpus; an update acquiring submodule, configured to obtain the corpus corresponding to the language information that the server pushes in response to the corpus update request; and a corpus upgrading submodule, configured to replace the historical corpus with the obtained corpus to complete loading of the corpus.

In order to solve the technical problem, an embodiment of the present application further provides a computer device. As shown in fig. 7, the computer device includes a processor, a computer-readable storage medium, a memory, and a network interface connected by a system bus. The computer-readable storage medium of the computer device stores an operating system, a database and computer-readable instructions; the database can store control information sequences, and the computer-readable instructions, when executed by the processor, can cause the processor to implement the corpus index display method. The processor of the computer device provides computing and control capability and supports the operation of the whole computer device. The memory of the computer device may store computer-readable instructions which, when executed by the processor, cause the processor to execute the corpus index display method of the present application. The network interface of the computer device is used for connecting to and communicating with a terminal. Those skilled in the art will appreciate that the architecture shown in fig. 7 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.

In this embodiment, the processor is configured to execute specific functions of each module and its sub-module in fig. 6, and the memory stores program codes and various data required for executing the modules or sub-modules. The network interface is used for data transmission to and from a user terminal or a server. The memory in this embodiment stores program codes and data required for executing all modules/sub-modules in the corpus index presentation device of the present application, and the server can call the program codes and data of the server to execute the functions of all sub-modules.

The present application further provides a storage medium storing computer-readable instructions, which when executed by one or more processors, cause the one or more processors to perform the steps of the corpus index presentation method according to any embodiment of the present application.

The present application also provides a computer program product comprising computer programs/instructions which, when executed by one or more processors, implement the steps of the method as described in any of the embodiments of the present application.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments of the present application can be implemented by a computer program, which can be stored in a computer-readable storage medium and which, when executed, can include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or another computer-readable storage medium.

In conclusion, the method and the device facilitate one-key replacement of the corpus in the script file into the corresponding index value, facilitate conversion of the corpus among different language versions, and can improve the implementation efficiency of software engineering.

Claims

1. A corpus index display method is characterized by comprising the following steps:

2. The corpus index presentation method according to claim 1, wherein after loading the corpus, the method comprises:

acquiring a target word, wherein the target word is an index value or a corpus;

3. The corpus index presentation method according to claim 2, wherein the corpus and/or the index value corresponding to the target word are retrieved from the corpus to form a result list to be displayed on a graphical user interface, and the method comprises:

judging whether the target word is an index value or a corpus;

when the target word is a corpus, performing semantic matching in the corpus, and retrieving mapping relation data of which the corpus and the target word form semantic matching to form a second result list;

4. The corpus index presentation method according to claim 3, wherein when the target word is a corpus, performing semantic matching in the corpus, retrieving mapping relationship data that semantically matches the corpus and the target word to form a second result list, comprising:

calculating the text similarity between the text vector of the target word and the text vectors of the corpora in each mapping relation data in the corpus, wherein the text vectors of each corpus in the corpus are extracted and obtained in advance by the feature extractor;

5. The corpus index presentation method of claim 2, wherein the obtaining of the target word comprises:

responding to a user search instruction, and acquiring a text character string input in a search box by a user as a target word;

or,

and responding to the mouse hovering instruction, and determining the corpus pointed by the mouse as the target word according to the mouse position.

6. The corpus index presentation method according to any one of claims 1 to 5, wherein loading the corpus comprises:

responding to a corpus updating notice of a server, and sending a corpus updating request to the server, wherein the corpus updating request comprises language information corresponding to the corpus;

7. A corpus index display device, comprising:

the corpus loading module is used for loading a corpus, and the corpus comprises mapping relation data between index values and corpora;

the statement selection module is used for responding to statement selection instructions and acquiring corpora in each statement selected by a user from the script file to form a corpus set;

and the one-key execution module is used for responding to a single corpus replacement instruction, determining an index value of each statement in the corpus, and replacing each corpus in the script file with the corresponding index value.

8. A computer device comprising a central processing unit and a memory, characterized in that the central processing unit is adapted to invoke the execution of a computer program stored in the memory to perform the steps of the method according to any one of claims 1 to 6.

9. A computer-readable storage medium, characterized in that it stores a computer program implemented according to the method of any one of claims 1 to 6 in the form of computer-readable instructions, which, when invoked by a computer, performs the steps comprised by the corresponding method.

10. A computer program product comprising computer programs/instructions which, when executed by a processor, carry out the steps of the method of any one of claims 1 to 6.