CN107862028B - Method for establishing standard academic model, server and storage medium

Info

Publication number: CN107862028B
Application number: CN201711055085.3A
Authority: CN
Inventors: 刘娜
Original assignee: Hubei Sanxin Cultural Media Co ltd
Current assignee: Hubei Sanxin Cultural Media Co ltd
Priority date: 2017-10-27
Filing date: 2017-10-27
Publication date: 2021-01-05
Anticipated expiration: 2037-10-27
Also published as: CN107862028A

Abstract

The invention discloses a method for establishing a standard academic model, a server, and a storage medium. In the method, a server obtains object data in a document to be analyzed, converts the object data into a plurality of metadata in a standard format according to a preset conversion rule, establishes indexes corresponding to the metadata, obtains the association relation among the indexes, and establishes the standard academic model according to the metadata, the indexes, and the association relation. Unifying the formats of the various object data into the standard format reduces the load on the processor when searching for corresponding data, accelerates the retrieval process, and improves retrieval speed and efficiency. Establishing the association relation among the indexes makes the retrieval results more comprehensive and complete and improves the user experience.

Description

Method for establishing standard academic model, server and storage medium

Technical Field

The invention relates to the field of computer information search, in particular to a method for establishing a standard academic model, a server and a storage medium.

Background

With the rapid growth in the amount of information, many users are at a loss when faced with countless information systems, numerous heterogeneous interfaces, and institutional databases that vary in content. To address this problem, Google was launched in 1998 and attracted countless users with a single information portal. Meanwhile, other information discovery and delivery systems began to appear in academic institutions and academic circles, including institutional repositories, course management systems, electronic storage, and digital collection management systems, which gave academic institutions additional channels for storing, discovering, and delivering information. Subsequently, academic institution systems imitated the Google search engine and developed and released federated search schemes. Federated search can simultaneously search, retrieve, and display information content from different remote institutions. However, a federated search system is a divide-and-conquer heterogeneous retrieval: its retrieval effect depends on the built-in functions of each database system, and its shortcomings are difficult to overcome, including slow retrieval, poor deduplication and ranking of retrieval results, high cost, complicated usage, and the inability of the technology to fully integrate with personally subscribed data. In the prior art, academic institution systems have begun to evolve toward a "next-generation catalog", which lets an end user tag items, create lists, add book reviews, follow website links, and so on, with Web 2.0 interaction features, on the retrieval interface provided by the academic institution.

Although this interface has many breakthroughs and innovations, it is still limited to the traditional book and periodical resources of academic institutions and locally built digital resources. Meanwhile, the Online Public Access Catalog (OPAC) system, resource navigation system, link server, cross-database retrieval system, and the like of an academic institution also bring a series of inconveniences to the organization and acquisition of resources. Because electronic resources and physical resources cannot be disclosed in a unified manner, the retrieval function is deficient and retrieval results are inadequate in deduplication, ranking, and the like, which can make retrieval too slow.

Disclosure of Invention

The main object of the invention is to provide a method for establishing a standard academic model, a server, and a storage medium, so as to solve the technical problems in the prior art that, because electronic resources and physical resources cannot be disclosed in a unified manner, the retrieval function is deficient, retrieval results are inadequate in deduplication, ranking, and the like, and retrieval is not fast enough.

In order to achieve the above object, the present invention provides a method for establishing a standard academic model, which comprises the following steps:

the method comprises the steps that a server obtains object data in a document to be analyzed, and the object data are converted into a plurality of metadata in a standard format according to a preset conversion rule;

establishing indexes corresponding to the metadata, and acquiring association relations among the indexes;

and establishing a standard academic model according to the metadata, the indexes and the association relation.

Preferably, the server acquires object data in a document to be analyzed, and converts the object data into a plurality of metadata in a standard format according to a preset conversion rule, and specifically includes:

analyzing the document to be analyzed and acquiring an analysis result, and extracting target data which accords with a preset extraction rule from the analysis result;

integrating the target data and the document information corresponding to the target data to generate an integration result, and taking the integration result as the object data;

and converting the object data into a plurality of metadata in a standard format according to the preset conversion rule.

Preferably, before the converting the object data into the metadata in a standard format according to the preset conversion rule, the method for establishing a standard academic model further includes:

and acquiring the data attribute of the object data, and inquiring the preset conversion rule which accords with the data attribute according to a preset mapping relation.

Preferably, the establishing of the indexes corresponding to the metadata and the obtaining of the association relationship between the indexes specifically include:

searching original document information corresponding to the metadata according to the metadata;

determining the index according to the original literature information;

and comparing the indexes to generate a comparison result, and determining the association relation among the indexes according to the comparison result.

Preferably, the determining the index according to the original literature information specifically includes:

extracting document contents and machine-readable catalogues corresponding to the metadata from the original document information;

the index is determined from the document content and the machine-readable catalog.

Preferably, the establishing a standard academic model according to the metadata, the index and the association relation specifically includes:

arranging the metadata, the indexes and the association relation according to a preset relevance arrangement rule, associating the metadata, the indexes and the association relation with one another, and generating an association result;

and establishing the standard academic model according to the correlation result.

Preferably, after the standard academic model is built according to the metadata, the index and the association relationship, the method for building the standard academic model comprises the following steps:

and displaying the retrieval result obtained through the standard academic model according to a preset display rule.

Preferably, before the server obtains the object data in the document to be analyzed and converts the object data into a plurality of metadata in a standard format according to a preset conversion rule, the method for establishing a standard academic model further includes:

screening documents in a local database and documents in a remote database according to a preset document standard;

and taking the document meeting the preset document standard as the document to be analyzed.

In addition, to achieve the above object, the present invention further provides a server, including: a memory, a processor, and a standard academic model building program stored on the memory and executable on the processor, the standard academic model building program configured to implement the steps of the standard academic model building method as described above.

In addition, to achieve the above object, the present invention further provides a storage medium having a standard academic model building program stored thereon, wherein the standard academic model building program, when executed by a processor, implements the steps of the standard academic model building method as described above.

The invention provides a method for establishing a standard academic model, which comprises the steps of obtaining object data in documents to be analyzed through a server, converting the object data into a plurality of metadata in a standard format according to a preset conversion rule, establishing indexes corresponding to the metadata, obtaining association relations among the indexes, establishing the standard academic model according to the metadata, the indexes and the association relations, unifying formats of various object data into the standard format, reducing pressure of a processor when searching corresponding data, accelerating a retrieval process, improving retrieval speed and efficiency, and establishing the association relations among the indexes, so that retrieval results are more comprehensive and complete, and user experience is improved.

Drawings

FIG. 1 is a schematic diagram of a server architecture of a hardware operating environment according to an embodiment of the present invention;

FIG. 2 is a schematic flow chart of a first embodiment of a method for building a standard academic model according to the present invention;

FIG. 3 is a flowchart illustrating a second embodiment of a method for building a standard academic model according to the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

The solution of the embodiment of the invention is mainly as follows: the server obtains the object data in the document to be analyzed, converts the object data into a plurality of metadata in a standard format according to the preset conversion rule, establishes indexes corresponding to the metadata, acquires the association relation among the indexes, and establishes a standard academic model according to the metadata, the indexes, and the association relation. By unifying the formats of the various object data into the standard format, the invention reduces the load on the processor when searching for corresponding data, accelerates the retrieval process, and improves retrieval speed and efficiency. By establishing the association relation among the indexes, it makes the retrieval results more comprehensive and complete and improves the user experience. This solves the technical problems in the prior art that, because electronic resources and physical resources cannot be disclosed in a unified manner, the retrieval function is deficient, retrieval results are inadequate in deduplication, ranking, and the like, and retrieval is not fast enough.

Referring to fig. 1, fig. 1 is a schematic diagram of a server structure of a hardware operating environment according to an embodiment of the present invention.

As shown in fig. 1, the server may include: a processor 1001, such as a CPU, a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.

Those skilled in the art will appreciate that the server architecture shown in FIG. 1 is not intended to be limiting of the server, and may include more or fewer components than those shown, or some components in combination, or a different arrangement of components.

As shown in fig. 1, the memory 1005, which is a kind of computer storage medium, may include an operating system, a network communication module, a client interface module, and a program for establishing a standard academic model.

The server of the present invention calls the program for establishing a standard academic model stored in the memory 1005 through the processor 1001, and executes the following operations:

Further, the processor 1001 may call a program for building a standard academic model stored in the memory 1005, and further perform the following operations:

determining the index according to the original literature information;

According to the technical scheme, the object data in the document to be analyzed is obtained through the server, the object data are converted into the metadata in the standard format according to the preset conversion rule, the indexes corresponding to the metadata are established, the association relation among the indexes is obtained, the standard academic model is established according to the metadata, the indexes and the association relation, the formats of various object data are unified into the standard format, the pressure of a processor in searching corresponding data can be relieved, the searching process is accelerated, the searching speed and efficiency are improved, the retrieval result can be more comprehensive and complete through establishing the association relation among the indexes, and the user experience is improved.

Based on the hardware structure, the embodiment of the method for establishing the standard academic model is provided.

Referring to fig. 2, fig. 2 is a schematic flow chart of a first embodiment of the method for establishing a standard academic model according to the present invention.

In a first embodiment, the method of building a standard academic model includes the steps of:

step S10, the server acquires object data in the document to be analyzed, and converts the object data into a plurality of metadata in a standard format according to a preset conversion rule;

It should be noted that the document to be analyzed is a base document prepared in advance for establishing the standard academic model. It may be a document obtained through cooperation with a content provider such as a publishing company, a document collected from the internet, a document obtained by searching the data of major database platforms or by establishing document sharing with those platforms, or a document obtained by other methods, which is not limited in this embodiment.

It can be understood that the object data is data that is obtained from the document to be analyzed and meets corresponding requirements. The object data may be data stored in a document of a local database, data stored in a document of a remote database, or metadata and composite data in a document from heterogeneous resources, and may of course also be data in other documents to be analyzed, which is not limited in this embodiment.

It should be understood that the preset conversion rule is a rule for converting various types of data into metadata in a standard format, and the preset conversion rule may be a rule determined by a technician according to daily operation experience, or a more suitable rule obtained after learning and training a large amount of data, or a conversion rule set by the technician and flexibly changeable according to the current data type, or a conversion rule determined in other manners, which is not limited in this embodiment.

It can be understood that, after the server obtains the object data in the document to be analyzed, the object data is converted into a plurality of metadata in a standard format through a preset conversion rule, and the formats of various types of object data are unified into the standard format, so that the pressure of a processor in searching corresponding data can be reduced, the retrieval process is accelerated, and the retrieval speed and efficiency are improved.
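As an illustrative sketch only, the conversion in step S10 can be pictured as follows in Python. The standard field names, the rule `JOURNAL_RULE`, and the function `to_standard_metadata` are assumptions introduced for illustration; this embodiment does not specify the form of the standard format or of the preset conversion rule.

```python
# Hypothetical standard format: every metadata record carries the same fields.
STANDARD_FIELDS = ("title", "creator", "date")

# A preset conversion rule modeled as a map from source field names to
# standard field names (an assumption; the embodiment does not fix its form).
JOURNAL_RULE = {"article_title": "title", "authors": "creator", "pub_year": "date"}


def to_standard_metadata(object_data: dict, rule: dict) -> dict:
    """Convert one piece of object data into a standard-format record."""
    # Start from an empty record so every output has the same shape.
    record = {name: None for name in STANDARD_FIELDS}
    for source_field, standard_field in rule.items():
        if source_field in object_data:
            record[standard_field] = object_data[source_field]
    return record
```

Because every record produced this way has the same fields, later lookups do not need to branch on the source format, which is the benefit the paragraph above attributes to the standard format.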

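Accordingly, before the step S10, the method for establishing a standard academic model further includes the following steps:

screening documents in a local database and documents in a remote database according to a preset document standard;

and taking the documents meeting the preset document standard as the documents to be analyzed.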

It should be noted that the preset literature standard is a standard formulated by screening the literature in the local database and the literature in the remote database, the literature in the local database and the literature in the remote database can be analyzed and extracted through the preset literature standard, and the literature meeting the standard is taken as the literature to be analyzed.
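The screening against the preset document standard might be sketched as below. This is a minimal Python illustration in which the standard is modeled as a predicate; the function names and the example standard `has_title_and_abstract` are hypothetical and not taken from this embodiment.

```python
from typing import Callable, Iterable, List


def screen_documents(
    local_docs: Iterable[dict],
    remote_docs: Iterable[dict],
    meets_standard: Callable[[dict], bool],
) -> List[dict]:
    """Return the documents to be analyzed, local documents first."""
    candidates = list(local_docs) + list(remote_docs)
    return [doc for doc in candidates if meets_standard(doc)]


def has_title_and_abstract(doc: dict) -> bool:
    """Hypothetical preset document standard: title and abstract are present."""
    return bool(doc.get("title")) and bool(doc.get("abstract"))
```

Passing the standard in as a function keeps the screening step independent of whichever standard is actually agreed on.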

It is to be understood that the local database is a database stored in a storage area of the local server and holds stored documents, while the remote database holds documents stored on a remote server or a cloud server. Of course, the remote database may also hold documents acquired over the internet, but acquiring documents over the internet has a precondition: the resource owner must be willing to act as a data provider and expose an interface for open materials. The corresponding document data can then be collected through the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). For a database that does not support a usable version of the OAI protocol, a web crawler (spider) tool based on web page analysis can be used to capture the metadata into the remote database.
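Assuming the protocol referred to above is OAI-PMH, a harvesting request could be assembled as in the following Python sketch. The verb `ListRecords` and the arguments `metadataPrefix` and `resumptionToken` come from the OAI-PMH specification; the repository address and the helper name `list_records_url` are hypothetical, and no network request is made here.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    resumption_token: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL for a data provider."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument: it is sent
        # with the verb only, to fetch the next part of an incomplete list.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"
```

A harvester would typically request the first URL, store the returned records, and repeat with each resumption token it receives until none is returned.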

It should be understood that the local database differs from the remote database in that it can provide the corresponding documents as documents to be analyzed more quickly, which indirectly increases the speed and efficiency of user retrieval. The local database and the remote database are updated periodically so that document data accumulates and the data stays comprehensive and complete. The update may be triggered by a preset document quantity threshold, in which case the database is updated automatically when the threshold is reached, or by a preset time threshold, in which case the database is updated automatically when the corresponding time or time period is reached. Of course, the local database and the remote database may be updated in other ways, which is not limited in this embodiment.
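The two update triggers described above can be expressed, purely as a sketch, with a small Python class. The class name `UpdatePolicy`, its fields, and the choice to combine both triggers with a logical OR are assumptions; the embodiment presents the two thresholds as alternatives without prescribing how they are combined.

```python
from dataclasses import dataclass


@dataclass
class UpdatePolicy:
    doc_threshold: int        # hypothetical preset document quantity threshold
    interval_seconds: float   # hypothetical preset time threshold

    def should_update(self, pending_docs: int, seconds_since_last: float) -> bool:
        """Return True when either preset threshold has been reached."""
        return (
            pending_docs >= self.doc_threshold
            or seconds_since_last >= self.interval_seconds
        )
```

For example, `UpdatePolicy(1000, 86400.0)` would trigger an update once 1000 documents are pending or one day has passed, whichever comes first.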

In a specific implementation, the preset document standard may be specified jointly by an academic institution and a Web-scale Discovery Service (WDS) publisher, so that users can retrieve and browse results. The users are authorized users with different permission levels, and each user can retrieve, browse, download, edit, and otherwise operate on the corresponding documents within the range allowed by that user's permission level.

Step S20, establishing indexes corresponding to the metadata, and obtaining the association relation among the indexes;

It should be noted that establishing the indexes corresponding to the metadata means establishing indexes that have a correspondence with the metadata, so that the corresponding metadata can be found quickly from an index; the indexes corresponding to the metadata share a unified form. The association relation among the indexes may be obtained by associating the indexes through the keywords they contain, by checking whether the similarity of the data content of the metadata corresponding to the indexes meets a preset similarity threshold, or by checking whether the domain relevance of the indexes meets a corresponding requirement. Of course, the association relation among the indexes may also be obtained in other ways, which is not limited in this embodiment.
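One possible way to compare indexes against a preset similarity threshold is sketched below in Python. Jaccard similarity over keyword sets is a technique chosen here for illustration only; the embodiment does not name a similarity measure, and the function names and the default threshold of 0.5 are assumptions.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of keywords common to both sets, between 0.0 and 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def associate(indexes: Dict[str, Set[str]], threshold: float = 0.5) -> List[Tuple[str, str]]:
    """Pair up indexes whose keyword similarity meets the preset threshold."""
    pairs = []
    # Sort by index id so the output order is deterministic.
    for (id_a, kw_a), (id_b, kw_b) in combinations(sorted(indexes.items()), 2):
        if jaccard(kw_a, kw_b) >= threshold:
            pairs.append((id_a, id_b))
    return pairs
```

Comparing every pair is quadratic in the number of indexes, so a real system would likely need a cheaper candidate-selection step first; that is outside the scope of this sketch.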

It can be understood that by establishing indexes corresponding to the metadata and obtaining the association relationship between the indexes, the retrieval result can be more comprehensive and complete, so that the user can retrieve the more comprehensive and complete retrieval result, the user experience is improved, the retrieval process is accelerated, and the retrieval speed and efficiency are improved.

And step S30, establishing a standard academic model according to the metadata, the indexes and the association relation.

It should be noted that the association between each metadata and each index can be established according to the metadata, each index and the association relationship, and the standard academic model can be established according to the associated metadata and each index.
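To make the roles of the three inputs concrete, the following Python sketch stores the metadata, the indexes, and the association relation together and uses the association relation to broaden a lookup. The class `StandardAcademicModel` and its `search` method are hypothetical; the embodiment does not define a data structure for the model.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class StandardAcademicModel:
    metadata: Dict[str, dict]           # record id -> standard-format metadata
    index: Dict[str, Set[str]]          # index term -> ids of matching records
    associations: Dict[str, Set[str]]   # index term -> associated index terms

    def search(self, term: str) -> List[str]:
        """Return ids matching the term or any index term associated with it."""
        terms = {term} | self.associations.get(term, set())
        ids: Set[str] = set()
        for t in terms:
            ids |= self.index.get(t, set())
        return sorted(ids)
```

In this sketch a query for one term also returns records filed under associated terms, which is one reading of how the association relation makes results more comprehensive.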

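Further, the step S30 specifically includes the following steps:

arranging the metadata, the indexes and the association relation according to a preset relevance arrangement rule, associating the metadata, the indexes and the association relation with one another, and generating an association result;

and establishing the standard academic model according to the association result.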

It can be understood that the preset relevance arrangement rule is a rule set in advance for arranging the metadata and the indexes by relevance, that is, according to the degree of correlation between the metadata and the indexes. The preset relevance arrangement rule may be a rule determined by a technician according to daily operation experience, a more appropriate rule obtained from a large amount of experimental data or through learning and training, a user-defined rule that can change flexibly according to the current data type, or a rule determined in other manners, which is not limited in this embodiment.
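A very simple arrangement rule of this kind is sketched below in Python. Scoring a record by the number of query terms it shares with its index terms is an assumption made for illustration; the embodiment leaves the actual rule open, and the function name `arrange_by_relevance` is hypothetical.

```python
from typing import Dict, List, Set


def arrange_by_relevance(records: Dict[str, Set[str]], query_terms: Set[str]) -> List[str]:
    """Order record ids by how many query terms their index terms share."""
    scored = [(len(terms & query_terms), rec_id) for rec_id, terms in records.items()]
    # Highest score first; ties are broken by record id for a stable order.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [rec_id for score, rec_id in ranked if score > 0]
```

Records that share no term with the query are dropped rather than listed last, which is a design choice of this sketch.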

It should be understood that the preset relevance arrangement rule is used to arrange metadata of different types, together with the complex association relations between the document entity objects and the indexes corresponding to the metadata, so that results are presented to the user in a comfortable and intuitive manner after a search. The user can obtain deeper retrieval results by clicking the indexes arranged according to the preset relevance arrangement rule, by faceted navigation, by triggering corresponding instructions through voice control or touch, or in other ways, which is not limited in this embodiment.

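Accordingly, after the step S30, the method for establishing a standard academic model further includes the following step:

displaying the retrieval result obtained through the standard academic model according to a preset display rule.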

It should be noted that the preset display rule is a rule set in advance for displaying the retrieval result obtained through the standard academic model. The preset display rule may be a rule determined by a technician according to daily operation experience, a more appropriate rule obtained from a large amount of experimental data or through learning and training, a user-defined rule that can change flexibly according to the current data type, or a rule determined in other manners, which is not limited in this embodiment.

It can be understood that the preset presentation rule may be presented by presenting the search result in a list progressive manner, may also be presented by a manner similar to a knowledge map navigation manner, and may also be presented by other manners, which is not limited in this embodiment.

In a specific implementation, the standard academic model can be configured in a local system of an academic institution or can be placed in a remote system provider, and compared with the traditional academic institution service, the standard academic model is more flexible and can provide more degrees of freedom for the self-customized service of the academic institution.

Further, fig. 3 is a schematic flowchart of a second embodiment of the method for building a standard academic model according to the present invention, and as shown in fig. 3, the second embodiment of the method for building a standard academic model according to the present invention is proposed based on the first embodiment, in this embodiment, the step S10 specifically includes the steps of:

step S11, analyzing the document to be analyzed, acquiring an analysis result, and extracting target data which accord with a preset extraction rule from the analysis result;

It should be noted that analyzing the document to be analyzed and acquiring an analysis result means that, after the documents to be analyzed are analyzed, the documents that meet a corresponding document requirement are taken as the analysis result. The corresponding document requirement may be a preset document format requirement, a preset document content requirement, a preset document field or region requirement, or another type of requirement, which is not limited in this embodiment.

It can be understood that, by using the preset extraction rule, the target data meeting the preset extraction rule may be extracted from the analysis result, where the preset extraction rule may be a rule determined by a technician according to daily operation experience, may also be a more suitable extraction rule obtained through a large amount of experimental data or after learning training, may also be an extraction rule set by the self and flexibly changeable according to the current data type, and certainly, may also be an extraction rule determined by other manners, which is not limited in this embodiment.

Step S12, integrating the target data with the document information corresponding to the target data to generate an integration result, and taking the integration result as the object data;

It should be noted that the document information is the information about the document to which the target data corresponds. It may include the data type, time, size, field, content summary, and the like of the document, and may also include other types of information, which is not limited in this embodiment.

It can be understood that a relatively complete integration result can be obtained by integrating the target data with the document information corresponding to the target data. The integration result is taken as the object data, so the object data includes both the data itself and the document information corresponding to that data.
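Steps S11 and S12 might look like the following in Python, as a minimal sketch. The preset extraction rule is modeled as a set of wanted field names and the document information as a plain dictionary; both representations, and the function names, are assumptions not stated in this embodiment.

```python
from typing import Set


def extract_target_data(analysis_result: dict, wanted_fields: Set[str]) -> dict:
    """Keep only the fields allowed by the (hypothetical) extraction rule."""
    return {k: v for k, v in analysis_result.items() if k in wanted_fields}


def integrate(target_data: dict, document_info: dict) -> dict:
    """Combine target data with its document information into object data."""
    # Copies are taken so that later edits do not alter the inputs.
    return {**target_data, "document_info": dict(document_info)}
```

For instance, extracting `{"title", "authors"}` from a parsed record and integrating it with `{"type": "journal", "size": 1024}` yields one dictionary that carries both the data and its provenance.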

And step S13, converting the object data into a plurality of metadata in a standard format according to the preset conversion rule.

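Accordingly, before the step S13, the method for establishing a standard academic model further includes:

acquiring the data attribute of the object data, and querying, according to a preset mapping relation, the preset conversion rule that matches the data attribute.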

It should be noted that the preset mapping relationship includes a corresponding relationship between the data attribute of the object data and the preset conversion rule, and the preset conversion rule corresponding to the object data with different attributes can be easily found through the preset mapping relationship, so that the processing load of the processor can be reduced, the speed and efficiency of retrieval can be further improved, and the user experience can be improved.

It can be understood that the preset conversion rule according to the data attribute is queried according to a preset mapping relationship, where the preset mapping relationship may be a mapping relationship determined by a technician according to daily operation experience, may also be a more suitable mapping relationship obtained through a large amount of experimental data or after learning training, may also be a mapping relationship that is set by the technician and is flexible and changeable according to the current data type, and certainly may also be a mapping relationship determined in other manners, which is not limited in this embodiment.
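The lookup of a conversion rule through the preset mapping relation can be sketched as a dictionary dispatch in Python. The attribute names, the rule contents, and the function `query_conversion_rule` are hypothetical and serve only to illustrate the lookup.

```python
# Hypothetical preset mapping relation: data attribute -> conversion rule,
# where each rule maps source field names to standard field names.
PRESET_MAPPING = {
    "journal_article": {"article_title": "title", "authors": "creator"},
    "book": {"book_title": "title", "author": "creator"},
}


def query_conversion_rule(object_data: dict) -> dict:
    """Return the conversion rule that matches the object data's attribute."""
    attribute = object_data.get("attribute")
    try:
        return PRESET_MAPPING[attribute]
    except KeyError:
        raise LookupError(
            f"no preset conversion rule for attribute {attribute!r}"
        ) from None
```

A constant-time dictionary lookup of this kind is consistent with the stated aim of reducing the processing load when selecting a rule.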

Correspondingly, the step S20 specifically includes the following steps:

step S21, searching original document information corresponding to the metadata according to the metadata;

it should be noted that original document information corresponding to the metadata is included in the document to be analyzed, and the original document information corresponds to the metadata and is equivalent to a document entity of the metadata.

Step S22, determining the index according to the original document information;

It can be understood that the original document information establishes the correspondence between the metadata and the original document, and through the original document information the corresponding content can be obtained from the corresponding document and used as the index.

In a specific implementation, by collecting, through a regular system processing flow, the document data from the collection catalogs or manually entered catalogs in the local database and the document data retrieved from the remote database, a comprehensive large-scale central index can be established and relevance-ranked retrieval results can be returned quickly.

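Further, the step S22 specifically includes the following steps:

extracting, from the original document information, the document content and the machine-readable catalog corresponding to the metadata;

and determining the index from the document content and the machine-readable catalog.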

It should be understood that the original document information includes the document content and the machine-readable catalog. The document content may be the full text, abstract, and citations of the document, and may of course also be other types of document content, which is not limited in this embodiment. The machine-readable catalog may be the machine-readable catalog of a library collection, a manually entered machine-readable catalog, or another type of machine-readable catalog, which is not limited in this embodiment.
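As a sketch of how an index might be derived from both sources, the Python function below merges words from the document content with subject entries from a machine-readable catalog record. Representing the record as a dictionary keyed by MARC 21 field tags (650 being the topical subject field) is an assumption, as are the tokenization rule and the function name `determine_index`.

```python
import re
from typing import Dict, List, Set


def determine_index(content: str, catalog: Dict[str, List[str]]) -> Set[str]:
    """Collect index terms from document content and a catalog record."""
    # Lowercased alphabetic words from the content (full text or abstract).
    terms = set(re.findall(r"[a-z]+", content.lower()))
    # Topical subject headings, assumed to sit under MARC 21 tag 650.
    for subject in catalog.get("650", []):
        terms.add(subject.lower())
    return terms
```

Catalog headings are kept as whole phrases while content is split into words, so a heading such as "Information retrieval" stays searchable as a single term.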

In this embodiment, by the above scheme, the document to be analyzed is analyzed and an analysis result is obtained, target data meeting a preset extraction rule is extracted from the analysis result, the target data and the document information corresponding to the target data are integrated to generate an integration result, and the integration result is used as the object data. The object data is converted into a plurality of metadata in a standard format according to the preset conversion rule, the original document information corresponding to the metadata is searched according to the metadata, and the indexes are determined according to the original document information. The indexes are compared to generate a comparison result, and the association relation among the indexes is determined according to the comparison result. The relation between the document data corresponding to each object data and the indexes can thus be obtained quickly, which accelerates the retrieval process and improves retrieval speed and efficiency. By establishing the association relation among the indexes, the retrieval result can be more comprehensive and complete, and the user experience is improved.

In addition, an embodiment of the present invention further provides a storage medium, where a program for creating a standard academic model is stored in the storage medium, and when the program for creating a standard academic model is executed by a processor, the following operations are implemented:

Further, when executed by the processor, the standard academic model building program further implements the following operations:

determining the index according to the original literature information;

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

The above description is only a preferred embodiment of the present invention and is not intended to limit the scope of the present invention. All equivalent structures and equivalent process modifications made by using the contents of the present specification and the accompanying drawings, whether applied directly or indirectly in other related technical fields, are included in the scope of the present invention.

Claims

1. A method for establishing a standard academic model is characterized by comprising the following steps:

establishing a standard academic model according to the metadata, the indexes and the association relation;

wherein the step in which the server obtains object data in a document to be analyzed and converts the object data into a plurality of metadata in a standard format according to a preset conversion rule specifically comprises:

converting the object data into a plurality of metadata in a standard format according to the preset conversion rule;

before the step in which the server obtains object data in a document to be analyzed and converts the object data into a plurality of metadata in a standard format according to a preset conversion rule, the method further comprises:

synchronizing documents in a remote database to a local database when the number of documents in the remote database is greater than or equal to a preset document number threshold; or

synchronizing the documents in the remote database to the local database when the waiting time of the remote database is greater than or equal to a preset time threshold.

2. The method of creating a standard academic model according to claim 1, wherein before the converting the object data into the metadata of a standard format according to the preset conversion rule, the method of creating a standard academic model further comprises:

3. The method for establishing a standard academic model according to claim 2, wherein the establishing of the indexes corresponding to the metadata and the obtaining of the association relationship among the indexes specifically comprises:

determining the index according to the original literature information;

4. The method for establishing a standard academic model according to claim 3, wherein the determining the index according to the original literature information specifically comprises:

5. The method for establishing a standard academic model according to any one of claims 1 to 4, wherein the establishing of a standard academic model according to the metadata, the indexes and the association relation specifically comprises:

6. The method of building a standard academic model according to any one of claims 1 to 4, wherein after building a standard academic model from the metadata, the index and the association, the method of building a standard academic model comprises:

7. The method for creating a standard academic model according to any one of claims 1 to 4, wherein the server acquires object data in a document to be analyzed, and before converting the object data into a plurality of metadata in a standard format according to a preset conversion rule, the method for creating a standard academic model further comprises:

8. A server, characterized in that the server comprises: a memory, a processor and a standard academic model building program stored on the memory and executable on the processor, the standard academic model building program being configured to implement the steps of the standard academic model building method according to any one of claims 1 to 7.

9. A storage medium having stored thereon a standard academic model creation program which, when executed by a processor, implements the steps of the standard academic model creation method according to any one of claims 1 to 7.