CN116644228A - Multi-mode full text information retrieval method, system and storage medium - Google Patents

Multi-mode full text information retrieval method, system and storage medium

Info

Publication number
CN116644228A
Authority
CN
China
Prior art keywords
file
information retrieval
text
text information
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310474800.6A
Other languages
Chinese (zh)
Inventor
刘兆武
冯漪
凌霏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Craftsman Network Technology Co ltd
Original Assignee
Shenzhen Craftsman Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2023-04-26
Filing date
2023-04-26
Publication date
2023-08-25
Application filed by Shenzhen Craftsman Network Technology Co ltd
Priority to CN202310474800.6A
Publication of CN116644228A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a multi-mode full text information retrieval method, a system and a storage medium, wherein the method comprises the following steps: obtaining files of different types to be managed, wherein the types of the files comprise one or more of texts, pictures, audio or video; judging the type of the file; identifying the file by adopting a corresponding identification strategy according to the type of the file; and outputting the identification result in a text form for subsequent data processing and analysis. The invention realizes quick and accurate retrieval of various types of files, realizes more efficient searching and management of various types of files, improves the efficiency of automatic management of files, and reduces the operation cost of enterprises.

Description

Multi-mode full text information retrieval method, system and storage medium
Technical Field
The present invention relates to the field of full text retrieval for multi-modal content, and in particular, to a method, a system, and a storage medium for retrieving multi-modal full text information.
Background
Full text retrieval is a technique for finding a particular word or phrase in a collection of documents. It is a key technology of the digital information age for rapidly searching large amounts of text content. Even when not all keywords are known, full text retrieval can help locate the desired information quickly.
Full text retrieval is currently used mainly in the following common scenarios:
the search function of an e-commerce platform, which helps users quickly find the commodities they need;
the article search function of a news media website, which lets a user find all relevant news by keyword;
and the search function of a social media platform, which lets a user find all relevant content, such as users, posts and comments, by keyword.
The full text retrieval systems currently on the market are built around text.
Full text retrieval technology is mature, and the systems currently on the market can process many kinds of text data effectively. However, as the information age develops, the amount of data people generate keeps growing and the types of data become more diverse. Retrieval of non-text content in such data, for example video, audio and images, remains difficult at the present stage and still requires manual processing and searching.
Disclosure of Invention
The invention mainly aims to provide a multi-mode full-text information retrieval method, a system and a storage medium, which aim to quickly and accurately retrieve various types of files, realize more efficient searching and management of the various types of files, improve the efficiency of automatic management of the files and reduce the operation cost of enterprises.
In order to achieve the above object, the present invention provides a multi-modal full text information retrieval method, the method comprising the steps of:
step S10, obtaining files of different types to be managed, wherein the types of the files comprise one or more of texts, pictures, audio or videos;
step S20, judging the type of the file;
step S30, identifying the file by adopting a corresponding identification strategy according to the type of the file;
and step S40, outputting the identification result in a text form for subsequent data processing and analysis.
According to a further technical scheme of the present invention, step S30, identifying the file by adopting a corresponding identification strategy according to the type of the file, includes:
step S301, if the file is an audio or video file, performing speech content recognition on the file using an automatic speech recognition (ASR) technique.
In a further technical scheme of the present invention, step S301, performing speech content recognition on an audio or video file using an ASR technique, includes:
step S3011, voice data preprocessing: using the open source tool ffmpeg to convert the audio or video file into a uniform mono audio data file with a 16 kHz sample rate;
step S3012, feature extraction: converting the preprocessed voice signal into feature vectors using MFCC (Mel-frequency cepstral coefficient) feature extraction;
step S3013, recognition: decoding the feature vector sequence with a deep neural network acoustic model and a neural network language model to find the most suitable text sequence, which is the identification result.
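As an illustration of step S3011, the following is a minimal sketch of how the ffmpeg call might be assembled. The source names only ffmpeg, the 16 kHz sample rate and mono output; the function name `build_ffmpeg_command`, the `normalized` output directory and the WAV container are assumptions made for this example.

```python
from pathlib import Path


def build_ffmpeg_command(source: str, target_dir: str = "normalized") -> list[str]:
    """Build an ffmpeg call that yields 16 kHz mono audio.

    target_dir and the .wav container are assumptions of this sketch.
    """
    target = Path(target_dir) / (Path(source).stem + ".wav")
    return [
        "ffmpeg",
        "-y",            # overwrite an existing output file
        "-i", source,    # input audio or video file
        "-vn",           # drop any video stream
        "-ac", "1",      # downmix to a single (mono) channel
        "-ar", "16000",  # resample to 16 kHz
        str(target),
    ]


cmd = build_ffmpeg_command("meeting.mp4")
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed and the output directory exists.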
According to a further technical scheme of the present invention, step S30, identifying the file by adopting a corresponding identification strategy according to the type of the file, includes:
step S302, if the file is a document file that is neither text nor a picture, converting the document file into a picture file;
step S303, extracting the text content of the picture and the position of the text in the picture using OCR technology.
In a further technical scheme of the present invention, step S303, extracting the text content of the picture and the position of the text in the picture using OCR technology, includes:
step S3031, image preprocessing: converting the original image into a form suitable for feature extraction and character recognition by applying light correction, noise removal, convolution smoothing and image binarization;
step S3032, character segmentation: in the preprocessed image, segmenting each character out of continuous runs of letters using a histogram projection algorithm, so as to improve character recognition accuracy;
step S3033, feature extraction: extracting useful features from the preprocessed character images using the Canny edge detection algorithm, the features representing the shape, outline and boundary information of each character;
step S3034, character recognition: converting the extracted features into computer-processable feature vectors and identifying the characters from those vectors using a convolutional neural network deep learning architecture.
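Two of the preprocessing operations in step S3031, convolution smoothing and binarization, can be sketched in plain Python as follows. The source does not give a kernel or a threshold, so the 3x3 mean filter and the fixed threshold of 128 are assumptions; light correction is omitted from this sketch.

```python
def box_smooth(image):
    """3x3 mean filter (a simple convolution smoothing).

    Border pixels are averaged over the neighbours that exist.
    """
    height, width = len(image), len(image[0])
    smoothed = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(sum(window) / len(window))
        smoothed.append(row)
    return smoothed


def binarize(image, threshold=128):
    """Map dark pixels (ink) to 1 and light pixels (background) to 0."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


# Hypothetical 5x7 grayscale patch: a three-pixel-wide dark stroke
# in columns 2..4 plus one isolated dark noise pixel at row 2, column 0.
patch = [[255, 255, 0, 0, 0, 255, 255] for _ in range(5)]
patch[2][0] = 0

clean = binarize(box_smooth(patch))
for row in clean:
    print(row)  # each row: [0, 0, 1, 1, 1, 0, 0]
```

Smoothing before thresholding is what removes the isolated noise pixel here: its neighbourhood average stays above the threshold, while the stroke's does not.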
In a further technical scheme of the invention, in step S30, if the file is a picture file, step S303 is executed directly, and OCR technology is used to extract the text content of the picture and the position of the text in the picture.
According to a further technical scheme of the invention, the step of outputting the identification result in text form for subsequent data processing and analysis comprises: uploading the identification result in text form to Elasticsearch for storage, so that the system can retrieve it.
According to a further technical scheme of the invention, the related search word recommendation and risk search word recommendation functions on the platform are implemented using Elasticsearch's built-in retrieval algorithm, which is the BM25 algorithm.
To achieve the above object, the present invention also proposes a multimodal full text information retrieval system comprising a memory, a processor and a multimodal full text information retrieval program stored in the memory, which program, when run by the processor, performs the steps of the method as described above.
To achieve the above object, the present invention also proposes a computer-readable storage medium storing a multi-modal full-text information retrieval program which, when executed by a processor, performs the steps of the method as described above.
The multi-mode full-text information retrieval method, system and storage medium of the invention have the following beneficial effects. The method comprises: step S10, obtaining files of different types to be managed, wherein the types of the files comprise one or more of texts, pictures, audio or videos; step S20, judging the type of the file; step S30, identifying the file by adopting a corresponding identification strategy according to the type of the file; and step S40, outputting the identification result in text form for subsequent data processing and analysis. With this technical scheme, files of various types can be retrieved quickly and accurately and searched and managed more efficiently, which improves the efficiency of automatic file management and reduces the operating cost of enterprises.
Drawings
FIG. 1 is a flowchart of a multi-modal full-text information retrieval method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a second embodiment of a multi-modal full-text information retrieval method according to the present invention;
FIG. 3 is a schematic diagram of a refinement flow of step S301 in FIG. 2;
FIG. 4 is a flowchart of a third embodiment of a multi-modal full-text information retrieval method according to the present invention;
FIG. 5 is a schematic diagram of a refinement flow of step S303 in FIG. 4.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, the present invention provides a multi-modal full-text information retrieval method, and a first embodiment of the multi-modal full-text information retrieval method of the present invention includes the following steps:
Step S10, obtaining files of different types to be managed, wherein the types of the files comprise one or more of texts, pictures, audio or videos.
Step S20, judging the type of the file.
Step S30, identifying the file by adopting a corresponding identification strategy according to the type of the file.
In this embodiment, text files are read in full in a conventional manner known in the prior art. For audio or video files, ASR technology is used to recognize the speech content, producing results with relative timestamps. For picture files, OCR technology is used to extract the text content and the position of the text in the picture. Document files that are neither text nor pictures are first converted into pictures, and their content is then recognized in the same way as pictures to obtain a result.
Step S40, outputting the identification result in text form for subsequent data processing and analysis.
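Steps S20 and S30 amount to classifying a file and then choosing a recognition strategy for it. The sketch below shows one way this dispatch could look. The source does not say how the type is judged, so the extension table, the function names and the strategy labels are all hypothetical.

```python
from pathlib import Path

# Hypothetical extension table; the source does not specify how the type is judged.
EXTENSION_TYPES = {
    ".txt": "text", ".md": "text",
    ".png": "picture", ".jpg": "picture", ".jpeg": "picture",
    ".mp3": "audio", ".wav": "audio",
    ".mp4": "video", ".mkv": "video",
}


def judge_type(filename: str) -> str:
    """Step S20: classify a file; anything unknown is treated as a document."""
    return EXTENSION_TYPES.get(Path(filename).suffix.lower(), "document")


def choose_strategy(file_type: str) -> str:
    """Step S30: name the recognition strategy for each file type."""
    if file_type == "text":
        return "read_text"
    if file_type in ("audio", "video"):
        return "asr"
    if file_type == "picture":
        return "ocr"
    # Documents that are neither text nor pictures are rendered to pictures first.
    return "convert_to_picture_then_ocr"


for name in ["notes.txt", "talk.MP4", "scan.png", "report.pdf"]:
    print(name, "->", choose_strategy(judge_type(name)))
```

A production system would more likely inspect file contents than trust extensions; the table is kept only to make the control flow visible.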
After the file content has been successfully identified, it is uploaded to a corresponding data storage system for subsequent processing. In this embodiment, the identification result of the file is uploaded to Elasticsearch for storage, so that the system can retrieve it conveniently.
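To make the storage step concrete, the sketch below builds a request body in the newline-delimited JSON format accepted by the Elasticsearch `_bulk` endpoint. It only constructs the payload and does not contact a server. The index name `files` and the field names `file_name`, `file_type` and `content` are assumptions, since the source does not describe a schema.

```python
import json


def build_bulk_payload(index_name, results):
    """Return an NDJSON body: one action line, then one source line, per document."""
    lines = []
    for doc_id, doc in results.items():
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical recognition results keyed by document id.
results = {
    "1": {"file_name": "talk.mp4", "file_type": "video", "content": "welcome to the quarterly review"},
    "2": {"file_name": "scan.png", "file_type": "picture", "content": "invoice number 42"},
}

payload = build_bulk_payload("files", results)
print(payload)
```

Storing the recognized text in a single `content` field is what lets one text query cover audio, video, picture and document files alike.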
In addition, this embodiment mainly uses Elasticsearch's built-in retrieval algorithm to implement functions such as related search word recommendation and risk search word recommendation on the platform. The main algorithm used is BM25 (Best Matching 25), which assigns each document a score indicating its relevance to the query, computed from term frequency and document frequency, and ranks the documents by that score.
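The following is a minimal sketch of BM25 scoring, written from the textbook formula rather than taken from Elasticsearch's code. The values k1 = 1.2 and b = 0.75 match Elasticsearch's documented defaults; the whitespace tokenizer and the sample documents are assumptions of this example.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Score every document against the query with the classic BM25 formula."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            n_t = doc_freq[term]
            idf = math.log(1 + (n_docs - n_t + 0.5) / (n_t + 0.5))
            tf = term_freq[term]
            # Length normalization: longer-than-average documents are penalized.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "contract renewal invoice",
    "meeting notes about the quarterly invoice review",
    "holiday photo album",
]
scores = bm25_scores("invoice", docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)  # [0, 1, 2]
```

Both of the first two documents contain the query term once, but the shorter one ranks higher because of length normalization; the third document does not match and scores zero.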
With the multi-mode full-text information retrieval method provided by this embodiment, a user only needs to upload the various types of files to be managed to the multi-mode full-text information retrieval system. Even if the user has accumulated a large number of documents, pictures, videos and other related files, the required file can be found quickly and accurately among them, so that files are searched and managed more efficiently.
With the advent of the digital age, the volume of documents of all kinds will grow exponentially. The multi-mode full-text information retrieval method provided by this embodiment can effectively improve the efficiency of automatic file management, reduce the operating cost of enterprises and reduce users' manual input workload, making work easier and more efficient.
Further, referring to FIG. 2, a second embodiment of the multi-mode full text information retrieval method of the present invention is provided on the basis of the first embodiment shown in FIG. 1. This embodiment differs from the first embodiment in that step S30, identifying the file by adopting a corresponding identification strategy according to the type of the file, includes:
Step S301, if the file is an audio or video file, performing speech content recognition on the file using ASR technology.
Specifically, as shown in FIG. 3, in this embodiment step S301, performing speech content recognition on an audio or video file using ASR technology, includes:
Step S3011, voice data preprocessing: the open source tool ffmpeg is used to convert the audio or video file into a uniform mono audio data file with a 16 kHz sample rate.
Step S3012, feature extraction: the preprocessed speech signal is converted into feature vectors using MFCC feature extraction.
Step S3013, recognition: the feature vector sequence is decoded with a deep neural network (DNN) acoustic model and a neural network language model (NNLM) to find the most suitable text sequence, which is the identification result.
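The MFCC feature extraction of step S3012 can be illustrated for a single frame with the pure-Python sketch below. It follows the usual recipe (Hamming window, power spectrum, triangular mel filterbank, logarithm, DCT-II), but the 25 ms frame, 26 filters and 13 coefficients are conventional values assumed for the example, not parameters stated in the source, and the naive DFT is far slower than the FFT a real system would use.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame and return |DFT|^2 / N for bins 0..N/2."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    points = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * hz / sample_rate)) for hz in points]
    bank = []
    for m in range(1, n_filters + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge of the triangle
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge of the triangle
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate=16000, n_filters=26, n_coeffs=13):
    """Return the first n_coeffs MFCCs of one frame."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_energies = [math.log(sum(w * p for w, p in zip(row, spectrum)) + 1e-10)
                    for row in bank]
    # DCT-II decorrelates the log mel energies.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_energies))
            for c in range(n_coeffs)]


# One 25 ms frame (400 samples at 16 kHz) of a synthetic 440 Hz tone.
frame = [math.sin(2 * math.pi * 440 * i / 16000) for i in range(400)]
coeffs = mfcc(frame)
print(len(coeffs))  # 13
```

In a full pipeline this function would be applied to overlapping frames of the 16 kHz mono signal, yielding the feature vector sequence that the acoustic model consumes.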
Further, referring to FIG. 4, a third embodiment of the multi-modal full-text information retrieval method of the present invention is provided on the basis of the first embodiment shown in FIG. 1. This embodiment differs from the first embodiment in that step S30, identifying the file by adopting a corresponding identification strategy according to the type of the file, includes:
Step S302, if the file is a document file that is neither text nor a picture, converting the document file into a picture file;
Step S303, extracting the text content of the picture and the position of the text in the picture using OCR technology.
Specifically, referring to FIG. 5, step S303, extracting the text content of the picture and the position of the text in the picture using OCR technology, includes:
Step S3031, image preprocessing: the original image is converted into a form suitable for feature extraction and character recognition by applying light correction, noise removal, convolution smoothing and image binarization.
Step S3032, character segmentation: in the preprocessed image, each character is segmented out of continuous runs of letters using a histogram projection algorithm, to improve the accuracy of character recognition.
Step S3033, feature extraction: useful features are extracted from the preprocessed character images using the Canny edge detection algorithm; these features represent the shape, outline and boundary information of each character.
Step S3034, character recognition: the extracted features are converted into computer-processable feature vectors, and the characters are identified from those vectors using a convolutional neural network (CNN) deep learning architecture.
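The histogram projection idea behind step S3032 can be sketched as follows: sum the ink pixels in each column of a binarized text line and cut wherever the sum drops to zero. The function name and the toy image are assumptions, and this simple version cannot separate characters that touch each other.

```python
def segment_columns(binary_image):
    """Split a binarized text line into character spans via a vertical projection.

    Returns (start, end) column pairs with end exclusive; 1 means ink, 0 background.
    """
    width = len(binary_image[0])
    # Vertical projection histogram: amount of ink in each column.
    projection = [sum(row[x] for row in binary_image) for x in range(width)]

    spans, start = [], None
    for x, ink in enumerate(projection):
        if ink > 0 and start is None:
            start = x                      # a character begins
        elif ink == 0 and start is not None:
            spans.append((start, x))       # a blank column ends the character
            start = None
    if start is not None:
        spans.append((start, width))       # character touching the right edge
    return spans


# Hypothetical 3x8 binarized line containing three glyphs.
line = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 1, 0, 1, 1],
]
print(segment_columns(line))  # [(0, 2), (4, 5), (6, 8)]
```

Each returned span can be cropped out of the line image and passed on to feature extraction and recognition as a single character.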
Referring to FIG. 4, in this embodiment, if the file is judged in step S30 to be a picture file, step S303 is executed directly, and OCR technology is used to extract the text content of the picture and the position of the text in the picture.
The multi-mode full-text information retrieval method of the invention has the following beneficial effects. The method comprises: step S10, obtaining files of different types to be managed, wherein the types of the files comprise one or more of texts, pictures, audio or videos; step S20, judging the type of the file; step S30, identifying the file by adopting a corresponding identification strategy according to the type of the file; and step S40, outputting the identification result in text form for subsequent data processing and analysis. With this technical scheme, files of various types can be retrieved quickly and accurately and searched and managed more efficiently, which improves the efficiency of automatic file management and reduces the operating cost of enterprises.
In order to achieve the above objective, the present invention further provides a multi-modal full text information retrieval system, where the system includes a memory, a processor, and a multi-modal full text information retrieval program stored in the memory. When the program is run by the processor, the steps of the method described in the above embodiments are performed; they are not repeated here.
To achieve the above objective, the present invention further provides a computer readable storage medium storing a multi-modal full-text information retrieval program. When the program is executed by a processor, the steps of the method described in the above embodiments are performed; they are not repeated here.
The foregoing describes only the preferred embodiments of the present invention and is not intended to limit its scope. All equivalent structures or equivalent processes derived from the description and drawings of the present invention, or applied directly or indirectly in other related technical fields, likewise fall within the scope of the invention.

Claims (10)

1. A multi-modal full text information retrieval method, the method comprising the steps of:
step S10, obtaining files of different types to be managed, wherein the types of the files comprise one or more of texts, pictures, audio or videos;
step S20, judging the type of the file;
step S30, identifying the file by adopting a corresponding identification strategy according to the type of the file;
and step S40, outputting the identification result in a text form for subsequent data processing and analysis.
2. A multi-modal full-text information retrieval method as set forth in claim 1, wherein step S30, identifying the file by adopting a corresponding identification strategy according to the type of the file, includes:
step S301, if the file is an audio or video file, performing speech content recognition on the file using an ASR technique.
3. A multimodal full text information retrieval method as claimed in claim 2, wherein step S301, performing speech content recognition on an audio or video file using an ASR technique, comprises:
step S3011, voice data preprocessing: using the open source tool ffmpeg to convert the audio or video file into a uniform mono audio data file with a 16 kHz sample rate;
step S3012, feature extraction: converting the preprocessed voice signal into feature vectors using MFCC feature extraction;
step S3013, recognition: decoding the feature vector sequence with a deep neural network acoustic model and a neural network language model to find the most suitable text sequence, which is the identification result.
4. A multi-modal full-text information retrieval method as set forth in claim 1, wherein step S30, identifying the file by adopting a corresponding identification strategy according to the type of the file, includes:
step S302, if the file is a document file that is neither text nor a picture, converting the document file into a picture file;
step S303, extracting the text content of the picture and the position of the text in the picture using OCR technology.
5. The method of claim 4, wherein step S303, extracting the text content of the picture and the position of the text in the picture using OCR technology, comprises:
step S3031, image preprocessing: converting the original image into a form suitable for feature extraction and character recognition by applying light correction, noise removal, convolution smoothing and image binarization;
step S3032, character segmentation: in the preprocessed image, segmenting each character out of continuous runs of letters using a histogram projection algorithm, so as to improve character recognition accuracy;
step S3033, feature extraction: extracting useful features from the preprocessed character images using the Canny edge detection algorithm, the features representing the shape, outline and boundary information of each character;
step S3034, character recognition: converting the extracted features into computer-processable feature vectors and identifying the characters from those vectors using a convolutional neural network deep learning architecture.
6. The method of claim 5, wherein in step S30, identifying the file by adopting a corresponding identification strategy according to the type of the file, if the file is a picture file, step S303 is executed directly, and OCR technology is used to extract the text content of the picture and the position of the text in the picture.
7. A multimodal full text information retrieval method as claimed in any one of claims 1 to 6, wherein the step of outputting the identification result in text form for subsequent data processing and analysis includes: uploading the identification result in text form to Elasticsearch for storage, so that the system can retrieve it.
8. The multi-modal full-text information retrieval method according to claim 7, wherein the related search word recommendation and risk search word recommendation functions on the platform are implemented using Elasticsearch's built-in retrieval algorithm, which is the BM25 algorithm.
9. A multimodal full text information retrieval system, the system comprising a memory, a processor, and a multimodal full text information retrieval program stored in the memory, which when executed by the processor performs the steps of the method of any of claims 1 to 8.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores a multimodal full text information retrieval program which, when executed by a processor, performs the steps of the method according to any of claims 1 to 8.
CN202310474800.6A 2023-04-26 2023-04-26 Multi-mode full text information retrieval method, system and storage medium Pending CN116644228A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310474800.6A CN116644228A (en) 2023-04-26 2023-04-26 Multi-mode full text information retrieval method, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310474800.6A CN116644228A (en) 2023-04-26 2023-04-26 Multi-mode full text information retrieval method, system and storage medium

Publications (1)

Publication Number Publication Date
CN116644228A true CN116644228A (en) 2023-08-25

Family

ID=87619405

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310474800.6A Pending CN116644228A (en) 2023-04-26 2023-04-26 Multi-mode full text information retrieval method, system and storage medium

Country Status (1)

Country Link
CN (1) CN116644228A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117033308A (en) * 2023-08-28 2023-11-10 中国电子科技集团公司第十五研究所 Multi-mode retrieval method and device based on specific range
CN117033308B (en) * 2023-08-28 2024-03-26 中国电子科技集团公司第十五研究所 Multi-mode retrieval method and device based on specific range

Similar Documents

Publication Publication Date Title
CN107480200B (en) Word labeling method, device, server and storage medium based on word labels
US10572528B2 (en) System and method for automatic detection and clustering of articles using multimedia information
Zagoris et al. A document image retrieval system
CN1748213A (en) Method and apparatus for content representation and retrieval in concept model space
US7739110B2 (en) Multimedia data management by speech recognizer annotation
WO2023065617A1 (en) Cross-modal retrieval system and method based on pre-training model and recall and ranking
CN111276149B (en) Voice recognition method, device, equipment and readable storage medium
CN112004164B (en) Automatic video poster generation method
CN114416979A (en) Text query method, text query equipment and storage medium
CN116644228A (en) Multi-mode full text information retrieval method, system and storage medium
CN111291168A (en) Book retrieval method and device and readable storage medium
CN117010500A (en) Visual knowledge reasoning question-answering method based on multi-source heterogeneous knowledge joint enhancement
CN110795942A (en) Keyword determination method and device based on semantic recognition and storage medium
CN116881463B (en) Artistic multi-mode corpus construction system based on data
CN109684357B (en) Information processing method and device, storage medium and terminal
KR101800975B1 (en) Sharing method and apparatus of the handwriting recognition is generated electronic documents
CN115203474A (en) Automatic database classification and extraction technology
CN114780757A (en) Short media label extraction method and device, computer equipment and storage medium
CN113743352A (en) Method and device for comparing similarity of video contents
CN108882033B (en) Character recognition method, device, equipment and medium based on video voice
CN113297485A (en) Method for generating cross-modal representation vector and cross-modal recommendation method
CN114764437A (en) User intention identification method and device and electronic equipment
Khollam et al. A survey on content based lecture video retrieval using speech and video text information
CN110717091B (en) Entry data expansion method and device based on face recognition
CN117493645B (en) Big data-based electronic archive recommendation system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination