CN110750501A - File retrieval method and device, storage medium and related equipment - Google Patents


Info

Publication number
CN110750501A
CN110750501A
Authority
CN
China
Prior art keywords
file
whiteboard
retrieval
data
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910989351.2A
Other languages
Chinese (zh)
Inventor
吴诗乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Guangzhou Shirui Electronics Co Ltd
Original Assignee
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Guangzhou Shirui Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Shiyuan Electronics Thecnology Co Ltd, Guangzhou Shirui Electronics Co Ltd filed Critical Guangzhou Shiyuan Electronics Thecnology Co Ltd
Priority to CN201910989351.2A priority Critical patent/CN110750501A/en
Publication of CN110750501A publication Critical patent/CN110750501A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Library & Information Science (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The embodiment of the application discloses a file retrieval method and device, a storage medium and related equipment, belonging to the technical field of intelligent interactive tablets. The method comprises: receiving a retrieval condition input by a user, wherein the retrieval condition corresponds to the file content of a target whiteboard file; processing at least one whiteboard file to obtain the file content of each whiteboard file; and matching the retrieval condition with the file content of each whiteboard file, and taking the successfully matched whiteboard file as the target whiteboard file. In this way, the whiteboard file required by the user can be found quickly, which solves the technical problem in the related art that retrieving a whiteboard file by file name or creation time has low accuracy.

Description

File retrieval method and device, storage medium and related equipment
Technical Field
The application relates to the field of intelligent interactive panels, in particular to a file retrieval method and device, a storage medium and related equipment.
Background
After a user writes and draws content in an electronic whiteboard file, the whiteboard file can be saved, and the user generally names the file with a time or a brief description. When the user later needs to find a file, the current technical scheme matches the retrieval condition against information such as the file name and the file creation time.
However, when the user has saved a large number of whiteboard files and needs to open a particular one, the user has often forgotten the file name or the file creation time, and cannot quickly find the whiteboard file by the file name alone.
For the problem in the related art of low accuracy when retrieving a whiteboard file by file name or creation time, no effective solution has yet been proposed.
Disclosure of Invention
The embodiment of the application provides a file retrieval method and device, a storage medium and related equipment, aiming at least to solve the technical problem in the related art of low accuracy when retrieving a whiteboard file by file name or creation time.
According to a first aspect of embodiments of the present application, there is provided a file retrieval method, including: receiving a retrieval condition input by a user, wherein the retrieval condition corresponds to the file content of a target whiteboard file; processing at least one whiteboard file to obtain the file content of each whiteboard file; and matching the retrieval conditions with the file content of each whiteboard file, and acquiring the whiteboard files successfully matched to obtain the target whiteboard file.
Optionally, processing at least one whiteboard file, and obtaining the file content of each whiteboard file includes: analyzing each whiteboard file to obtain handwriting data stored in the whiteboard file, wherein the handwriting data at least comprises: dot data of handwriting; and performing content identification on the point data to obtain the file content.
Optionally, analyzing each whiteboard file to obtain handwriting data stored in the whiteboard file includes: reading the data of the target tag in the whiteboard file; and analyzing the data of the target label to obtain an attribute value of the target attribute in the target label to obtain the handwriting data.
Optionally, performing content identification on the point data, and obtaining the file content includes: dividing the point data to obtain a point data set; inputting the point data set into an identification model for identification to obtain the file content, wherein the identification model is obtained by using multiple groups of data through machine learning training, and each group of data in the multiple groups of data comprises: the point data and the contents corresponding to the point data.
Optionally, the handwriting data further comprises: region data of the handwriting, wherein dividing the point data to obtain the point data set comprises: acquiring the distance between adjacent handwriting based on the region data; and in the case that the distance is smaller than or equal to a preset distance, determining that the point data of the adjacent handwriting belong to the same point data set.
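The distance-based division described above can be sketched as follows. This is an illustrative sketch only: it assumes each stroke's region data is an axis-aligned bounding box and that strokes are processed in order; the class and method names are not from the patent.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: group strokes into point-data sets when the gap
// between adjacent strokes' bounding boxes is at most a preset distance.
class DistanceGrouper {
    // Region data of one stroke, assumed to be an axis-aligned bounding box.
    static class Box {
        final double left, top, right, bottom;
        Box(double l, double t, double r, double b) { left = l; top = t; right = r; bottom = b; }
    }

    // Gap between two boxes: 0 if they overlap, otherwise the Euclidean
    // distance between their nearest edges.
    static double gap(Box a, Box b) {
        double dx = Math.max(0, Math.max(b.left - a.right, a.left - b.right));
        double dy = Math.max(0, Math.max(b.top - a.bottom, a.top - b.bottom));
        return Math.hypot(dx, dy);
    }

    // Adjacent strokes whose gap is <= maxGap fall into the same point-data
    // set; the returned lists hold stroke indices.
    static List<List<Integer>> group(List<Box> strokes, double maxGap) {
        List<List<Integer>> sets = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        for (int i = 0; i < strokes.size(); i++) {
            if (!current.isEmpty() && gap(strokes.get(i - 1), strokes.get(i)) > maxGap) {
                sets.add(current);
                current = new ArrayList<>();
            }
            current.add(i);
        }
        if (!current.isEmpty()) sets.add(current);
        return sets;
    }
}
```

With a preset distance of 5, two strokes 2 pixels apart land in one set, while a stroke 80 pixels away starts a new set.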
Optionally, the handwriting data further comprises: the time stamp of the handwriting, wherein the dividing of the point data to obtain the point data set comprises: acquiring a time interval between adjacent handwriting based on the timestamp; and under the condition that the time interval is less than or equal to a preset interval, determining that the point data of the adjacent handwriting belongs to the same point data set.
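The timestamp-based division can be sketched the same way. Again an illustrative sketch, not the patent's implementation: strokes written within the preset interval of each other are assumed to belong to the same character or drawing.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: adjacent strokes whose timestamp gap is at most
// maxIntervalMs are placed into the same point-data set.
class TimestampGrouper {
    static List<List<Integer>> group(long[] timestamps, long maxIntervalMs) {
        List<List<Integer>> sets = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        for (int i = 0; i < timestamps.length; i++) {
            if (!current.isEmpty() && timestamps[i] - timestamps[i - 1] > maxIntervalMs) {
                sets.add(current);
                current = new ArrayList<>();
            }
            current.add(i);
        }
        if (!current.isEmpty()) sets.add(current);
        return sets;
    }
}
```

For example, with a preset interval of one second, strokes at 0 ms, 300 ms and 700 ms form one set, and strokes at 5000 ms and 5200 ms form another.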
Optionally, performing content identification on the point data, and obtaining the file content includes: drawing the point data into a picture; and carrying out optical character recognition on the picture to obtain the file content.
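A minimal sketch of the drawing step, assuming each stroke is an ordered array of pixel coordinates. The subsequent optical character recognition would be performed on the resulting image by an external OCR engine and is not shown here.

```java
import java.awt.BasicStroke;
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Illustrative sketch: render stroke point data into a bitmap so that an
// OCR engine can later recognize the written content from the picture.
class StrokeRasterizer {
    // strokes: each stroke is an array of {x, y} points in pixel coordinates.
    static BufferedImage draw(int width, int height, int[][][] strokes) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        g.setColor(Color.WHITE);                 // white background, like a whiteboard
        g.fillRect(0, 0, width, height);
        g.setColor(Color.BLACK);                 // black ink
        g.setStroke(new BasicStroke(2));
        for (int[][] stroke : strokes)
            for (int i = 1; i < stroke.length; i++)
                g.drawLine(stroke[i - 1][0], stroke[i - 1][1], stroke[i][0], stroke[i][1]);
        g.dispose();
        return img;
    }
}
```

The resulting BufferedImage can then be encoded (e.g. as PNG) and handed to whatever OCR engine the system uses.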
Optionally, the retrieval condition for receiving the user input includes at least one of: receiving text data input by a user to obtain the retrieval condition; and receiving voice data input by a user, and identifying the voice data to obtain the retrieval condition.
Optionally, in a case that the file content of the target whiteboard file contains a plurality of elements, the retrieval condition contains at least one element.
According to a second aspect of the embodiments of the present application, there is provided a file retrieval method, including: displaying a retrieval interface, wherein the retrieval interface comprises: an input control for inputting a retrieval condition; receiving a retrieval condition input in the input control, wherein the retrieval condition corresponds to the file content of the target whiteboard file; processing at least one whiteboard file to obtain the file content of each whiteboard file; matching the retrieval conditions with the file content of each whiteboard file, and acquiring the whiteboard files successfully matched to obtain the target whiteboard file; and displaying the target whiteboard file.
Optionally, processing at least one whiteboard file, and obtaining the file content of each whiteboard file includes: analyzing each whiteboard file to obtain handwriting data stored in the whiteboard file, wherein the handwriting data at least comprises: dot data of handwriting; and performing content identification on the point data to obtain the file content.
Optionally, analyzing each whiteboard file to obtain handwriting data stored in the whiteboard file includes: reading the data of the target tag in the whiteboard file; and analyzing the data of the target label to obtain an attribute value of the target attribute in the target label to obtain the handwriting data.
Optionally, performing content identification on the point data, and obtaining the file content includes: dividing the point data to obtain a point data set; inputting the point data set into an identification model for identification to obtain the file content, wherein the identification model is obtained by using multiple groups of data through machine learning training, and each group of data in the multiple groups of data comprises: the point data and the contents corresponding to the point data.
Optionally, performing content identification on the point data, and obtaining the file content includes: drawing the point data into a picture; and carrying out optical character recognition on the picture to obtain the file content.
According to a third aspect of the embodiments of the present application, there is provided a file retrieval method, including: displaying a retrieval interface; under the condition that a retrieval condition is input in an interaction area of the retrieval interface, matching the retrieval condition with the file content of each whiteboard file; under the condition of successful matching, obtaining a target whiteboard file matched with the retrieval condition; and displaying the target whiteboard file.
Optionally, processing at least one whiteboard file, and obtaining the file content of each whiteboard file includes: analyzing each whiteboard file to obtain handwriting data stored in the whiteboard file, wherein the handwriting data at least comprises: dot data of handwriting; and performing content identification on the point data to obtain the file content.
Optionally, analyzing each whiteboard file to obtain handwriting data stored in the whiteboard file includes: reading the data of the target tag in the whiteboard file; and analyzing the data of the target label to obtain an attribute value of the target attribute in the target label to obtain the handwriting data.
Optionally, performing content identification on the point data, and obtaining the file content includes: dividing the point data to obtain a point data set; inputting the point data set into an identification model for identification to obtain the file content, wherein the identification model is obtained by using multiple groups of data through machine learning training, and each group of data in the multiple groups of data comprises: the point data and the contents corresponding to the point data.
Optionally, performing content identification on the point data, and obtaining the file content includes: drawing the point data into a picture; and carrying out optical character recognition on the picture to obtain the file content.
According to a fourth aspect of embodiments of the present application, there is provided a file retrieval apparatus including: the system comprises a condition receiving module, a condition processing module and a condition processing module, wherein the condition receiving module is used for receiving a retrieval condition input by a user, and the retrieval condition corresponds to the file content of a target whiteboard file; the file processing module is used for processing at least one whiteboard file to obtain the file content of each whiteboard file; and the content matching module is used for matching the retrieval conditions with the file content of each whiteboard file and acquiring the whiteboard files successfully matched to obtain the target whiteboard file.
According to a fifth aspect of embodiments of the present application, there is provided a file retrieval apparatus including: the interface display module is used for displaying a retrieval interface, wherein the retrieval interface comprises: an input control for inputting a retrieval condition; the condition receiving module is used for receiving the retrieval condition input in the input control, wherein the retrieval condition corresponds to the file content of the target whiteboard file; the file processing module is used for processing at least one whiteboard file to obtain the file content of each whiteboard file; the content matching module is used for matching the retrieval conditions with the file content of each whiteboard file and acquiring the whiteboard files successfully matched to obtain the target whiteboard file; and the file display module is used for displaying the target whiteboard file.
According to a sixth aspect of embodiments of the present application, there is provided a document retrieval apparatus including: the interface display module is used for displaying a retrieval interface; the content matching module is used for matching the search conditions with the file content of each whiteboard file under the condition that the search conditions are input in the interactive area of the search interface; the file acquisition module is used for acquiring a target whiteboard file matched with the retrieval condition under the condition of successful matching; and the file display module is used for displaying the target whiteboard file.
According to a seventh aspect of embodiments of the present application, there is provided a computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the above-mentioned method steps.
According to an eighth aspect of embodiments of the present application, there is provided an intelligent interactive tablet, including: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the above-mentioned method steps.
In the embodiment of the application, the target whiteboard file corresponding to the retrieval condition can be obtained by matching the retrieval condition input by the user with the file content of all whiteboard files. Because the user's impression of the file content is deep, file retrieval can be achieved by inputting a retrieval condition corresponding to that content even when the user has forgotten the file name or the creation time. This solves the technical problem in the related art of low accuracy when retrieving a whiteboard file by file name or creation time, and achieves the technical effect of improving the efficiency and accuracy of file retrieval.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flow chart of a first document retrieval method according to an embodiment of the present application;
fig. 2 is a schematic diagram of a whiteboard file according to an embodiment of the present application;
FIG. 3 is a schematic diagram illustrating the display effect of a search interface according to an embodiment of the present application;
FIG. 4 is a flow chart of another document retrieval method according to an embodiment of the present application;
fig. 5 is a schematic diagram of a CNN network structure according to an embodiment of the present application;
fig. 6 is a schematic diagram of an RNN network structure according to an embodiment of the present application;
FIG. 7 is a schematic diagram of the relative positions of rectangle 1 and rectangle 2 according to an embodiment of the present application;
fig. 8 is a schematic diagram of whiteboard file division based on regions according to an embodiment of the present application;
FIG. 9 is a schematic diagram of an OCR recognition implementation flow according to an embodiment of the application;
fig. 10 is a schematic diagram of a CTPN implementation flow according to an embodiment of the present application;
fig. 11 is a schematic diagram of a SegLink network architecture according to an embodiment of the present application;
FIG. 12 is a schematic diagram of an EAST implementation method according to an embodiment of the present application;
FIG. 13 is a flow chart of a second method of document retrieval according to an embodiment of the present application;
FIG. 14 is a flow chart of a third method of document retrieval according to an embodiment of the present application;
FIG. 15 is a diagram illustrating a hardware environment for a document retrieval method according to an embodiment of the present application;
FIG. 16 is a schematic diagram illustrating the display effect of a search result according to an embodiment of the present application;
FIG. 17 is a schematic view of a first document retrieval apparatus according to an embodiment of the present application;
FIG. 18 is a schematic view of a second document retrieval apparatus according to an embodiment of the present application;
FIG. 19 is a schematic diagram of a third document retrieval apparatus according to an embodiment of the present application;
fig. 20 is a schematic structural diagram of an intelligent interactive tablet according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments that can be derived by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used are interchangeable under appropriate circumstances, such that the embodiments of the application described herein can be implemented in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. Further, in the description of the present application, "a plurality" means two or more unless otherwise specified. "And/or" describes the association relationship of associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
The intelligent interactive tablet can be an integrated device that controls the content displayed on the display panel and realizes human-computer interaction through touch technology, integrating one or more functions such as a projector, an electronic whiteboard, a projection screen, a sound system, a television and a video conference terminal. The hardware of the intelligent interactive tablet consists of components such as a display module and an intelligent processing system (including a controller) combined in an integral structure, supported by a dedicated software system. The display module includes a display screen and a backlight assembly, where the display screen includes a transparent conductive layer, a liquid crystal layer and the like.
The display screen, in the embodiments of the present specification, refers to a touch screen or touch panel: an inductive liquid crystal display device. When a graphical button on the screen is touched, the tactile feedback system of the screen can drive various connected devices according to a pre-programmed program, thereby replacing a mechanical button panel and creating vivid audio-visual effects through the liquid crystal display. By technical principle, touch screens can be divided into five basic categories: vector pressure sensing, resistive, capacitive, infrared, and surface acoustic wave touch screens. According to the working principle of the touch screen and the medium for transmitting information, touch screens can also be divided into four categories: resistive, capacitive, infrared, and surface acoustic wave.
When a user touches the screen with a finger or a pen, the coordinates of the touch point are located, thereby realizing control of the intelligent processing system, and different functional applications are then realized through the software built into the intelligent processing system.
The "screen" and "large screen" mentioned in this application refer to the display screen of the intelligent interactive tablet; saying that the intelligent interactive tablet displays a certain interface means that the display screen of the intelligent interactive tablet displays that interface.
When a user operates the intelligent interactive tablet, the electronic whiteboard application installed in the intelligent interactive tablet can be used to write and draw content, where the content can be text, graphics, simple sketches and the like. After writing, the user can enter a file name and save the whiteboard file; optionally, the user generally uses a time or a brief description as the file name. The user may open the whiteboard file again after a period of time; if a large number of whiteboard files have been stored, the user may have forgotten the file name and the creation time of the file, and the whiteboard file then cannot be found through the file name. Hence the problem of low accuracy when retrieving a whiteboard file by file name or creation time.
In order to solve the foregoing technical problem, embodiments of the present application provide a file retrieval method and apparatus, a storage medium, and related equipment, where the implementation of the scheme is based on the following observation: when a user searches for a whiteboard file, the user may have forgotten its file name, but still has a certain impression of its contents; for example, the user remembers that a triangle was drawn in the whiteboard file, or that a piece of text was written.
Example 1
In the embodiment of the present application, whiteboard file retrieval is taken as an example to illustrate the specific embodiments:
according to the embodiment of the application, the file retrieval method is applied to the intelligent interactive flat panel.
The following describes in detail the document retrieval method provided in the embodiment of the present application with reference to fig. 1 to 3. As shown in fig. 1, the method comprises the steps of:
step S102, receiving a retrieval condition input by a user, wherein the retrieval condition corresponds to the file content of the target whiteboard file;
the target whiteboard file may be a whiteboard file already stored in the intelligent interactive tablet, the file content may be content of text, simple strokes, graphics, and the like written and drawn in the whiteboard file by the user, and multiple elements may be written in one whiteboard file, for example, a user may draw an element in an electronic whiteboard file as shown in fig. 2, and the file content of the whiteboard file may include three elements, which are respectively text "Hello World", an apple, and a triangle.
In this embodiment of the present application, a user may input a retrieval condition according to the user's impression of the contents of the whiteboard file; that is, the retrieval condition may be any element written or drawn in the target whiteboard file. For example, for the whiteboard file shown in fig. 2, when the user needs to retrieve the file, the user may input "Hello World". In this embodiment, the user may input the retrieval condition in various ways, including but not limited to text input and voice input. In an exemplary embodiment of the present application, when text input is adopted, text data input by the user may be received to obtain the retrieval condition; when voice input is adopted, voice data input by the user may be received and recognized to obtain the corresponding text information as the retrieval condition.
The intelligent interactive tablet provides a retrieval page for a user, an input control is displayed in an interactive area of the retrieval page, and the user can input retrieval conditions by touching the input control. When text input is employed, a text box may be displayed in the interaction area, and the user may enter search text in the text box by touching the text box; when voice input is employed, a voice button may be displayed in the interactive area, and voice is input by long-pressing the voice button. One possible display manner is shown in fig. 3, where a text box and a voice button are displayed in the interactive area of the search page, and the user can select any one manner to input the search condition.
Step S104, processing at least one whiteboard file to obtain the file content of each whiteboard file;
the user can search all the whiteboard files stored in the intelligent interactive flat plate, and as the search condition is any one element in the text content, all the whiteboard files in the intelligent interactive flat plate need to be processed, and the file content of each whiteboard file is analyzed and identified, that is, all the elements contained in each whiteboard file are identified and obtained.
And step S106, matching the retrieval conditions with the file content of each whiteboard file, and acquiring the whiteboard files successfully matched to obtain the target whiteboard file.
Successful matching means that the retrieval condition is the same as any one element contained in the target whiteboard file. In an exemplary embodiment of the present application, in the case that the file content of the target whiteboard file includes a plurality of elements, the retrieval condition includes at least one of those elements; for example, for the whiteboard file shown in fig. 2, the user can find the whiteboard file by inputting any one of the text "Hello World", the apple, or the triangle.
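The matching step above can be sketched as follows. Representing each file's recognized content as a list of element labels (text strings) is an assumption made for illustration, as are the class and method names; the patent does not prescribe this representation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch: a whiteboard file matches when the retrieval
// condition equals any one of its recognized content elements.
class FileMatcher {
    // contentByFile: file name -> recognized element labels of that file.
    // Returns the names of all successfully matched (target) files.
    static List<String> match(Map<String, List<String>> contentByFile, String condition) {
        List<String> hits = new ArrayList<>();
        String needle = condition.trim().toLowerCase();
        for (Map.Entry<String, List<String>> e : contentByFile.entrySet())
            for (String element : e.getValue())
                if (element.trim().toLowerCase().equals(needle)) {
                    hits.add(e.getKey());
                    break; // one hit per file is enough
                }
        return hits;
    }
}
```

Matching is case-insensitive here as a design choice; an exact-equality comparison would also fit the description above.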
In the embodiment of the application, the target whiteboard file corresponding to the retrieval condition can be obtained by matching the retrieval condition input by the user with the file content of all whiteboard files. That is, file retrieval is realized based on file content, rather than on low-information attributes such as the file name or creation time of the whiteboard file. Because the user's impression of the file content is deep, the whiteboard file can be retrieved quickly and with high accuracy.
Example 2
As shown in fig. 4, the method includes the steps of:
step S402, receiving a retrieval condition input by a user, wherein the retrieval condition corresponds to the file content of the target whiteboard file;
before the user searches the file, a search interface is displayed on a display screen of the intelligent interactive tablet, an input control is displayed in an interactive area of a search page, and the user inputs a search condition by touching the input control.
In an exemplary embodiment of the present application, receiving the search condition may be performed as follows:
when text input is used, the receiving process may be a process in which the system reads the parameter value of the input box after the user inputs text in the input box. In the Android system, an EditText control can be used, and when a user clicks a determination button, a getText () function is used to obtain input contents in an input box.
When voice input is adopted, the receiving process may mean that the user long-presses the voice button, generating a touch operation, and the system collects the voice data spoken by the user through a microphone. In the Android system, the touch operation is received and handled as follows: the long-press event on the voice button is monitored through the setOnLongClickListener function; when the long press occurs, the onLongClick callback is triggered, and the system's voice input API is then called:
Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
startActivityForResult(intent, InputResultCode);
The system speech recognition function is then started, and the recognition result is received via the onActivityResult function.
Step S404, each whiteboard file is analyzed to obtain handwriting data stored in the whiteboard file, wherein the handwriting data at least comprises: dot data of handwriting;
in this embodiment, the whiteboard file generally has a certain storage format, for example, the whiteboard file may be in a universal format IWB or a format defined by a whiteboard manufacturer. Taking the universal format IWB as an example, for hand-drawn handwriting, the handwriting is composed of a plurality of points, and therefore, the whiteboard file generally stores point data of the handwriting, that is, coordinate values of each point. In an exemplary embodiment of the present application, the step of parsing each whiteboard file is as follows: reading data of a target label in a whiteboard file; and analyzing the data of the target label to obtain the attribute value of the target attribute in the target label to obtain the handwriting data.
In the whiteboard file, the point data of each handwriting is stored separately, and in order to distinguish the point data of different handwriting, the point data of each handwriting can be stored according to a preset format, and a corresponding target label is set.
In the embodiments of the present application, the description will be given by taking the "svg" label as an example. The dot data for a piece of handwriting is as follows:
[Image in original: the point data of one piece of handwriting, stored under the "svg" label]
Based on the generic format IWB, the target tag is located between "<" and ">", and the data of the target tag follows the tag. The functions for reading target tag data differ between systems; in the Android system, the data of the target tag can be read through the following steps:
According to the defined label and its data type, first create a corresponding bean class, read the file data stream through an InputStream, and finally parse the attributes of the xml label using any of XmlPullParser, SAXParser, DocumentBuilder, and the like.
Taking SAXParser as an example, when a start tag ("<") is encountered, the startElement method is executed and the data can be output through the System.out.println() function; when an end tag (">") is encountered, the endElement method is executed; the attribute name is obtained through the getQName() function and the attribute value through the getValue() function.
For a piece of handwriting, the whiteboard file stores not only the point data of the handwriting, but also the area where the handwriting is located, and the drawing time, color, transparency, width, and the like of the handwriting. In the whiteboard file, different data of the handwriting can be represented through different attributes: for example, a "points" attribute represents the point data, a "bound" attribute represents the area where the handwriting is located, a "time" attribute represents the drawing time, a "stroke" attribute represents the color, a "stroke-opacity" attribute represents the transparency, and a "stroke-width" attribute represents the width. In order to recognize the content drawn by the handwriting, the point data of the handwriting can be extracted, that is, the attribute value of the "points" attribute is obtained from the data of the target label. Any of the three xml-reading approaches above loads the data into the created bean object, from which the attribute value can then be obtained directly through a get method.
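The tag-parsing step described above can be sketched in plain Java with a SAXParser. This is an illustrative sketch, not the patent's implementation: the class name, method name, and the "polyline" tag used in the usage example are assumptions for demonstration; only the "points" attribute extraction mirrors the description above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class PointsExtractor {

    // Returns the value of the "points" attribute for every occurrence
    // of the target tag in the given XML document.
    public static List<String> extractPoints(String xml, String targetTag) {
        final List<String> result = new ArrayList<>();
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes attrs) {
                            // startElement fires for each start tag; read the
                            // attribute value through the Attributes object
                            if (qName.equals(targetTag)) {
                                String points = attrs.getValue("points");
                                if (points != null) {
                                    result.add(points);
                                }
                            }
                        }
                    });
        } catch (Exception e) {
            throw new RuntimeException(e); // surface parse errors unchecked
        }
        return result;
    }
}
```

In a real whiteboard file the same handler would also read the "bound" and "time" attributes needed by the division step later on.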
Step S406, carrying out content identification on the point data to obtain file content;
In this embodiment, there may be two ways to recognize the handwriting: optical character recognition (OCR) and point-data-based recognition. OCR recognition techniques can only recognize characters, while point-data-based recognition techniques can recognize graphics, characters, simple sketches, and the like. The two approaches require different processing of the data.
In an exemplary embodiment of the embodiments of the present application, the identification method based on point data is as follows: dividing point data to obtain a point data set; inputting the point data set into an identification model for identification to obtain file content, wherein the identification model is obtained by using multiple groups of data through machine learning training, and each group of data in the multiple groups of data comprises: the point data and the contents corresponding to the point data.
The principle of the point-data-based recognition method is as follows: point data within one area are correlated. A large amount of handwriting point data can be collected in advance as a data set, and the corresponding content is labeled manually for the point data of each piece of handwriting. One part of the data serves as training data for model training with a deep neural network; the other part serves as test data to test the trained model. Given the point data of a piece of handwriting as input, the resulting recognition model can recognize what the handwriting is.
The deep neural network model may be a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network), or the like. A CNN links adjacent layers through convolution kernels; the same convolution kernel is shared across all point data, and the original positional relationship of the point data is still maintained after the convolution operation. The network structure is shown in fig. 5. In a CNN, the signal of each layer of neurons can only propagate forward to the next layer, and the processing of each sample is independent; in an RNN, by contrast, the output of a neuron can act directly on itself in the next time step. The network structure is shown in fig. 6.
As can be seen from the above, the most important work of the identification method based on point data is to divide the content of a whiteboard, and the dividing manner may be: region-based partitioning, or handwriting-time based partitioning. Optionally, for the area-based division, the acquired handwriting data not only includes the point data, but also includes the area data of the area where the handwriting is located, and the specific implementation steps are as follows: acquiring the distance between adjacent handwriting based on the area data; and under the condition that the distance is smaller than or equal to the preset distance, determining that the point data of the adjacent handwriting belong to the same point data set.
Specifically, the area data may be obtained by reading an attribute value of the "bound" attribute. During the writing process of the handwriting, a user may draw multiple pieces of handwriting for the same content, for example, as shown in fig. 2, the "Hello" may be composed of four pieces of handwriting, and the "World" may also be composed of four pieces of handwriting. The distance between the plurality of handwriting of the same content is short, and the distance between the plurality of handwriting of different content is long. Based on the above principle, for all the acquired region data, the distances between the edges of the regions can be calculated, and the point data with the distance smaller than a certain threshold value is classified into one class (i.e. divided into the same point data set). For the point data of the same class, the data can be input into the recognition model for recognition at one time to obtain corresponding contents. In the embodiment of the present application, the following method may be adopted to calculate the edge distance between each two regions:
as shown in fig. 7, the processing can be performed according to the relative positions of the rectangle 1 and the rectangle 2, specifically as follows: if the two rectangles have the condition of intersection, the distance is 0; if the rectangle 1 is at the upper left corner of the rectangle 2, the distance is the distance between the lower right corner of the rectangle 1 and the upper left corner of the rectangle 2; if the rectangle 1 is at the lower left corner of the rectangle 2, the distance is the distance between the upper right corner of the rectangle 1 and the lower left corner of the rectangle 2; if the rectangle 1 is on the left side of the rectangle 2, the distance is the distance between the right frame of the rectangle 1 and the left frame of the rectangle 2; if the rectangle 1 is at the upper right corner of the rectangle 2, the distance is the distance between the lower left corner of the rectangle 1 and the upper right corner of the rectangle 2; if the rectangle 1 is at the lower right corner of the rectangle 2, the distance is the distance between the upper left corner of the rectangle 1 and the lower right corner of the rectangle 2; if the rectangle 1 is on the right side of the rectangle 2, the distance is the distance between the left frame of the rectangle 1 and the right frame of the rectangle 2; if the rectangle 1 is above the rectangle 2, the distance is the distance between the lower frame of the rectangle 1 and the upper frame of the rectangle 2; if rectangle 1 is below rectangle 2, then the distance is the distance between the upper border of rectangle 1 and the lower border of rectangle 2.
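The nine cases above can be collapsed into a single formula: clamp the horizontal and vertical gaps between the two rectangles to zero and take their Euclidean norm. A minimal plain-Java sketch follows (class, method, and parameter names are illustrative; coordinates assume left < right and top < bottom, as in screen coordinates):

```java
public class RectDistance {

    // Edge-to-edge distance between two axis-aligned rectangles.
    // dx/dy are the horizontal/vertical gaps, clamped to 0 when the
    // projections overlap. The result is 0 iff the rectangles intersect,
    // a pure horizontal or vertical gap when they are side by side, and a
    // corner-to-corner distance when they are diagonal to each other.
    public static double edgeDistance(double left1, double top1, double right1, double bottom1,
                                      double left2, double top2, double right2, double bottom2) {
        double dx = Math.max(0, Math.max(left2 - right1, left1 - right2));
        double dy = Math.max(0, Math.max(top2 - bottom1, top1 - bottom2));
        return Math.hypot(dx, dy);
    }
}
```

With a preset distance of 0, two handwriting regions then fall into the same point data set exactly when their bounding rectangles touch or overlap.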
The preset distance may be a threshold value for dividing the point data; its specific value is determined by analyzing a large amount of handwriting in advance, and may for example be 0. As shown by the solid-line boxes in fig. 8, "Hello" is composed of four regions, and the distances between every two of these regions are all 0, so "|", "-", "|", and "ello" belong to one class (as shown by the dashed-line box in fig. 8); the distance between the "ello" region and the "W" region is greater than 0, so "W" and "ello" are not of the same class.
Further, for the division based on handwriting time, the acquired handwriting data not only contains point data, but also includes a time stamp of the handwriting, and the specific implementation steps are as follows: acquiring a time interval between adjacent handwriting based on the time stamp; and under the condition that the time interval is less than or equal to the preset interval, determining that the point data of the adjacent handwriting belongs to the same point data set.
Specifically, the timestamp can be obtained by reading the attribute value of the "time" attribute. In the process of writing handwriting, a user may draw a plurality of pieces of handwriting for the same content, and the drawing time of each piece of handwriting is different. The drawing time interval of a plurality of chirographs of the same content is shorter, and the drawing time interval of a plurality of chirographs of different contents is longer. Based on the above principle, for all the acquired times, the time intervals of the times can be calculated, and the point data with the time interval smaller than a certain threshold value is classified into one class (i.e. divided into the same point data set).
Similarly, for the point data of the same class, the data can be input into the recognition model for recognition at one time to obtain the corresponding content. The preset interval may be a threshold value for dividing the point data, and a specific value of the preset interval is determined by analyzing a large amount of handwriting in advance.
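The timestamp-based division described above can be sketched as follows in plain Java (class and method names are illustrative assumptions): strokes are assumed to be sorted by drawing time, and consecutive strokes whose timestamps differ by no more than the preset interval are placed in the same point data set.

```java
import java.util.ArrayList;
import java.util.List;

public class TimeGrouping {

    // Groups stroke indices: a new group starts whenever the gap between
    // consecutive timestamps exceeds maxGap (the preset interval, in ms).
    // Timestamps are assumed sorted in drawing order.
    public static List<List<Integer>> groupByTime(long[] timestamps, long maxGap) {
        List<List<Integer>> groups = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        for (int i = 0; i < timestamps.length; i++) {
            if (!current.isEmpty() && timestamps[i] - timestamps[i - 1] > maxGap) {
                groups.add(current);
                current = new ArrayList<>();
            }
            current.add(i);
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }
}
```

Each resulting group of indices identifies the strokes whose point data are fed to the recognition model together.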
In another exemplary embodiment of the embodiments of the present application, the OCR recognition method is as follows: drawing the point data into a picture; and carrying out optical character recognition on the picture to obtain the file content.
OCR is the process of using computer technology to read characters printed or written on paper and convert them into a format that computers and people can understand. The implementation flow of OCR is shown in fig. 9: acquire an image containing the characters to be recognized and analyze its structure; denoise and deskew the image using image processing methods such as threshold operations; divide the text into rows and columns, owing to the special properties of text information, to detect single characters or runs of consecutive characters; and feed the divided character images into a recognition model for processing to obtain the character information in the original image.
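As an illustrative sketch of the threshold operation mentioned in the preprocessing step above (a fixed global threshold; practical OCR pipelines often use adaptive thresholds such as Otsu's method, and the class and method names here are assumptions):

```java
public class Binarize {

    // Global-threshold binarization: pixels at or above the threshold
    // become 1, all others 0. Dark ink on a light background would first
    // be inverted so that ink maps to 1.
    public static int[][] threshold(int[][] gray, int t) {
        int[][] out = new int[gray.length][];
        for (int y = 0; y < gray.length; y++) {
            out[y] = new int[gray[y].length];
            for (int x = 0; x < gray[y].length; x++) {
                out[y][x] = gray[y][x] >= t ? 1 : 0;
            }
        }
        return out;
    }
}
```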
Algorithms commonly used in OCR recognition include CTPN, SegLink, EAST, and the like. CTPN (from "Detecting Text in Natural Image with Connectionist Text Proposal Network") is a text detection method based on a convolutional neural network and a recurrent neural network, whose main aim is to accurately locate text lines in a picture. As shown in fig. 10, feature maps of size W × H × C are obtained with the first 5 convolutional stages of VGG16; a 3 × 3 sliding window extracts features on the feature map obtained in the previous step, and these features are used to predict multiple anchors, where the anchor definition is the same as in Faster R-CNN, i.e. it helps define the target candidate areas; the features obtained in the previous step are input into a bidirectional LSTM, which outputs a W × 256 result that is input into a 512-dimensional fully connected layer (FC); finally, the output obtained through classification or regression is divided into three parts, from top to bottom in the figure: 2k vertical coordinates, representing the height and y-axis center of each proposal box; 2k scores, representing the category information of the k anchors, i.e. whether each box contains a character; and k side-refinement values, representing the horizontal offset of the proposal box. The resulting thin rectangles are then merged into text sequence boxes using a text construction algorithm.
Although the text detection effect of CTPN in natural scenes is good, its detection is based on the horizontal direction, and its effect on non-horizontal text is poor. For such scenarios, the SegLink detection method can be adopted, which realizes multi-angle detection of rotated text; the model mainly detects text through segments and links. As shown in fig. 11, feature extraction is first performed using VGG16 as the backbone, in which the fully connected layers of VGG16 (fc6, fc7) are converted into convolutional layers (conv6, conv7), followed by convolutional layers conv8 to conv11. The sizes from conv4 to conv11 decrease in turn (each layer is 1/2 of the previous layer). This is done in order to perform target detection at multiple scales: a large feature map is good at detecting small objects, whereas a small feature map is good at detecting large objects. By detecting segments and links on 6 feature layers, with the aid of multiple feature maps at different scales, text lines of different sizes can be detected.
The CTPN and SegLink detection methods detect text by predicting proposals (preselected boxes) or segments and then regressing and merging them, so the intermediate pipeline is long. The EAST detection method reduces the pipeline to only two stages, an FCN (fully convolutional network) and NMS (non-maximum suppression); its output supports multi-angle detection of text lines and words, is efficient and accurate, and can adapt to various natural application scenarios, as shown in fig. 12.
Based on the principle of OCR, the input data of OCR are image pixels, so after the point data are acquired, they can be drawn into a picture; different platforms adopt corresponding drawing frameworks. In the Android system, a Canvas can be adopted for drawing, a picture is obtained after drawing is completed, and the point data can be drawn into a Bitmap object through the following procedure:
Bitmap bitmap = Bitmap.createBitmap(width, height, Bitmap.Config.ARGB_8888);
Canvas canvas = new Canvas(bitmap);
Paint paint = new Paint();
Path path = new Path();
path.moveTo(x, y);
// ... path.lineTo(x, y) for each subsequent point of the handwriting
canvas.drawPath(path, paint);
and step S408, matching the retrieval conditions with the file content of each whiteboard file, and acquiring the whiteboard files successfully matched to obtain the target whiteboard file.
Specifically, the file content may be matched in the following manner:
// fileName is the string that needs to be matched
String regex = "(result1|result2|...)"; // regex is the matching rule; the recognition results are result1, result2, etc.
boolean find(String fileName) {
    Pattern p = Pattern.compile(regex);
    return p.matcher(fileName).matches();
}
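It may be worth noting that building a regex out of recognition results can misbehave if a result contains regex metacharacters. A hedged alternative sketch (class and method names are illustrative assumptions) quotes the user's retrieval condition instead and tests whether it occurs anywhere in the recognized content:

```java
import java.util.regex.Pattern;

public class ContentMatcher {

    // True when the retrieval condition occurs anywhere in the recognized
    // file content; Pattern.quote escapes regex metacharacters in the query.
    public static boolean matches(String recognizedContent, String query) {
        return Pattern.compile(Pattern.quote(query)).matcher(recognizedContent).find();
    }
}
```

Using find() rather than matches() gives substring semantics, so a retrieval condition of "World" matches recognized content "Hello World".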
It should be noted that, for the sake of brevity, the present application does not exhaustively enumerate all possible embodiments; any features that are not mutually inconsistent can be freely combined to form alternative embodiments of the present application.
Example 3
According to the embodiment of the application, the file retrieval method is applied to the intelligent interactive tablet. As shown in fig. 13, the method includes the steps of:
step S1202, displaying a search interface, where the search interface includes: an input control for inputting a retrieval condition;
step S1204, receiving a retrieval condition input in the input control, wherein the retrieval condition corresponds to the file content of the target whiteboard file;
step S1206, processing at least one whiteboard file to obtain the file content of each whiteboard file;
in this embodiment, the whiteboard file is processed to obtain the following file contents: analyzing each whiteboard file to obtain handwriting data stored in the whiteboard file, wherein the handwriting data at least comprises: dot data of handwriting; and carrying out content identification on the point data to obtain the file content.
Further, the step of analyzing the whiteboard file to obtain handwriting data includes: reading data of a target label in a whiteboard file; and analyzing the data of the target label to obtain the attribute value of the target attribute in the target label to obtain the handwriting data.
Specifically, one way of performing content identification on point data includes: dividing point data to obtain a point data set; inputting the point data set into an identification model for identification to obtain file content, wherein the identification model is obtained by using multiple groups of data through machine learning training, and each group of data in the multiple groups of data comprises: the point data and the contents corresponding to the point data. Another way of content identification of point data includes: drawing the point data into a picture; and carrying out optical character recognition on the picture to obtain the file content.
Step S1208, matching the retrieval conditions with the file content of each whiteboard file, and acquiring the whiteboard files successfully matched to obtain a target whiteboard file;
step S1210, a target whiteboard file is displayed.
According to the scheme provided by the embodiment, the retrieval condition input by the user is received by displaying the retrieval interface containing the input control, the retrieval condition input by the user is matched with the file contents of all the whiteboard files, and the target whiteboard file corresponding to the retrieval condition can be obtained and displayed, so that the whiteboard file can be rapidly retrieved, and the retrieval accuracy is high.
Example 4
According to the embodiment of the application, the file retrieval method is applied to the intelligent interactive tablet. As shown in fig. 14, the method includes the steps of:
step S1302, displaying a retrieval interface;
step S1304, under the condition that the search condition is input in the interactive area of the search interface, matching the search condition with the file content of each whiteboard file;
in this embodiment, the whiteboard file is processed to obtain the following file contents: analyzing each whiteboard file to obtain handwriting data stored in the whiteboard file, wherein the handwriting data at least comprises: dot data of handwriting; and carrying out content identification on the point data to obtain the file content.
Further, the step of analyzing the whiteboard file to obtain handwriting data includes: reading data of a target label in a whiteboard file; and analyzing the data of the target label to obtain the attribute value of the target attribute in the target label to obtain the handwriting data.
Specifically, one way of performing content identification on point data includes: dividing point data to obtain a point data set; inputting the point data set into an identification model for identification to obtain file content, wherein the identification model is obtained by using multiple groups of data through machine learning training, and each group of data in the multiple groups of data comprises: the point data and the contents corresponding to the point data. Another way of content identification of point data includes: drawing the point data into a picture; and carrying out optical character recognition on the picture to obtain the file content.
step S1306, under the condition of successful matching, obtaining a target whiteboard file matched with the retrieval conditions;
step S1308, a target whiteboard file is displayed.
According to the scheme provided by the embodiment, the retrieval conditions input by the user are matched with the file contents of all the whiteboard files by displaying the retrieval interface under the condition that the retrieval conditions are input in the interaction area of the retrieval interface, so that the target whiteboard files corresponding to the retrieval conditions can be obtained and displayed, the rapid retrieval of the whiteboard files can be realized, and the retrieval accuracy is high.
Example 5
The file retrieval method provided by the embodiment of the application can be applied to an intelligent interactive tablet, as shown in fig. 15, the intelligent interactive tablet is connected with a mobile device through a wireless network. Specifically, screen-projecting application software is respectively installed in the intelligent interactive tablet and the mobile device. In this example, after the smart interactive tablet and the mobile device respectively start the screen-projecting application software and establish the data connection, the smart interactive tablet may display the screen-projecting content in the mobile device.
Electronic whiteboard software is installed in the intelligent interactive tablet; a user writes and draws on the intelligent interactive tablet to compose the file content of a whiteboard file and saves the whiteboard file. The mobile device may be a cell phone, a notebook computer, a tablet computer, etc.
When a user needs to retrieve the whiteboard file, a retrieval interface as shown in fig. 3 is displayed on the intelligent interaction panel, and an input control is displayed in an interaction area of the retrieval interface. Optionally, the input control is displayed at the upper part of the retrieval interface, and the lower part of the retrieval interface is used for displaying the retrieved whiteboard file. A user can touch the input control to input a retrieval condition, and realize file retrieval by pressing a "retrieval" key, the intelligent interactive tablet processes all the stored whiteboard files to obtain file contents of the whiteboard files, and matches the file contents with the retrieval condition to obtain a target whiteboard file successfully matched, and displays the target whiteboard file as shown in fig. 16.
Example 6
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
As shown in fig. 17, the file retrieval device can be implemented by software, hardware, or a combination of both as all or a part of the smart interactive tablet. The apparatus includes a condition receiving module 162, a file processing module 164, and a content matching module 166.
A condition receiving module 162, configured to receive a search condition input by a user, where the search condition corresponds to a file content of a target whiteboard file;
the file processing module 164 is configured to process at least one whiteboard file to obtain file content of each whiteboard file;
and the content matching module 166 is configured to match the search condition with the file content of each whiteboard file, and obtain the whiteboard file successfully matched to obtain the target whiteboard file.
On the basis of the foregoing embodiment, the file processing module includes: a file parsing module, configured to parse each whiteboard file to obtain the handwriting data stored in the whiteboard file, wherein the handwriting data at least includes: point data of handwriting; and a content identification module, configured to perform content identification on the point data to obtain the file content.
On the basis of the foregoing embodiment, the file parsing module includes: the data reading module is used for reading the data of the target tag in the whiteboard file; and the data analysis module is used for analyzing the data of the target label, acquiring the attribute value of the target attribute in the target label and obtaining the handwriting data.
On the basis of the above embodiment, the content identification module includes: the data dividing module is used for dividing the point data to obtain a point data set; the data identification module is used for inputting the point data set into the identification model for identification to obtain file content, wherein the identification model is obtained by using multiple groups of data through machine learning training, and each group of data in the multiple groups of data comprises: the point data and the contents corresponding to the point data.
On the basis of the embodiment, the handwriting data further includes: the regional data of the region where the handwriting locates, wherein, the data division module includes: the distance acquisition module is used for acquiring the distance between adjacent handwriting based on the area data; and the distance division module is used for determining that the point data of the adjacent handwriting belongs to the same point data set under the condition that the distance is smaller than or equal to the preset distance.
On the basis of the embodiment, the handwriting data further includes: time stamp of handwriting, wherein, the data partitioning module includes: the time acquisition module is used for acquiring the time interval between adjacent handwriting based on the time stamp; and the time division module is used for determining that the point data of the adjacent handwriting belongs to the same point data set under the condition that the time interval is less than or equal to the preset interval.
On the basis of the above embodiment, the content identification module includes: the picture drawing module is used for drawing the point data into a picture; and the picture identification module is used for carrying out optical character identification on the picture to obtain the file content.
On the basis of the foregoing embodiment, the condition receiving module includes at least one of the following: a text receiving module, configured to receive text data input by a user to obtain the retrieval condition; and a voice receiving module, configured to receive voice data input by a user and recognize the voice data to obtain the retrieval condition.
On the basis of the present embodiment, in the case where the file content of the target whiteboard file contains a plurality of elements, the search condition contains at least one element.
It should be noted that, when the file retrieval apparatus provided in the foregoing embodiment executes the file retrieval method, only the division of the functional modules is illustrated, and in practical applications, the above functions may be distributed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions. In addition, the file retrieval device and the file retrieval method provided by the above embodiments belong to the same concept, and the detailed implementation process thereof is referred to as the method embodiment, which is not described herein again.
Example 7
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
As shown in fig. 18, the file retrieval device may be implemented by software, hardware, or a combination of both as all or a part of the smart interactive tablet. The apparatus includes an interface display module 170, a condition receiving module 172, a file processing module 174, a content matching module 176, and a file display module 178.
An interface display module 170, configured to display a retrieval interface, where the retrieval interface includes: an input control for inputting a retrieval condition;
a condition receiving module 172, configured to receive a search condition input in the input control, where the search condition corresponds to a file content of the target whiteboard file;
the file processing module 174 is configured to process at least one whiteboard file to obtain file content of each whiteboard file;
the content matching module 176 is configured to match the search condition with the file content of each whiteboard file, and obtain a whiteboard file successfully matched to obtain a target whiteboard file;
and a file display module 178 for displaying the target whiteboard file.
It should be noted that, when the file retrieval apparatus provided in the foregoing embodiment executes the file retrieval method, only the division of the functional modules is illustrated, and in practical applications, the above functions may be distributed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions. In addition, the file retrieval device and the file retrieval method provided by the above embodiments belong to the same concept, and the detailed implementation process thereof is referred to as the method embodiment, which is not described herein again.
Example 8
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
As shown in fig. 19, the file retrieving device can be implemented by software, hardware or a combination of both as all or a part of the smart interactive tablet. The apparatus includes an interface display module 182, a content matching module 184, a file acquisition module 186, and a file display module 188.
An interface display module 182, configured to display a retrieval interface;
a content matching module 184, configured to, when a retrieval condition is input in the interaction area of the retrieval interface, match the retrieval condition with the file content of each whiteboard file;
a file obtaining module 186, configured to, when the matching succeeds, obtain a target whiteboard file matched with the retrieval condition;
and a file display module 188, configured to display the target whiteboard file.
It should be noted that when the file retrieval apparatus provided in the foregoing embodiment executes the file retrieval method, the division into the above functional modules is merely illustrative; in practical applications, the functions may be assigned to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the file retrieval apparatus and the file retrieval method provided by the above embodiments belong to the same concept; for details of the implementation process, refer to the method embodiments, which are not repeated here.
Example 9
An embodiment of the present application further provides a computer storage medium. The computer storage medium may store a plurality of instructions suitable for being loaded by a processor to perform the method steps of the embodiments shown in fig. 1 to 14; for the specific execution process, refer to the descriptions of those embodiments, which are not repeated here.
The device on which the storage medium is located may be a smart interactive tablet.
Example 10
As shown in fig. 20, smart interaction tablet 1900 may include: at least one processor 1901, at least one network interface 1904, a user interface 1903, a memory 1905, and at least one communication bus 1902.
The communication bus 1902 is used to implement connection and communication among these components.
The user interface 1903 may include a display screen (Display) and a camera (Camera). Optionally, the user interface 1903 may also include a standard wired interface and a wireless interface.
The network interface 1904 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface).
The processor 1901 may include one or more processing cores. The processor 1901 connects various parts of the smart interactive tablet 1900 using various interfaces and lines, and performs the various functions of the smart interactive tablet 1900 and processes data by running or executing the instructions, programs, code sets, or instruction sets stored in the memory 1905 and calling the data stored in the memory 1905. Optionally, the processor 1901 may be implemented in at least one of the hardware forms of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 1901 may integrate one of, or a combination of, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and so on; the GPU is responsible for rendering and drawing the content to be displayed on the display screen; and the modem handles wireless communication. It is to be understood that the modem may also not be integrated into the processor 1901 and instead be implemented by a separate chip.
The memory 1905 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). Optionally, the memory 1905 includes a non-transitory computer-readable medium. The memory 1905 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 1905 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, or an image playing function), instructions for implementing the various method embodiments described above, and the like; the data storage area may store the data referred to in the above method embodiments. Optionally, the memory 1905 may also be at least one storage device located remotely from the processor 1901. As shown in fig. 20, the memory 1905, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and an operating application of the smart interactive tablet.
In the smart interaction tablet 1900 shown in fig. 20, the user interface 1903 is mainly used as an interface for providing input for a user, and acquiring data input by the user; and the processor 1901 may be configured to call an operating application of the smart interactive tablet stored in the memory 1905, and specifically perform the following operations:
receiving a retrieval condition input by a user, wherein the retrieval condition corresponds to the file content of a target whiteboard file; processing at least one whiteboard file to obtain the file content of each whiteboard file; and matching the retrieval condition with the file content of each whiteboard file, and acquiring a successfully matched whiteboard file to obtain the target whiteboard file.
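The receive–process–match flow described above can be sketched as follows. This is a minimal illustration only: the whiteboard parsing and content-recognition steps are stubbed out, the substring match is one plausible matching strategy, and every function and field name here is hypothetical rather than taken from the patent.

```python
# Minimal sketch of the retrieval flow: turn each whiteboard file into
# text content, then match the user's retrieval condition against it.
# All names are illustrative; the patent does not specify an API.

def extract_file_content(whiteboard_file):
    # Stand-in for parsing handwriting data and recognizing its content.
    return whiteboard_file["recognized_text"]

def retrieve(condition, whiteboard_files):
    targets = []
    for f in whiteboard_files:
        content = extract_file_content(f)
        if condition in content:          # simple substring match
            targets.append(f["name"])     # successfully matched file
    return targets

files = [
    {"name": "meeting.wb", "recognized_text": "Q3 budget review"},
    {"name": "lesson.wb",  "recognized_text": "photosynthesis notes"},
]
print(retrieve("budget", files))  # ['meeting.wb']
```

In a real implementation, `extract_file_content` would be replaced by the parsing and recognition steps described in the following paragraphs, and the match could be fuzzy rather than exact.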
In one embodiment, the operating system of the smart interactive tablet is an android system, in which the processor 1901 further performs the following steps:
analyzing each whiteboard file to obtain handwriting data stored in the whiteboard file, wherein the handwriting data at least comprises: point data of the handwriting; and performing content recognition on the point data to obtain the file content.
In one embodiment, the processor 1901 further performs the following steps:
reading the data of a target tag in the whiteboard file; and parsing the data of the target tag to obtain the attribute value of a target attribute in the target tag, thereby obtaining the handwriting data.
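If the whiteboard file stores handwriting in a markup format, the "read target tag, then read target attribute" step could look like the sketch below. The `<stroke>` tag name, the `points` attribute, and the `"x,y;x,y"` encoding are all assumptions for illustration; the patent does not disclose the actual file schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical whiteboard fragment: each stroke stored under a <stroke>
# tag whose "points" attribute holds "x,y;x,y;..." coordinate pairs.
WHITEBOARD_XML = """
<whiteboard>
  <stroke points="10,10;12,14;15,20"/>
  <stroke points="40,40;42,45"/>
</whiteboard>
"""

def parse_handwriting(xml_text, tag="stroke", attr="points"):
    root = ET.fromstring(xml_text)
    strokes = []
    for node in root.iter(tag):          # read the data of the target tag
        raw = node.get(attr)             # attribute value of the target attribute
        points = [tuple(map(int, p.split(","))) for p in raw.split(";")]
        strokes.append(points)
    return strokes

print(parse_handwriting(WHITEBOARD_XML))
```

The same idea applies to a JSON-based format: locate the structure that plays the role of the target tag, then pull out the attribute holding the point data.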
In one embodiment, the processor 1901 further performs the following steps:
dividing point data to obtain a point data set; inputting the point data set into an identification model for identification to obtain file content, wherein the identification model is obtained by using multiple groups of data through machine learning training, and each group of data in the multiple groups of data comprises: the point data and the contents corresponding to the point data.
In one embodiment, the processor 1901 further performs the following steps:
acquiring the distance between adjacent handwriting based on the area data; and under the condition that the distance is smaller than or equal to the preset distance, determining that the point data of the adjacent handwriting belong to the same point data set.
In one embodiment, the processor 1901 further performs the following steps:
acquiring a time interval between adjacent handwriting based on the time stamp; and under the condition that the time interval is less than or equal to the preset interval, determining that the point data of the adjacent handwriting belongs to the same point data set.
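The two grouping criteria above, spatial distance between adjacent strokes and the time gap between them, can be combined as in the following sketch. The thresholds, the stroke dictionary layout, and the rule that both criteria must hold are illustrative assumptions; the patent only states that point data of sufficiently close strokes belong to the same point data set.

```python
import math

# Group adjacent strokes into point-data sets using two criteria:
# spatial distance between strokes and the time gap between them.

def group_strokes(strokes, max_dist=30.0, max_gap=2.0):
    """Each stroke: {'start': (x, y), 'end': (x, y), 't_start': s, 't_end': s}."""
    groups = [[strokes[0]]]
    for prev, cur in zip(strokes, strokes[1:]):
        dist = math.dist(prev["end"], cur["start"])   # spatial closeness
        gap = cur["t_start"] - prev["t_end"]          # temporal closeness
        if dist <= max_dist and gap <= max_gap:
            groups[-1].append(cur)    # same point-data set
        else:
            groups.append([cur])      # start a new set
    return groups

strokes = [
    {"start": (0, 0),   "end": (10, 0),  "t_start": 0.0, "t_end": 0.5},
    {"start": (15, 0),  "end": (25, 0),  "t_start": 1.0, "t_end": 1.4},  # close: same set
    {"start": (200, 0), "end": (210, 0), "t_start": 9.0, "t_end": 9.5},  # far: new set
]
print([len(g) for g in group_strokes(strokes)])  # [2, 1]
```

Each resulting group would then be fed to the recognition model as one point data set.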
In one embodiment, the processor 1901 further performs the following steps:
drawing the point data into a picture; and carrying out optical character recognition on the picture to obtain the file content.
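The draw-then-OCR alternative above can be sketched as follows. The rasterization here is a bare-bones stand-in (it only marks the pixels the sample points fall on, with no line interpolation), and the `ocr` function is a stub: the patent does not name a specific recognizer, and a real implementation would hand the rendered image to an actual OCR engine.

```python
# Rasterize stroke point data onto a small bitmap, which could then be
# fed to an OCR engine. All names and sizes are illustrative.

def rasterize(strokes, width=16, height=8):
    grid = [[0] * width for _ in range(height)]
    for stroke in strokes:
        for x, y in stroke:
            if 0 <= x < width and 0 <= y < height:
                grid[y][x] = 1        # mark the pixel the point falls on
    return grid

def ocr(grid):                        # stub standing in for a real OCR engine
    return "<recognized text>"

strokes = [[(1, 1), (2, 2), (3, 3)], [(10, 1), (10, 2)]]
image = rasterize(strokes)
print(sum(map(sum, image)))  # 5 pixels set
print(ocr(image))
```

Compared with feeding raw point data to a trained model, this route trades stroke-order information for compatibility with off-the-shelf image-based OCR.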
In one embodiment, the processor 1901 further performs the following steps:
receiving text data input by the user to obtain the retrieval condition; or receiving voice data input by the user and recognizing the voice data to obtain the retrieval condition.
By matching the retrieval condition input by the user against the file content of all whiteboard files, the target whiteboard file corresponding to the retrieval condition can be obtained, so that whiteboard files can be retrieved quickly and with high accuracy.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, Phase-change Memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory or other memory technology, Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a(n) …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method of retrieving a document, comprising:
receiving a retrieval condition input by a user, wherein the retrieval condition corresponds to the file content of a target whiteboard file;
processing at least one whiteboard file to obtain the file content of each whiteboard file;
and matching the retrieval condition with the file content of each whiteboard file, and acquiring a successfully matched whiteboard file to obtain the target whiteboard file.
2. The method of claim 1, wherein processing at least one whiteboard file to obtain file contents of each whiteboard file comprises:
analyzing each whiteboard file to obtain handwriting data stored in the whiteboard file, wherein the handwriting data at least comprises: point data of the handwriting;
and performing content identification on the point data to obtain the file content.
3. The method of claim 2, wherein parsing each whiteboard file to obtain handwriting data stored in the whiteboard file comprises:
reading the data of the target tag in the whiteboard file;
and parsing the data of the target tag to obtain an attribute value of a target attribute in the target tag, thereby obtaining the handwriting data.
4. The method of claim 2, wherein performing content identification on the point data to obtain the file content comprises:
dividing the point data to obtain a point data set;
inputting the point data set into an identification model for identification to obtain the file content, wherein the identification model is obtained by using multiple groups of data through machine learning training, and each group of data in the multiple groups of data comprises: the point data and the contents corresponding to the point data.
5. The method according to claim 1, wherein in the case where the file content of the target whiteboard file contains a plurality of elements, the retrieval condition contains at least one element.
6. A method of retrieving a document, comprising:
displaying a retrieval interface, wherein the retrieval interface comprises: an input control for inputting a retrieval condition;
receiving a retrieval condition input in the input control, wherein the retrieval condition corresponds to the file content of the target whiteboard file;
processing at least one whiteboard file to obtain the file content of each whiteboard file;
matching the retrieval condition with the file content of each whiteboard file, and acquiring a successfully matched whiteboard file to obtain the target whiteboard file;
and displaying the target whiteboard file.
7. A method of retrieving a document, comprising:
displaying a retrieval interface;
under the condition that a retrieval condition is input in an interaction area of the retrieval interface, matching the retrieval condition with the file content of each whiteboard file;
under the condition of successful matching, obtaining a target whiteboard file matched with the retrieval condition;
and displaying the target whiteboard file.
8. A document retrieval apparatus, comprising:
a condition receiving module, configured to receive a retrieval condition input by a user, wherein the retrieval condition corresponds to the file content of a target whiteboard file;
the file processing module is used for processing at least one whiteboard file to obtain the file content of each whiteboard file;
and a content matching module, configured to match the retrieval condition with the file content of each whiteboard file and acquire a successfully matched whiteboard file to obtain the target whiteboard file.
9. A computer storage medium, characterized in that it stores a plurality of instructions adapted to be loaded by a processor and to perform the method steps of any of claims 1 to 7.
10. An intelligent interactive tablet, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method steps of any of claims 1 to 7.
CN201910989351.2A 2019-10-17 2019-10-17 File retrieval method and device, storage medium and related equipment Pending CN110750501A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910989351.2A CN110750501A (en) 2019-10-17 2019-10-17 File retrieval method and device, storage medium and related equipment

Publications (1)

Publication Number Publication Date
CN110750501A true CN110750501A (en) 2020-02-04

Family

ID=69278801

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910989351.2A Pending CN110750501A (en) 2019-10-17 2019-10-17 File retrieval method and device, storage medium and related equipment

Country Status (1)

Country Link
CN (1) CN110750501A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017068399A (en) * 2015-09-29 2017-04-06 日本電気株式会社 Information processing apparatus, search method of electronic whiteboard, and program
CN108681549A (en) * 2018-03-29 2018-10-19 广州视源电子科技股份有限公司 Method and device for acquiring multimedia resources
CN108763416A (en) * 2018-05-23 2018-11-06 广州视源电子科技股份有限公司 Multimedia file display method and device, computer equipment and storage medium
CN108920550A (en) * 2018-06-15 2018-11-30 广州视源电子科技股份有限公司 file searching method and device
CN109242309A (en) * 2018-09-05 2019-01-18 广州视源电子科技股份有限公司 Participated user portrait generation method and device, intelligent conference equipment and storage medium
CN109784151A (en) * 2018-12-10 2019-05-21 重庆邮电大学 A kind of Off-line Handwritten Chinese Recognition method based on convolutional neural networks
CN109977737A (en) * 2017-12-28 2019-07-05 新岸线(北京)科技集团有限公司 A kind of character recognition Robust Method based on Recognition with Recurrent Neural Network
CN110188671A (en) * 2019-05-29 2019-08-30 济南浪潮高新科技投资发展有限公司 A method of handwriting characteristic is analyzed using machine learning algorithm

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115563394A (en) * 2022-11-24 2023-01-03 腾讯科技(深圳)有限公司 Search recall method, recall model training method, device and computer equipment
CN115563394B (en) * 2022-11-24 2023-03-28 腾讯科技(深圳)有限公司 Search recall method, recall model training method, device and computer equipment
WO2024113271A1 (en) * 2022-11-30 2024-06-06 京东方科技集团股份有限公司 Intelligent handwriting display device, intelligent handwriting display method, and electronic device

Similar Documents

Publication Publication Date Title
US10360473B2 (en) User interface creation from screenshots
Nguyen et al. Reverse engineering mobile application user interfaces with remaui (t)
US10437466B2 (en) Formula inputting method and apparatus
US10032072B1 (en) Text recognition and localization with deep learning
CN109947967B (en) Image recognition method, image recognition device, storage medium and computer equipment
US8196066B1 (en) Collaborative gesture-based input language
CN112507806B (en) Intelligent classroom information interaction method and device and electronic equipment
CN112070076B (en) Text paragraph structure reduction method, device, equipment and computer storage medium
US8149281B2 (en) Electronic device and method for operating a presentation application file
CN111752557A (en) Display method and device
CN106293074A (en) A kind of Emotion identification method and mobile terminal
KR102075433B1 (en) Handwriting input apparatus and control method thereof
US11914951B2 (en) Semantically-guided template generation from image content
US20230367473A1 (en) Ink data generation apparatus, method, and program
CN113516113A (en) Image content identification method, device, equipment and storage medium
CN103389873A (en) Electronic device, and handwritten document display method
CN110363190A (en) A kind of character recognition method, device and equipment
CN114067797A (en) Voice control method, device, equipment and computer storage medium
CN110750501A (en) File retrieval method and device, storage medium and related equipment
CN113449726A (en) Character comparison and identification method and device
US10430458B2 (en) Automated data extraction from a chart from user screen selections
CN113486171B (en) Image processing method and device and electronic equipment
CN115775386A (en) User interface component identification method and device, computer equipment and storage medium
US20210073458A1 (en) Comic data display system, method, and program
CN116071767A (en) Table identification reconstruction method, apparatus, storage medium and interactive flat panel

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200204