CN114281236A - Text processing method, device, equipment, medium and program product - Google Patents

Text processing method, device, equipment, medium and program product

Info

Publication number
CN114281236A
Authority
CN
China
Prior art keywords
text
target
state
face
target object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111631947.9A
Other languages
Chinese (zh)
Other versions
CN114281236B (en)
Inventor
杨金明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCB Finetech Co Ltd
Original Assignee
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CCB Finetech Co Ltd filed Critical CCB Finetech Co Ltd
Priority to CN202111631947.9A priority Critical patent/CN114281236B/en
Publication of CN114281236A publication Critical patent/CN114281236A/en
Application granted granted Critical
Publication of CN114281236B publication Critical patent/CN114281236B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • User Interface Of Digital Computer (AREA)

Abstract

The application provides a text processing method, apparatus, device, medium and program product, and relates to the technical field of intelligent terminals. The method includes: acquiring the facial state of a target object and determining whether the facial state matches at least one preset state, where the facial state includes at least one of an eye muscle state, an eyelid state, an eyebrow state and an eye-opening state; when the facial state matches at least one preset state, acquiring the position of the gaze focus of the target object in a display interface; selecting a target text from the text information according to the position; and adjusting parameters of the target text and/or voice-broadcasting the target text, where the parameters of the target text include at least one of the font size, font style, color, background color and brightness of the target text. According to this technical solution, the reading effect can be further improved by automatically adjusting the parameters of the target text and/or voice-broadcasting the target text.

Description

Text processing method, device, equipment, medium and program product
Technical Field
The present application relates to the field of intelligent terminal technologies, and in particular, to a text processing method, apparatus, device, medium, and program product.
Background
With the widespread use of terminal devices, more and more users read displayed text information on them, but some users have poor eyesight. For example, a near-sighted user needs to wear glasses to see distant objects in daily life, while a terminal device, being easy to carry, is usually held close to the user. This distance is difficult for a near-sighted user to control: if the device is far away the glasses are needed, and if it is close the glasses can be taken off, so as the distance changes the user may have to put the glasses on and take them off frequently, which is inconvenient for reading.
In the prior art, to address this situation, a magnification function is mainly provided on the terminal device, and the user can zoom the text through this function, which alleviates the problem of having to put on and take off glasses frequently while reading.
However, in the prior art the user has to operate the magnification function manually over a long period, which is cumbersome and inflexible and cannot provide a good reading effect.
Disclosure of Invention
The application provides a text processing method, apparatus, device, medium and program product, which are used to solve the problem that existing terminal devices cannot provide a good reading effect for users with poor eyesight.
In a first aspect, an embodiment of the present application provides a text processing method, which is applied to a terminal device, where the terminal device includes a display interface for displaying text information, and the method includes:
acquiring a facial state of a target object, and determining whether the facial state is matched with at least one preset state, wherein the facial state comprises at least one of an eye muscle state, an eyelid state, an eyebrow state and an eye opening state;
when the facial state matches at least one preset state, acquiring a current face image of the target object, and determining key coordinate points and auxiliary coordinate points in the face image, wherein the key coordinate points comprise coordinate points of the nose tip and eyes of the target object, and the auxiliary coordinate points comprise coordinate points of the eyebrows, the mouth, the ears, the forehead and the chin of the target object;
determining a three-dimensional model of the face of the target object according to the key coordinate points and the auxiliary coordinate points;
determining the position of the sight focus of the target object in the display interface according to the three-dimensional model of the face, the distance between the current face of the target object and the display interface and the pupil angles of two eyes in the current face;
selecting a target text from the text information according to the position;
and adjusting the parameters of the target text and/or voice-broadcasting the target text, wherein the parameters of the target text comprise at least one of the font size, font style, color, background color and brightness of the target text.
In a possible design of the first aspect, the determining the position of the gaze focus of the target object in the display interface according to the three-dimensional model of the face, the distance between the current face of the target object and the display interface, and the pupil angles of both eyes in the current face includes:
acquiring the projection position of the nose tip on the display interface;
determining a first projection position of a pupil in a left eye and a second projection position of a pupil in a right eye of the target object on the display interface according to the three-dimensional facial model and the projection position of the nose tip;
acquiring an intersection area of the first projection position and the second projection position as the position.
In another possible design of the first aspect, the selecting a target text from the text information according to the position includes:
determining the position of a punctuation mark in the text information, wherein the text information comprises at least one punctuation mark;
and selecting a target text from the text information according to the relative distance between the position of the sight focus and the position of the punctuation mark.
In yet another possible design of the first aspect, the adjusting the parameter of the target text includes:
displaying at least two different test texts on the display interface, wherein the test parameters of each test text are different, and the test parameters comprise at least one of a font size, a font style, a color, a background color and a brightness;
acquiring an input text selection instruction, and selecting a target test text from the at least two different test texts;
and adjusting the parameters of the target text according to the test parameters of the target test text.
In yet another possible design of the first aspect, the adjusting the parameter of the target text includes:
displaying a preset vision test chart on the display interface and outputting second prompt information, wherein the second prompt information is used to instruct the user to input a corresponding vision test result according to the preset vision test chart;
determining a magnification coefficient according to the vision test result and the distance between the current face of the user and the display interface;
and amplifying the target text according to the amplification factor.
In yet another possible design of the first aspect, the method further includes:
responding to received request information by restoring the parameters of the target text to preset initial values and/or stopping broadcasting the target text, wherein the request information comprises at least one of voice information and an instruction input to the terminal device.
In yet another possible design of the first aspect, the method further includes:
each time the parameters of the target text are adjusted and/or the target text is voice-broadcast, storing the facial state of the target object in a target storage space;
counting the number of face states with the same state in the target storage space;
when the number of face states with the same state reaches a preset threshold value, updating the preset state to the same face state.
In a second aspect, an embodiment of the present application provides a text processing apparatus, including:
the matching module is used for acquiring the facial state of the target object and determining whether the facial state is matched with at least one preset state, wherein the facial state comprises at least one of eye muscle state, eyelid state, eyebrow state and eye opening state;
the acquisition module is used for acquiring the position of the gaze focus of the target object in a display interface when the facial state matches at least one preset state;
the selecting module is used for selecting a target text from the text information according to the position;
and the adjusting module is used for adjusting the parameters of the target text and/or voice-broadcasting the target text, wherein the parameters of the target text comprise at least one of the font size, font style, color, background color and brightness of the target text.
In a third aspect, an embodiment of the present application provides a terminal device, including: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes the computer-executable instructions stored by the memory to implement the methods described above.
In a fourth aspect, the present application provides a readable storage medium, in which computer instructions are stored, and when executed by a processor, the computer instructions are used to implement the method described above.
In a fifth aspect, the present application provides a program product including computer instructions, which when executed by a processor implement the method described above.
According to the text processing method, apparatus, device, medium and program product provided by the application, the facial state of the user is acquired, and if it is determined from the current facial state that the user has difficulty reading, the parameters of the target text can be adjusted automatically and/or the target text can be voice-broadcast automatically, which further improves the reading effect.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application;
fig. 1 is a scene schematic diagram of a text processing method provided in an embodiment of the present application;
fig. 2 is a schematic flowchart of a first embodiment of a text processing method according to an embodiment of the present application;
fig. 3 is a schematic flowchart of a second embodiment of a text processing method according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a text processing apparatus according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms referred to in this application are explained first:
structured light:
structured light is a set of system structures consisting of a projector and a camera. The projector is used for projecting specific light information to the surface of an object and the background, and the specific light information is collected by the camera. Information such as the position and depth of the object is calculated from the change of the optical signal caused by the object, and the entire three-dimensional space is restored.
Face recognition:
Face recognition is a biometric technology that performs identity recognition based on a person's facial feature information. It refers to a series of related technologies, also commonly called portrait recognition or facial recognition, that use a camera or video camera to collect images or video streams containing faces, automatically detect and track the faces in the images, and then recognize the detected faces.
And (3) expression recognition:
Expression recognition refers to separating a specific expression state from a given still image or dynamic video sequence, so as to determine the psychological state of the recognized object.
Machine learning:
machine learning is a multi-field cross discipline and relates to a plurality of disciplines such as probability theory, statistics, approximation theory, convex analysis and algorithm complexity theory. The special research on how a computer simulates or realizes the learning behavior of human beings so as to acquire new knowledge or skills and reorganize the existing knowledge structure to continuously improve the performance of the computer.
Fig. 1 is a scene schematic diagram of a text processing method according to an embodiment of the present application. As shown in fig. 1, when a user uses a terminal device such as a mobile phone 10, two lines of characters are displayed on the display interface of the mobile phone 10. When the user reads these characters, he or she may be unable to see some of them clearly because of his or her eyesight and the distance between the display interface and the face, and would need to move closer to the display interface to see the content clearly, which is inconvenient. To avoid forcing the user to view the content at close range, the mobile phone 10 needs to automatically enlarge the text the user is viewing; for example, when the user is viewing the second row of text, that row is enlarged by a certain factor so that the user can see it clearly.
Before the mobile phone 10 automatically magnifies the text, it needs to detect whether the user has reading difficulty. For example, in the prior art, the front camera of the mobile phone 10 may capture an expression of the user, such as squinting. When the user squints, the automatic magnification function is triggered, and the mobile phone 10 enlarges all the text in the display interface, or enlarges only the text the user is looking at, depending on the actual size of the display interface. This prior-art approach can relieve the user's reading obstacle to a certain extent, but its effect is not ideal: recognition errors are possible, and it cannot accurately determine whether the user really has reading difficulty. In addition, the prior art mainly enlarges the whole font, which damages the overall view, deforms and distorts the picture, and disorders the display of text paragraphs, itself creating a reading obstacle.
In view of the above, the text processing method, apparatus, device, medium and program product provided in the embodiments of the present application obtain the facial state of the user, specifically including at least one of the eye muscle state, the eyelid state, the eyebrow state and the eye-opening state, and if it is determined from the current facial state that the user has difficulty reading, automatically adjust the parameters of the target text and/or voice-broadcast the target text, so as to further improve the reading effect.
The technical solution of the present application will be described in detail below with reference to specific examples. It should be noted that the following specific embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
Fig. 2 is a schematic flowchart of a first embodiment of a text processing method provided in the embodiment of the present application. The method may be applied to a terminal device, and the terminal device at least includes a display interface and a camera or structured-light module for acquiring the facial state of the user. Illustratively, the terminal device may be a mobile phone, a computer, or the like. As shown in fig. 2, the method may specifically include the following steps:
s201, acquiring the face state of the target object, and determining whether the face state is matched with at least one preset state.
Wherein the facial state includes at least one of an eye muscle state, an eyelid state, an eyebrow state, and an eye-open state.
In this embodiment, the preset states may be four states of eye muscle contraction, eyelid contraction, frown, and eye opening greater than normal. The target object may be a user, and when the user currently has a facial state that matches a preset state, it is determined that the user has a reading obstacle.
For example, the terminal device may collect the facial state of the user through a front camera, structured light, and the like.
In some embodiments, the terminal device may further acquire the gazing durations of the two eyes of the user at the same position on the display interface, and if the gazing durations exceed a preset threshold, it may also be determined that the user has a reading obstacle, and the process may proceed to the subsequent step S202.
Optionally, in some other embodiments, the distance between the face of the user and the display interface of the terminal device may also be collected, and when the distance exceeds a preset distance value, it may also be determined that the user has a reading obstacle, and the process may proceed to the subsequent step S202.
Optionally, in some other embodiments, a control instruction input by the user may also be collected, for example, a zoom-in instruction, and if it is detected that the user inputs the zoom-in instruction, it may also be determined that the user has a reading obstacle, and the process may proceed to the subsequent step S202.
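Illustratively, the trigger conditions described above (matching a preset facial state, a prolonged gaze, an excessive viewing distance, or an explicit zoom-in instruction) could be combined as in the following non-limiting sketch; the state names and threshold values are assumptions rather than values specified by this application:

```python
# Non-limiting sketch; preset state names and thresholds are assumed values.
PRESET_STATES = {"eye_muscle_contracted", "eyelid_contracted", "frown", "eyes_wide_open"}

GAZE_DURATION_THRESHOLD_S = 5.0   # assumed value
MAX_FACE_DISTANCE_CM = 45.0       # assumed value

def has_reading_difficulty(facial_states, gaze_duration_s, face_distance_cm, zoom_requested):
    """Return True when any trigger for proceeding to step S202 fires."""
    if PRESET_STATES & set(facial_states):            # S201: facial state matches a preset state
        return True
    if gaze_duration_s > GAZE_DURATION_THRESHOLD_S:   # optional: prolonged gaze at one position
        return True
    if face_distance_cm > MAX_FACE_DISTANCE_CM:       # optional: face too far from the display
        return True
    return bool(zoom_requested)                       # optional: user issued a zoom-in instruction
```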
S202, when the facial state matches at least one preset state, acquiring a current face image of the target object, and determining a key coordinate point and an auxiliary coordinate point in the face image.
The key coordinate points comprise coordinate points of the nose tip and the eyes of the target object, and the auxiliary coordinate points comprise coordinate points of the eyebrows, the mouths, the ears, the forehead and the chin of the target object.
And S203, determining a three-dimensional face model of the target object according to the key coordinate points and the auxiliary coordinate points.
In this embodiment, each coordinate point can be located by analyzing its geometric position with a face detection algorithm. Taking the eyebrow as an example, the position of the eyebrow can be detected by a face detection algorithm, so that the coordinate point of the eyebrow is obtained.
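Illustratively, the extraction of key and auxiliary coordinate points could be organized as in the following non-limiting sketch; the detect_landmarks callable stands for any face-landmark detector, and the landmark names are assumptions rather than part of this application:

```python
# Non-limiting sketch; detect_landmarks is a hypothetical callable wrapping any
# face-landmark detector, and the landmark names below are illustrative only.
def extract_coordinate_points(face_image, detect_landmarks):
    landmarks = detect_landmarks(face_image)   # e.g. a 68-point or mesh detector
    key_points = {
        "nose_tip": landmarks["nose_tip"],
        "left_eye": landmarks["left_eye"],
        "right_eye": landmarks["right_eye"],
    }
    auxiliary_points = {
        name: landmarks[name]
        for name in ("left_eyebrow", "right_eyebrow", "mouth",
                     "left_ear", "right_ear", "forehead", "chin")
    }
    return key_points, auxiliary_points
```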
S204, determining the position of the sight focus of the target object in the display interface according to the three-dimensional model of the face, the distance between the current face of the target object and the display interface and the pupil angles of the two eyes in the current face.
In this embodiment, the distance between the current face of the target object and the display interface may be determined from the imaging principle that nearer objects appear larger. Specifically, the camera of the terminal device collects the size and completeness of the face image of the target object: the closer the distance, the larger the collected face image and the less complete the information (for example, the ears are missing); the farther the distance, the smaller the collected face image and the more complete the information.
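Illustratively, this size-to-distance relationship could be approximated with a simple pinhole-camera model as in the following non-limiting sketch; the reference face width and focal length are assumed calibration values, not values specified by this application:

```python
# Non-limiting sketch using a pinhole-camera (similar-triangles) approximation;
# the constants are assumed calibration values.
REFERENCE_FACE_WIDTH_CM = 15.0   # assumed average face width
FOCAL_LENGTH_PX = 600.0          # assumed, obtained by camera calibration

def estimate_face_distance_cm(face_width_px: float) -> float:
    """A larger face image (more pixels) means a smaller distance, and vice versa."""
    if face_width_px <= 0:
        return float("inf")
    return REFERENCE_FACE_WIDTH_CM * FOCAL_LENGTH_PX / face_width_px
```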
In this embodiment, the position of the user's gaze focus in the display interface indicates the area the user is currently reading. The user's gaze focus may move in the display interface, and thus needs to be monitored in real time.
For example, the position of the user's gaze focus in the display interface may be captured by an eye tracker.
In this embodiment, the position of the user's gaze focus in the display interface usually contains text information, where the text information may include non-dynamic files of a text class and a picture class. The text-class file may be a plain text document, a word document, or the like, and the picture-class file may be a Portable Document Format (PDF) file or a Joint Photographic Experts Group (JPEG) file.
And S205, selecting a target text from the text information according to the position.
In this embodiment, if the user is currently gazing at a picture, that is, the position of the gaze focus is on a certain picture, the picture may be used as the target text. If the user is currently gazing at characters, the paragraph or sentence at which the gaze focus is located may be found, and that paragraph or sentence is taken as the target text.
And S206, adjusting the parameters of the target text and/or broadcasting the target text by voice.
Wherein the parameters of the target text comprise at least one of the font size, font style, color, background color and brightness of the target text.
For example, if the target text is a piece of text, it may be converted into speech and played for the user to listen to. If the target text is a picture, the picture may be enlarged, its color changed, its brightness increased, or its background color adjusted. If the target text is text, its font size may also be increased, its font bolded, and so on.
Optionally, in some embodiments, if the target text is a long text (e.g., a paragraph), the text may be converted to a voice announcement. Specifically, voice broadcasting can be started from the position of the user sight focus.
In the voice broadcasting process, whether the end of the text content is reached can be detected, for example, the end of a segment or a sentence is reached, and if the end is reached, the voice broadcasting can be stopped.
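Illustratively, starting the broadcast from the gaze focus and stopping at the end of the current sentence could be sketched as follows; the sentence-splitting rule and the use of the pyttsx3 package as an offline text-to-speech backend are assumptions for illustration only:

```python
# Non-limiting sketch; the sentence-end rule and the pyttsx3 backend are assumptions.
import re
import pyttsx3

SENTENCE_END = re.compile(r"[.!?。！？]")

def broadcast_from_focus(text_info: str, focus_index: int) -> None:
    end_match = SENTENCE_END.search(text_info, focus_index)
    end = end_match.end() if end_match else len(text_info)   # stop at the end of the sentence
    segment = text_info[focus_index:end]
    engine = pyttsx3.init()
    engine.say(segment)
    engine.runAndWait()
```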
For example, the user may configure the terminal device to simultaneously adjust the parameters of the target text and voice-broadcast the target text, to preferentially adjust the parameters of the target text, or to preferentially voice-broadcast the target text.
According to the embodiment of the application, by acquiring the facial state of the user, if it is determined from the current facial state that the user has difficulty reading, the parameters of the target text can be adjusted automatically and/or the target text can be voice-broadcast automatically, which further improves the reading effect.
In some embodiments, the step S204 may specifically include the following steps:
determining the projection position of the nose tip on the display interface according to the distance between the current face of the target object and the display interface;
determining a first projection position of a pupil in a left eye and a second projection position of a pupil in a right eye of the target object on the display interface according to the three-dimensional model of the face and the projection position of the nose tip;
an intersection region of the first projection position and the second projection position is acquired as a position.
In this embodiment, the projection of the nose tip perpendicular to the display interface may be taken as the midpoint, and the projection positions of the eyes on the display interface may be determined from the three-dimensional model of the face, serving as the user's real-time attention range. The projection position of each pupil on the display interface is then determined from the color of the sclera and the pupil, the light-reflection condition, and so on. Taking the eye region as the active area and the geometric center of the eyes as the coordinate origin O, the displacement of the two eyes is judged, and the intersection area of the projections of the two pupils on the display interface is taken as the position, that is, the user's gaze focus.
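Illustratively, projecting the two pupils around the nose-tip projection and intersecting the projected regions could be sketched as follows; the pupil offsets (which would be derived from the three-dimensional face model and the pupil angles) and the projection radius are hypothetical inputs:

```python
# Non-limiting geometric sketch; offsets and radius are hypothetical inputs that
# would be derived from the three-dimensional face model and the pupil angles.
def gaze_focus_position(nose_projection, left_offset, right_offset, radius_px=40.0):
    """All positions are (x, y) in screen pixels; returns the gaze focus or None."""
    left_center = (nose_projection[0] + left_offset[0], nose_projection[1] + left_offset[1])
    right_center = (nose_projection[0] + right_offset[0], nose_projection[1] + right_offset[1])
    dx = right_center[0] - left_center[0]
    dy = right_center[1] - left_center[1]
    if (dx * dx + dy * dy) ** 0.5 > 2 * radius_px:
        return None   # the two projected regions do not intersect
    # the midpoint of the two projection centres approximates the intersection area
    return ((left_center[0] + right_center[0]) / 2,
            (left_center[1] + right_center[1]) / 2)
```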
According to the embodiment of the application, by constructing the three-dimensional model of the user's face and combining the distance between the user's current face and the display interface with the pupil angles of both eyes in the current face, the position of the user's gaze focus in the display interface can be found accurately, so that parameter adjustment or voice broadcasting can be performed on the target text at that position, improving the user's reading effect.
In some embodiments, the step S205 may be specifically implemented by the following steps:
determining the position of a punctuation mark in the text information;
and selecting a target text from the text information according to the relative distance between the position of the sight focus and the position of the punctuation mark.
Wherein the text information comprises at least one punctuation mark. Illustratively, the text information may also include words.
In the present embodiment, the punctuation marks may include commas, periods, and the like. For example, when the relative distance between the sight line focus and a certain period is lower than a preset threshold, for example, the distance between the sight line focus and a certain period is only one character, a sentence after the period may be selected as the target text.
For example, the target text may also include a plurality of sentences, and the number of sentences included in the target text may be determined according to the size of the display interface, for example, when the size of the display interface is large enough to enlarge three sentences without overflowing the display interface, then three sentences may be included in the target text.
The distance between the sight focus and the punctuation mark can be used to determine which sentence the sight focus is currently located in, and then the last sentence and the next sentence of the sentence are found, and the three sentences are taken as target texts.
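Illustratively, selecting the sentence containing the gaze focus together with its neighbouring sentences could be sketched as follows; the punctuation set and the one-sentence window are assumptions:

```python
# Non-limiting sketch; the punctuation set and window size are assumptions.
import re

SENTENCES = re.compile(r"[^.!?。！？]*[.!?。！？]?")

def select_target_text(text_info: str, focus_index: int, window: int = 1) -> str:
    spans = [m.span() for m in SENTENCES.finditer(text_info) if m.group().strip()]
    for i, (start, end) in enumerate(spans):
        if start <= focus_index < end:
            lo = max(0, i - window)                 # previous sentence
            hi = min(len(spans), i + window + 1)    # next sentence
            return text_info[spans[lo][0]:spans[hi - 1][1]]
    return ""
```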
According to the embodiment of the application, by determining the relative distance between the punctuation marks and the gaze focus, a more accurate target text can be selected from the text information as the text the user is currently reading, so that a better reading effect is obtained when the parameters of the target text are subsequently adjusted or the target text is voice-broadcast.
In some embodiments, the "adjusting the parameter of the target text" in step S206 may be specifically implemented by the following steps:
displaying at least two different test texts on a display interface;
acquiring an input text selection instruction, and selecting a target test text from at least two different test texts;
and adjusting the parameters of the target text according to the test parameters of the target test text.
The test parameters of each test text are different, and the test parameters comprise at least one of font size, font style, color, background color and brightness.
In this embodiment, different test texts may be graded, for example as first-level, second-level and third-level, and the font size, font style, color, background color and brightness of the test texts at different levels may differ. For example, the first-level test text has the largest font size, the second-level test text the second-largest, and the third-level test text the smallest.
For example, after at least two different test texts are displayed on the display interface, a voice prompt may be output to prompt the user to select a preferred test text as the target test text. After the target test text is determined, if parameter adjustment needs to be carried out on the target text subsequently, the parameters of the target text can be adjusted directly according to the test parameters of the target test text.
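Illustratively, the graded test texts and the selection of the preferred parameter set could be sketched as follows; all concrete values are assumptions:

```python
# Non-limiting sketch; all parameter values below are assumptions.
TEST_TEXTS = [
    {"level": 1, "font_size": 24, "font": "bold",    "color": "black", "background": "white", "brightness": 1.0},
    {"level": 2, "font_size": 18, "font": "regular", "color": "black", "background": "white", "brightness": 0.9},
    {"level": 3, "font_size": 14, "font": "regular", "color": "gray",  "background": "white", "brightness": 0.8},
]

def select_test_parameters(selected_level: int) -> dict:
    """Return the parameter set of the chosen test text; applying it to the
    target text is left to the rendering layer of the terminal device."""
    return next(t for t in TEST_TEXTS if t["level"] == selected_level)
```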
According to the embodiment of the application, the test texts with various different test parameters are set for the user to select, the parameters of the target text can be adjusted according to the test parameters preferred by the user, and the reading effect of the user is improved.
In some embodiments, the "adjusting the parameters of the target text" in the step 206 may be specifically implemented by the following steps:
displaying a preset vision test chart on a display interface and outputting second prompt information;
determining an amplification factor according to the vision test result and the distance between the current face of the user and the display interface;
and amplifying the target text according to the amplification factor.
And the second prompt information is used to instruct the user to input a corresponding vision test result according to the preset vision test chart.
For example, the preset vision test chart may be an existing test chart. When the terminal device displays the preset vision test chart, it may output the second prompt information to ask the user which letters in the chart he or she can identify, and the user's feedback is then used to determine the user's vision test result.
For example, in some other embodiments, the user may also directly input his or her own vision as a vision test result.
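Illustratively, one possible way to combine the vision test result with the viewing distance into a magnification factor is sketched below; the formula and its constants are assumptions rather than a rule specified by this application:

```python
# Non-limiting sketch; the formula and constants are assumptions.
def magnification_factor(vision_score: float, face_distance_cm: float,
                         normal_vision: float = 1.0,
                         reference_distance_cm: float = 30.0) -> float:
    """Weaker vision and a larger viewing distance both increase the factor."""
    vision_term = max(normal_vision / max(vision_score, 0.1), 1.0)
    distance_term = max(face_distance_cm / reference_distance_cm, 1.0)
    factor = vision_term * distance_term
    return min(factor, 4.0)   # assumed upper bound to avoid distorting the layout
```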
According to the embodiment of the application, the vision condition of the user is obtained, and then the size of the target text can be adaptively adjusted according to the distance between the current face of the user and the display interface, so that the target text can be adjusted more flexibly, and the reading effect of the user is improved.
In some embodiments, the text processing method may further include the steps of:
and responding to the received request information, and restoring the parameters of the target text to a preset initial value and/or stopping playing the broadcast target text.
Wherein the request information includes at least one of voice information and an instruction input to the terminal device. Illustratively, the voice information may be a voice uttered by the user, such as "feature off". The instruction may be a key instruction input by the user on the terminal device.
In some embodiments, the above method further comprises the steps of:
each time the target text is enlarged and/or voice-broadcast, the facial state of the target object is stored in a target storage space;
counting the number of face states with the same state in a target storage space;
when the number of face states having the same state reaches a preset threshold value, the preset state is updated to the same face state.
For example, the terminal device may analyze the facial state captured while the user zooms the screen content with a finger, and if the same facial state occurs more than three times, it may by default treat that facial state as an expression of reading difficulty.
For example, the user may frown when reading difficulty occurs. Taking the facial state of frowning as an example, the parameters of the target text are adjusted and/or the target text is voice-broadcast, and one frown is recorded. If the user frowns again the next time reading difficulty occurs, the count of the same facial state (i.e., frowning) reaches two, and if the user frowns yet again when reading difficulty occurs, the count reaches three.
The preset threshold may be three times. When frowning has occurred three times, it indicates that the user tends to show the frowning facial state when reading difficulty occurs, so the frowning facial state is used as a judgment basis. If the user frowns again next time, it can be directly determined that the user has reading difficulty.
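Illustratively, this counting and updating of the preset states could be sketched as follows; the threshold value is an assumption:

```python
# Non-limiting sketch; the threshold is an assumed value.
from collections import Counter

state_history = Counter()      # plays the role of the "target storage space"
preset_states = {"frown"}
UPDATE_THRESHOLD = 3           # assumed: three occurrences of the same state

def record_and_update(observed_state: str) -> None:
    """Record the facial state seen at each triggered adjustment/broadcast and
    promote it to a preset state once it has occurred often enough."""
    state_history[observed_state] += 1
    if state_history[observed_state] >= UPDATE_THRESHOLD:
        preset_states.add(observed_state)   # this state now also triggers the function
```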
Optionally, in some embodiments, the terminal device may also capture a voice uttered by the user when reading difficulty occurs, and if the number of occurrences of the same voice exceeds a preset threshold, that voice may be used as a basis for judging that the user has reading difficulty. The next time the user utters that voice, it indicates that the user has reading difficulty.
According to the embodiment of the application, the preset state is updated, the situation that the reading difficulty of the user occurs can be captured more accurately, and the accuracy of judging the reading difficulty of the user is improved.
Fig. 3 is a schematic flowchart of a second embodiment of a text processing method provided in an embodiment of the present application, and as shown in fig. 3, the method relates to multi-end interaction, and specifically includes the following steps:
s301, monitoring the face state and sight of a user;
s302, the user is difficult to read, and the face state is in a preset state;
s303, triggering a text processing function;
s304, tracking the sight of the user;
s305, forming a magnifying lens to magnify the text and the picture at the sight focusing part;
s306, converting the characters into voice to be played;
and S307, finishing checking by the user and closing the text processing function.
And S308, closing the text processing function.
Wherein the camera, the screen and the processor can be integrated on the terminal device.
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Fig. 4 is a schematic structural diagram of a text processing apparatus according to an embodiment of the present application, where the text processing apparatus may be integrated on a terminal device, or may be independent of the terminal device and cooperate with the terminal device to implement the technical solution. As shown in fig. 4, the text processing apparatus 40 may specifically include a matching module 41, an obtaining module 42, a model determining module 43, a position determining module 44, a selecting module 45, and an adjusting module 46.
The matching module 41 is configured to obtain the facial state of the target object and determine whether the facial state matches at least one preset state. The obtaining module 42 is configured to obtain a current face image of the target object when the facial state matches at least one preset state, and to determine the key coordinate points and the auxiliary coordinate points in the face image. The model determining module 43 is configured to determine a three-dimensional model of the face of the target object according to the key coordinate points and the auxiliary coordinate points. The position determining module 44 is configured to determine the position of the gaze focus of the target object in the display interface according to the three-dimensional model of the face, the distance between the current face of the target object and the display interface, and the pupil angles of both eyes in the current face. The selecting module 45 is configured to select a target text from the text information according to the position. The adjusting module 46 is configured to adjust parameters of the target text and/or voice-broadcast the target text.
The facial state comprises at least one of an eye muscle state, an eyelid state, an eyebrow state and an eye-opening state; the key coordinate points comprise coordinate points of the nose tip and the eyes of the target object; the auxiliary coordinate points comprise coordinate points of the eyebrows, the mouth, the ears, the forehead and the chin of the target object; and the parameters of the target text comprise at least one of the font size, font style, color, background color and brightness of the target text.
In some embodiments, the location determination module may be specifically configured to:
determining the projection position of the nose tip on the display interface according to the distance between the current face of the target object and the display interface;
determining a first projection position of a pupil in a left eye and a second projection position of a pupil in a right eye of the target object on the display interface according to the three-dimensional model of the face and the projection position of the nose tip;
an intersection region of the first projection position and the second projection position is acquired as a position.
In some embodiments, the selecting module may be specifically configured to:
determining the position of a punctuation mark in text information, wherein the text information comprises at least one punctuation mark;
and selecting a target text from the text information according to the relative distance between the position of the sight focus and the position of the punctuation mark.
In some embodiments, the adjusting module may be specifically configured to:
displaying at least two different test texts on a display interface, wherein the test parameters of the test texts are different;
acquiring an input text selection instruction, and selecting a target test text from at least two different test texts;
and adjusting the parameters of the target text according to the test parameters of the target test text.
Wherein the test parameters include at least one of font size, font style, color, background color, and brightness.
In some embodiments, the adjusting module may be further configured to:
displaying a preset vision test chart on a display interface and outputting second prompt information;
determining an amplification factor according to the vision test result and the distance between the current face of the user and the display interface;
and amplifying the target text according to the amplification factor.
And the second prompt information is used to instruct the user to input a corresponding vision test result according to the preset vision test chart.
In some embodiments, the text processing apparatus may further include a termination module, configured to restore the parameters of the target text to a preset initial value and/or stop broadcasting the target text in response to the received request information.
Wherein the request information includes at least one of voice information and an instruction input to the terminal device.
In some embodiments, the text processing apparatus may further include an update module configured to:
each time the parameters of the target text are adjusted and/or the target text is voice-broadcast, storing the facial state of the target object in a target storage space;
counting the number of face states with the same state in a target storage space;
when the number of face states having the same state reaches a preset threshold value, the preset state is updated to the same face state.
The apparatus provided in the embodiment of the present application may be used to execute the method in the above embodiments, and the implementation principle and the technical effect are similar, which are not described herein again.
It should be noted that the division of the modules of the above apparatus is only a logical division, and the actual implementation may be wholly or partially integrated into one physical entity, or may be physically separated. And these modules can be realized in the form of software called by processing element; or may be implemented entirely in hardware; and part of the modules can be realized in the form of calling software by the processing element, and part of the modules can be realized in the form of hardware. For example, the obtaining module may be a processing element separately set up, or may be implemented by being integrated in a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code, and a processing element of the apparatus calls and executes the functions of the obtaining module. Other modules are implemented similarly. In addition, all or part of the modules can be integrated together or can be independently realized. The processing element here may be an integrated circuit with signal processing capabilities. In implementation, each step of the above method or each module above may be implemented by an integrated logic circuit of hardware in a processor element or an instruction in the form of software.
Fig. 5 is a schematic structural diagram of a terminal device according to an embodiment of the present application. As shown in fig. 5, the terminal device 50 includes: at least one processor 51, memory 52, bus 53, and communication interface 54.
Wherein: the processor 51, the communication interface 54 and the memory 52 communicate with each other via the bus 53.
The communication interface 54 is used for communication with other devices. The communication interface 54 includes a communication interface for data transmission and a display interface or an operation interface for human-computer interaction.
The processor 51 is used for executing computer instructions, and particularly, relevant steps in the methods described in the above embodiments can be executed.
The processor 51 may be a central processing unit or one or more integrated circuits configured to implement embodiments of the present invention. The terminal device comprises one or more processors, which can be the same type of processor, such as one or more CPUs; or may be different types of processors such as one or more CPUs and one or more ASICs.
A memory 52 for storing computer instructions. The memory may comprise high speed RAM memory and may also include non-volatile memory, such as at least one disk memory.
The present embodiment also provides a readable storage medium, in which computer instructions are stored, and when at least one processor of the terminal device executes the computer instructions, the terminal device executes the text processing method provided in the foregoing various embodiments.
The present embodiments also provide a program product comprising computer instructions stored in a readable storage medium. The computer instructions can be read by at least one processor of the terminal device from a readable storage medium, and the execution of the computer instructions by the at least one processor causes the terminal device to implement the text processing method provided by the various embodiments described above.
In the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone, wherein A and B can be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship; in the formula, the character "/" indicates that the preceding and following related objects are in a relationship of "division". "at least one of the following" or similar expressions refer to any combination of these items, including any combination of the singular or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or multiple.
It is to be understood that the various numerical references referred to in the embodiments of the present application are merely for convenience of description and distinction and are not intended to limit the scope of the embodiments of the present application. In the embodiment of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiment of the present application.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (11)

1. A text processing method is applied to terminal equipment, the terminal equipment comprises a display interface for displaying text information, and the method comprises the following steps:
acquiring a facial state of a target object, and determining whether the facial state is matched with at least one preset state, wherein the facial state comprises at least one of an eye muscle state, an eyelid state, an eyebrow state and an eye opening state;
when the facial state matches at least one preset state, acquiring a current face image of the target object, and determining key coordinate points and auxiliary coordinate points in the face image, wherein the key coordinate points comprise coordinate points of the nose tip and eyes of the target object, and the auxiliary coordinate points comprise coordinate points of the eyebrows, the mouth, the ears, the forehead and the chin of the target object;
determining a three-dimensional model of the face of the target object according to the key coordinate points and the auxiliary coordinate points;
determining the position of the sight focus of the target object in the display interface according to the three-dimensional model of the face, the distance between the current face of the target object and the display interface and the pupils of two eyes in the current face;
selecting a target text from the text information according to the position;
and adjusting the parameters of the target text and/or broadcasting the target text in a voice mode, wherein the parameters of the target text comprise at least one of the font size, font style, color, background color and brightness of the target text.
2. The method of claim 1, wherein determining the position of the gaze focus of the target object in the display interface based on the three-dimensional model of the face, the distance of the target object's current face from the display interface, and the pupils of both eyes in the current face comprises:
determining the projection position of the nose tip on a display interface according to the distance between the current face of the target object and the display interface;
determining a first projection position of a pupil in a left eye and a second projection position of a pupil in a right eye of the target object on the display interface according to the three-dimensional facial model and the projection position of the nose tip;
acquiring an intersection area of the first projection position and the second projection position as the position.
3. The method of claim 1, wherein selecting a target text from the text message according to the location comprises:
determining the position of a punctuation mark in the text information, wherein the text information comprises at least one punctuation mark;
and selecting a target text from the text information according to the relative distance between the position of the sight focus and the position of the punctuation mark.
4. The method of claim 1, wherein the adjusting the parameters of the target text comprises:
displaying at least two different test texts on the display interface, wherein the test parameters of each test text are different, and the test parameters comprise at least one of a font size, a font style, a color, a background color and a brightness;
acquiring an input text selection instruction, and selecting a target test text from the at least two different test texts;
and adjusting the parameters of the target text according to the test parameters of the target test text.
5. The method of claim 1, wherein the adjusting the parameters of the target text comprises:
displaying a preset vision test table on the display interface and outputting second prompt information, wherein the second prompt information is used for instructing a user to input a corresponding vision test result according to the preset vision test table;
determining a magnification coefficient according to the vision test result and the distance between the current face of the user and the display interface;
and amplifying the target text according to the amplification factor.
6. The method of claim 1, further comprising:
and responding to the received request information, and restoring the parameters of the target text to a preset initial value and/or stopping broadcasting the target text, wherein the request information comprises at least one of voice information and an instruction input to the terminal equipment.
7. The method of claim 1, further comprising:
adjusting parameters of the target text and/or broadcasting the target text in voice every time, and storing the facial state of the target object to a target storage space;
counting the number of face states with the same state in the target storage space;
when the number of face states with the same state reaches a preset threshold value, updating the preset state to the same face state.
8. A text processing apparatus, comprising:
the matching module is used for acquiring the facial state of the target object and determining whether the facial state is matched with at least one preset state, wherein the facial state comprises at least one of eye muscle state, eyelid state, eyebrow state and eye opening state;
the acquiring module is used for acquiring a current face image of the target object when the facial state matches at least one preset state, and determining key coordinate points and auxiliary coordinate points in the face image, wherein the key coordinate points comprise coordinate points of the nose tip and eyes of the target object, and the auxiliary coordinate points comprise coordinate points of the eyebrows, the mouth, the ears, the forehead and the chin of the target object;
the model determining module is used for determining a three-dimensional model of the face of the target object according to the key coordinate points and the auxiliary coordinate points;
the position determining module is used for determining the position of the sight focus of the target object in the display interface according to the three-dimensional model of the face, the distance between the current face of the target object and the display interface and the pupil angles of two eyes in the current face;
the selecting module is used for selecting a target text from the text information according to the position;
and the adjusting module is used for adjusting the parameters of the target text and/or broadcasting the target text in a voice mode, wherein the parameters of the target text comprise at least one of the font size, font style, color, background color and brightness of the target text.
9. A terminal device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored by the memory to implement the method of any of claims 1-7.
10. A readable storage medium having stored therein computer instructions, which when executed by a processor, are configured to implement the method of any one of claims 1-7.
11. A program product comprising computer instructions, characterized in that the computer instructions, when executed by a processor, implement the method of any of claims 1-7.
CN202111631947.9A 2021-12-28 2021-12-28 Text processing method, apparatus, device, medium, and program product Active CN114281236B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111631947.9A CN114281236B (en) 2021-12-28 2021-12-28 Text processing method, apparatus, device, medium, and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111631947.9A CN114281236B (en) 2021-12-28 2021-12-28 Text processing method, apparatus, device, medium, and program product

Publications (2)

Publication Number Publication Date
CN114281236A true CN114281236A (en) 2022-04-05
CN114281236B CN114281236B (en) 2023-08-15

Family

ID=80877428

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111631947.9A Active CN114281236B (en) 2021-12-28 2021-12-28 Text processing method, apparatus, device, medium, and program product

Country Status (1)

Country Link
CN (1) CN114281236B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101840509A (en) * 2010-04-30 2010-09-22 深圳华昌视数字移动电视有限公司 Measuring method for eye-observation visual angle and device thereof
CN103594003A (en) * 2013-11-13 2014-02-19 安徽三联交通应用技术股份有限公司 System and method for driver remote monitoring and driver abnormity early warning
CN109147017A (en) * 2018-08-28 2019-01-04 百度在线网络技术(北京)有限公司 Dynamic image generation method, device, equipment and storage medium
WO2020105349A1 (en) * 2018-11-20 2020-05-28 ソニー株式会社 Information processing device and information processing method
CN110569457A (en) * 2019-08-01 2019-12-13 平安科技(深圳)有限公司 page adjusting method and device, computer equipment and storage medium
JP2020024750A (en) * 2019-11-11 2020-02-13 株式会社Jvcケンウッド Line-of-sight detection device, line-of-sight detection method, and computer program
CN111601064A (en) * 2020-05-18 2020-08-28 维沃移动通信有限公司 Information interaction method and information interaction device
CN112799508A (en) * 2021-01-18 2021-05-14 Oppo广东移动通信有限公司 Display method and device, electronic equipment and storage medium
CN112836685A (en) * 2021-03-10 2021-05-25 北京七鑫易维信息技术有限公司 Reading assisting method, system and storage medium

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115857706A (en) * 2023-03-03 2023-03-28 浙江强脑科技有限公司 Character input method and device based on facial muscle state and terminal equipment

Also Published As

Publication number Publication date
CN114281236B (en) 2023-08-15

Similar Documents

Publication Publication Date Title
CN109086726B (en) Local image identification method and system based on AR intelligent glasses
US20200184975A1 (en) Word flow annotation
KR102576135B1 (en) Sensory eyewear
US10831268B1 (en) Systems and methods for using eye tracking to improve user interactions with objects in artificial reality
US9025016B2 (en) Systems and methods for audible facial recognition
EP4198814A1 (en) Gaze correction method and apparatus for image, electronic device, computer-readable storage medium, and computer program product
US20220076681A1 (en) Transcription summary presentation
US11354882B2 (en) Image alignment method and device therefor
JP6753331B2 (en) Information processing equipment, methods and information processing systems
US9028255B2 (en) Method and system for acquisition of literacy
CN112634459A (en) Resolving natural language ambiguities for simulated reality scenarios
CN114779922A (en) Control method for teaching apparatus, control apparatus, teaching system, and storage medium
CN114281236B (en) Text processing method, apparatus, device, medium, and program product
WO2017036516A1 (en) Externally wearable treatment device for medical application, voice-memory system, and voice-memory-method
JP2014236228A (en) Person information registration device and program
CN111723758B (en) Video information processing method and device, electronic equipment and storage medium
CN113139491A (en) Video conference control method, system, mobile terminal and storage medium
CN110491384B (en) Voice data processing method and device
CN115469748A (en) Interaction method, interaction device, head-mounted equipment and storage medium
CN111027358A (en) Dictation and reading method based on writing progress and electronic equipment
CN116088675A (en) Virtual image interaction method, related device, equipment, system and medium
CN115484411A (en) Shooting parameter adjusting method and device, electronic equipment and readable storage medium
CN114895790A (en) Man-machine interaction method and device, electronic equipment and storage medium
CN110969161B (en) Image processing method, circuit, vision-impaired assisting device, electronic device, and medium
DE102021123866A1 (en) AUDIO INTERFACE FOR PORTABLE DATA PROCESSING UNITS

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant