Науковий часопис УДУ імені Михайла Драгоманова


DOI: https://doi.org/10.31392/UDU-nc.series9.2025.30.05
UDC: 81’253

Svitlana A. Matvieieva
Doctor of Sciences (Philology), Professor,
Department of Applied Philology and Translation Studies,
Faculty of Foreign Philology,
Mykhailo Dragomanov State University of Ukraine,
Kyiv, Ukraine;
Faculty of Social Sciences, Arts and Humanities,
Kaunas University of Technology,
Kaunas, Lithuania
https://orcid.org/0000-0002-8357-9366
e-mail: s.a.matvyeyeva@udu.edu.ua

Ramunė Kasperė
Dr. (Philology), Professor,
Faculty of Social Sciences, Arts and Humanities,
Kaunas University of Technology,
Kaunas, Lithuania
https://orcid.org/0000-0003-0782-3758
e-mail: ramune.kaspere@ktu.lt

EYE-TRACKING EXPERIMENTAL DESIGN
FOR INVESTIGATING VISUAL ATTENTION
IN SIGHT TRANSLATION

Bibliographic Description:
Matvieieva, S., Kasperė, R. (2025). Eye-Tracking Experimental Design for
Investigating Visual Attention in Sight Translation. Scientific Journal of Mykhailo
Dragomanov State University of Ukraine. Series 9. Current Trends in Language Development,
30, 48–62. https://doi.org/10.31392/UDU-nc.series9.2025.30.05

Abstract
Understanding the cognitive processes behind sight translation is a significant challenge in translation
studies. This paper addresses that challenge by proposing a detailed experimental design for eye-tracking studies.
Our goal is to systematically investigate the role of visual attention during real-time sight translation
performance.

Using eye-tracking as a powerful research tool, our design provides an opportunity to analyse the
complex cognitive mechanisms at play. The proposed methodology offers a framework to explore key aspects,
such as how an interpreter’s gaze behaviour serves as an indicator of their cognitive load and translation
speed. The study also incorporates Natural Language Processing (NLP) to analyse gaze-to-speech alignment
and pinpoint processing bottlenecks.


ВИПУСК 30’2025 Серія 9. Сучасні тенденції розвитку мов


This experimental framework is crucial for several reasons. It will help us refine cognitive models of
translation using objective indicators of mental activity, identify distinct attention patterns in experienced
versus novice interpreters, and ultimately, inform the development of more human-centred computer-assisted
translation (CAT) systems. Our research promises to provide a more robust understanding of the link between
an interpreter’s visual behaviour and their cognitive processes, paving the way for advancements in both
translation theory and practice.

Keywords: sight translation, eye-tracking, cognitive processes, visual attention, natural language
processing (NLP), cognitive models, computer-assisted translation (CAT).

1. Introduction.
At the core of translation activity lie complex cognitive processes related to the
perception, processing, interpretation, and transformation of texts. Research into these
processes has traditionally relied on the analysis of translation products, verbal protocols, or
expert evaluations. However, such approaches do not allow for the direct observation of the
visual attention mechanisms that underpin real-time translation decision-making. Eye-
tracking opens up new perspectives for observing the translation process “from within”.

Although eye-tracking methods have seen substantial development in psychology,
education, and linguistics, these tools have yet to be properly integrated into the theoretical
discourse of translation studies. The absence of a conceptual framework, weak links to
cognitive models of translation, and insufficient attention to the practical potential of these
technologies hinder their broader implementation.

This paper seeks to address a fundamental question: How do an interpreter’s visual
attention patterns, as captured by eye-tracking technology, correlate with the cognitive
processes involved in real-time sight translation, and how do these patterns differ between
interpreters with varying levels of expertise?

This paper contributes to the field of Translation Studies by addressing a theoretical
gap: the role of visual attention remains insufficiently conceptualized in existing cognitive
models of sight translation. Specifically, it demonstrates how eye-tracking data, collected
through the proposed experimental design, can inform and refine theoretical models of this
complex process. The novelty of the proposed conceptual model lies in its integration of
specific insights from eye-tracking research into existing cognitive load and effort models,
thereby offering a more granular and empirically grounded perspective on the simultaneous
processing of visual, semantic, and speech channels. By focusing on the interplay between
short-term memory and visual control, the paper offers a novel hypothesis, namely that the
explicit analysis of visual attention patterns, as enabled by this experimental approach, can
unveil previously unexamined strategies and challenges that interpreters encounter, thus
enhancing the predictive and explanatory power of sight translation theories.

Sight translation is a complex cognitive activity that requires the simultaneous
integration of multiple modalities and linguistic levels. A comprehensive understanding of
this process necessitates a deeper look into its core components:

- perceptual modalities: this is the initial stage where the interpreter’s visual system
processes the source text. It involves visual attention and gaze patterns (e.g., fixations,
saccades) as the eyes move across the text to identify and segment information. The
efficiency of this process is crucial, as it directly impacts the speed and accuracy of
subsequent stages. Our study specifically focuses on these visual patterns as a window into
the interpreter’s real-time cognitive activity;

- representational modalities: once the visual information is perceived, it is transformed
into a mental representation. This stage involves semantic and conceptual processing where
the interpreter extracts meaning from the source text, rather than just words in isolation. This
goes beyond simple decoding and includes comprehending the author’s intent, the text’s
context, and its overall coherence. This is a cognitive, non-observable process, and we
maintain that prolonged fixations and regressions are empirical indicators of the cognitive
effort expended at this stage;

- speech channel and linguistic levels: as the interpreter processes the source text, they
simultaneously generate the target language output. This involves integrating the perceptual
and representational stages with the speech channel. The interpreter must manage the
syntactic analysis and restructuring of the target language to produce a fluent and
grammatically correct translation. This is where we analyse the final linguistic product for
signs of cognitive load, such as disfluencies, pauses, and self-corrections. By aligning eye-
tracking data with these verbal output metrics, we can pinpoint precisely when and where the
cognitive challenges occur during the translation process.

By explicitly elaborating on these aspects, our study provides a more nuanced
framework for understanding sight translation, showing that it is not a monolithic process but
a dynamic interaction between distinct yet interconnected cognitive functions.

To implement the stated idea, it is worth adopting an interdisciplinary approach and
considering the close integration of the following scientific fields:

1. Translation Studies provides the theoretical foundation for analysing the translation
process as a cognitive activity conducted within temporal and resource constraints. It offers a
conceptual framework for understanding translation as a multi-component act that combines
source text analysis, meaning interpretation, and target text production. Eye-tracking can
deepen insights into the real-time strategies employed by interpreters and enable empirical
testing of hypotheses regarding strategy use by both experienced and novice interpreters.

2. Cognitive science explores the mechanisms of attention, memory, decision-making,
and information processing. Its methodologies facilitate the analysis of cognitive load and
the stages of translational thought. Applying its theories allows eye-tracking data to be
interpreted both as reflections of mental processes (attention, memory, decision-making,
processing of linguistic structures) and as indicators of cognitive load and the effectiveness
of translation performance.

3. Engineering brings attention to the development of efficient and user-friendly
interfaces for interpreters. It focuses on constructing hardware and software solutions for the
collection, storage, and processing of multimodal data (gaze, speech, response time, etc.),
and it supports the creation of interfaces for training, quality control, and interactive
translation assistance. Eye-tracking data can inform the adaptive design of computer-assisted
translation (CAT) systems that adjust to user behaviour.

4. NLP plays a crucial role in structuring linguistic data, identifying patterns in
translation equivalents, evaluating linguistic complexity, and automatically analysing text
corpora. It provides tools for modelling both written and oral translation processes. While not
central to the present study, NLP is a closely related field offering technological solutions for
text analysis. When combined with eye-tracking metrics, such solutions could become the
foundation for hybrid translation research in the future.

This eye-tracking experiment is designed to explore the cognitive mechanisms
underlying sight translation by integrating eye-tracking technology with audio analysis and
subjective self-assessment. By collecting synchronized gaze and speech data from both
professional interpreters and advanced translation students, the study aims to investigate how
visual attention is distributed during real-time oral translation of different text types. The
experimental design allows for a multifaceted analysis of the translation process, focusing on
indicators such as fixation patterns, regressions, speech disfluencies, and cognitive load.
Through this approach, the study seeks to identify how interpreters manage information
processing, handle linguistic challenges, and adapt strategies based on task complexity,
offering valuable insights into the interplay between visual attention and verbal output in
sight translation.

2. Literature Review.
In recent decades, the use of eye-tracking – a technology that records eye movements
and thus reveals attentional focus – has become increasingly prevalent in linguistics and
cognitive science. Eye-tracking studies are widely used in the analysis of reading, visual
information processing, language, and learning/teaching. In the field of Translation Studies,
eye-tracking as a research methodology has been more extensively employed since the
mid-2000s (O’Brien, 2007; Jakobsen and Jensen, 2008; Ehrensberger-Dow and Perrin, 2009; Carl
et al., 2011; Shreve et al., 2011; Nitzke, 2016; Daems et al., 2017; Moorkens, 2018;
Kornacki, 2019; Kasperavičienė et al., 2020). However, the technology and methodology
remain insufficiently integrated into theoretical translation frameworks.

Translation is a complex cognitive activity that integrates perception, analysis,
interpretation, and speech production in real time. Traditionally, research on translation
processes has relied on analysing translation outputs, verbal protocols, or retrospective
surveys. However, these approaches limit the exploration of the dynamic aspects of
translational attention, which plays a key role in decision-making and the construction of
translation strategies (Munday, 2012).

The earliest studies in the field of translation using eye-tracking have yielded
significant insights into various cognitive and behavioural aspects of the translation process.
These include:

- cognitive load and directionality in translation (Pavlović and Jensen, 2009),
highlighting how the direction of translation (L1→L2 vs. L2→L1) reflects processing
demands;

- the relationship between sentence readability and cognitive load in interpreting
trainees (Chmiel & Mazur, 2010), demonstrating how syntactic complexity correlates with
processing strain;

- interpreter profiles based on process data from keylogging and eye-tracking, as well
as features shared by both student and professional interpreters (Dragsted & Carl, 2013),
offering a typology of translation behaviours;

- the potential of eye-tracking for interpreter education (Kornacki, 2019), showing its
usefulness as a feedback tool in training contexts;

- differences in gaze behaviour between students and professional interpreters, and
attempts to define and measure revision competence when editing human pre-translated texts
(Schaeffer et al., 2019);

- gaze behaviour and processing styles in interpreters, including the identification of
different processing strategies (Su, 2020);

- uncertainty management in sight translation in professional and novice interpreters
(He & Wang, 2021).

These and other studies underscore the richness of eye-tracking data in uncovering the
nuances of translation cognition and interpreter behaviour across levels of expertise and task
types.

Specific data have been obtained regarding eye movement patterns during translation
between languages of the same family, different families, and different groups within the
same family. For instance, a comparative experiment involving German-Polish and English-
Polish language pairs demonstrated that the verb-final position and complex noun phrases in
German imposed higher cognitive demands for sight translation into Polish than similar tasks
from English into Polish (Korpal, 2012). Observations of professional interpreters with
Danish as their L1 and English as their L2 showed a tendency to downplay elaborate or
creative metaphorical imagery in the source text, rendering it with more conventional
metaphors in the target text (Sjørup, 2013). In English-Chinese sight translation, two
Chinese-specific problem triggers were identified: the back-sloping comma and head-final
noun phrases (Su and Li, 2019). Moreover, syntactic complexity was shown to significantly
increase cognitive load during sight translation tasks (Ma, 2021). These findings highlight
how language typology, syntactic structure, and metaphorical expression can shape cognitive
load and inform translation strategies, as revealed through eye-tracking data.

Recent technological and AI developments have drawn attention to the growing
integration of cognitive research with technological progress. Central themes include
interactions with translation tools, machine translation, and human-computer interaction,
underscoring the centrality of cognitive research in shaping technology-mediated translation
workflows (Li & Zhong, 2024).

Recently, researchers have proposed several approaches to conducting eye-tracking
experiments focused on both the process and product of translation. These approaches aim to
capture not only real-time cognitive activity during translation but also how gaze behaviour
correlates with translation quality, strategy use, and decision-making.

An experiment involving sight translation by trainee interpreters (Chmiel & Mazur,
2010) revealed that overall task duration did not significantly differ based on experience
level. Instead, text readability – rather than syntactic structure – was a more reliable predictor
of processing load, with simpler sentences resulting in fewer and shorter eye fixations.

An experiment investigating word order asymmetry in Chinese-English sight
translation (Ma et al., 2022) focused on the structural differences between the source and
target languages. Trainee interpreters were tasked with translating both isolated and
contextually embedded sentences, while eye-tracking metrics such as rereading rate and
reading-ahead frequency were analysed. The findings revealed that structurally asymmetric
sentences led to significantly more regressions, whereas contextual embedding had little
effect on reducing processing difficulty.

Another experiment on eye-voice span in Chinese-English sight interpreting (Su, 2023)
examined the temporal lag between gaze and speech by comparing novice and professional
interpreters. The results showed that longer eye-voice spans were associated with higher
rates of errors and disfluencies, particularly among less experienced participants.

A longitudinal experiment on sight translation (Fang et al., 2023), in which students
translated texts over the course of two semesters while their eye movements were tracked,
found that although translation quality improved significantly, overall reading behaviour
remained largely unchanged. Notably, training effects were more evident among participants
with lower initial skill levels.

While our study is guided by the foundational principles of Gile’s Effort Model (Gile,
2021), our experimental design and analysis framework are also grounded in a broader range
of cognitive translation theories. We integrate insights from cognitive processing models,
such as those proposed by Bell (1991) and Lörscher (1991), which conceptualize translation
as a series of cognitive problem-solving stages, including analysis, transfer, and
restructuring. Our eye-tracking methodology, which captures real-time gaze patterns,
provides a tool to empirically observe these stages and the shifts in attention between them.

Furthermore, we draw on neurolinguistic models of translation, which suggest that the
process involves multiple brain regions and is influenced by both linguistic and non-
linguistic factors. The integration of eye-tracking data with linguistic analysis of the verbal
output (e.g., disfluencies, pauses) allows us to connect the visual cognitive effort with
observable signs of processing difficulty, thereby providing empirical support for these
neurolinguistic hypotheses.

By situating our research within this wider theoretical landscape, we aim to provide a
more nuanced understanding of sight translation. Our approach moves beyond a single-
model perspective to investigate how different cognitive components – such as attention,
memory, and problem-solving – interact dynamically under the pressure of real-time
performance. This multidisciplinary grounding strengthens the motivation of our study and
enhances its potential to contribute to a more comprehensive theory of cognitive translation.

Undoubtedly, these approaches demonstrate the growing methodological sophistication
in translation research, allowing for a more nuanced understanding of how visual attention
shapes and reflects the interpreter’s cognitive experience.

The proposed experiment, while building on a solid foundation of eye-tracking
research in sight translation, makes several unique contributions that go beyond existing
methodologies:

- multimodal data integration: unlike prior studies that often focus on a single type of
data, our design integrates synchronized gaze data (fixations, regressions, saccades), verbal
output (speech rate, disfluencies), and subjective feedback (cognitive load reports) to provide
a holistic view of the translation process. This approach allows for a more comprehensive
analysis of how these cognitive channels interact in real time;

- advanced NLP-based analysis: we introduce the innovative use of NLP tools to
automatically transcribe and tag verbal output for pauses and disfluencies, which are then
precisely aligned with the eye-tracking data. This goes beyond manual analysis by providing
a more objective and efficient way to pinpoint moments of cognitive tension and processing
bottlenecks;

- comparative scope and rigor: the design compares both experienced and novice
interpreters and uses multiple, pre-analysed text genres (fiction, non-fiction, scientific) to
ensure a more robust and generalizable analysis. This allows us to investigate how expertise
and text type interact to influence visual attention strategies, an area that has been less
explored in previous research;

- focus on cognitive profiling: the methodology is specifically designed to allow for the
creation of distinct cognitive profiles (e.g., “chunk-based” versus “linear” readers) by
analysing the relationship between fixation spans and verbalization patterns. This provides a
concrete, empirically-grounded way to categorize interpreter strategies, which has significant
implications for both translation theory and training.

By incorporating these methodological advancements, our research aims to provide not
just a richer understanding but a more granular, empirically validated insight into the
interplay between an interpreter’s visual behaviour and their cognitive processes.

3. Aim and Objectives.
The aim of the study is to propose and detail an experimental design for investigating
interpreters’ visual attention using eye-tracking, thereby assessing the applicability of the
methodology for studying the translation process and the development of supportive
technologies.

The research tasks are:
1) to conduct an interdisciplinary review of academic sources on the application of eye-
tracking in translation studies;
2) to justify the prospects of using eye-tracking to study cognitive strategies in sight
translation;
3) to formulate testable hypotheses regarding the potential use of eye-tracking data for
revealing the cognitive mechanisms involved in sight translation;
4) to present a comprehensive experimental design;
5) to outline directions for future empirical research.

This paper proposes a series of testable hypotheses designed to bridge the gap between
theoretical models of translation processes and empirical data on real-time cognitive
behaviour in sight translation. These hypotheses underpin the proposed experimental design,
focusing on the capacity of eye-tracking technology to illuminate the intricate cognitive
mechanisms involved in this demanding task.

Drawing upon established cognitive models of interpreting, particularly Gile’s Effort
Model (Gile, 2021), which posits the simultaneous processing of visual, semantic, and speech
channels under significant cognitive effort, and framed within Cognitive Load Theory, we
hypothesize the following:

H1: Eye-tracking metrics will reveal distinct processing strategies in sight translation,
specifically differentiating between the visual attention patterns of experienced and novice
interpreters.

Sub-hypotheses: More experienced interpreters will exhibit greater anticipatory visual
scanning of upcoming sentence segments, broader fixation spans (indicative of chunk-based
processing), and fewer regressions, reflecting more automatized cognitive routines and
reduced cognitive load compared with novices.

H2: Observable gaze-to-speech synchronization patterns, quantifiable through eye-
tracking data aligned with verbal output (potentially using NLP tools), will serve as direct
indicators of cognitive processing efficiency and moments of heightened cognitive load.

Sub-hypothesis: Increased fixation durations and delays before verbalization will
correlate with higher lexical or syntactic complexity of the source text segments.

H3: Specific visual attention patterns (e.g., increased regressions, prolonged fixations
on problematic areas) will correspond to identifiable problem-solving strategies and
decision-making complexities during sight translation, observable in verbal output
(e.g., pauses, self-corrections, hedging).

These hypotheses, if supported by empirical data collected through the proposed
experimental design, will enhance the predictive and explanatory power of sight translation
theories by providing objective, real-time insights into attention distribution, cognitive load
management, short-term memory utilization, and visual control strategies employed by
interpreters.

4. Methodology.
4.1. Data Collection Methodology for Sight Translation Study Using Eye-Tracking.
Participants
- number: a projected sample of approximately 15 participants with varying experience
levels (translation program students and professional interpreters), which is sufficient
for qualitative analysis and exploration of individual patterns, although a larger sample size
will be needed for statistical generalizability in future studies;

- selection criteria:
  - native speakers of the target language;
  - high proficiency in the source language;
  - prior experience in sight translation is preferred but not mandatory;

- questionnaire: preliminary survey to assess experience level and self-reported
translation strategies.

Equipment and Tools



- eye tracker: e.g., Tobii Pro Nano / Tobii Pro Fusion (binocular tracking, 60–120 Hz);
- microphone / audio recording: for capturing oral translation;
- computer screen: standard size (17–22 inches);
- software:
  - Tobii Pro Lab or equivalent (for gaze data recording);
  - Audacity or similar (for audio processing);
  - ELAN / Praat (for speech annotation and analysis);
  - Python / spaCy / Whisper / ASR API (for automatic transcription and NLP
analysis).
Materials / Stimuli – written texts for sight translation:
- 3 text types (100–150 words each): fiction, non-fiction, and scientific;
- standardized through pre-analysis using syntactic and lexical complexity metrics to
ensure cross-text comparability;
- translated orally on sight, with no prior exposure to the text.
All texts should be pre-analysed to assess their linguistic complexity and ensure
comparability across types. Specifically, for evaluating readability and syntactic complexity,
measures such as the Flesch-Kincaid Readability Test, L2 Readability Index, Mean Sentence Length, and
Noun Phrase Density may be used. Lexical sophistication can be assessed by Lexical
Diversity (Type-Token Ratio) and Lexical Frequency (based on a corpus). Other linguistic
characteristics should also be considered, for instance, the presence of abbreviations,
terminology, complex sentences, passive constructions, etc. The purpose of such an analysis
is not only to categorize texts by their type but also to ensure their comparability based on
objective linguistic indicators, which will allow for a more precise interpretation of the
influence of text characteristics on interpreters’ cognitive load and gaze patterns.
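
As a minimal illustration of such a pre-analysis, and not the pipeline actually used in the study, two of the surface measures named above (Mean Sentence Length and Type-Token Ratio) could be computed along the following lines. The function name `complexity_metrics`, the regex-based tokenization, and the sample sentence are assumptions made for the sketch; a real analysis would more plausibly rely on a proper segmenter and tokenizer such as those in spaCy.

```python
import re


def complexity_metrics(text: str) -> dict:
    """Return simple surface metrics for one stimulus text (illustrative only)."""
    # Naive sentence split after ., ! or ? followed by whitespace (an assumption;
    # abbreviations such as "e.g." would be split incorrectly).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Lower-cased alphabetic tokens; apostrophes and hyphens stay inside words.
    tokens = re.findall(r"[a-z]+(?:['’-][a-z]+)*", text.lower())
    if not sentences or not tokens:
        raise ValueError("text contains no sentences or no word tokens")
    return {
        "sentences": len(sentences),
        "tokens": len(tokens),
        "mean_sentence_length": len(tokens) / len(sentences),
        "type_token_ratio": len(set(tokens)) / len(tokens),
    }


# Hypothetical sample text, used only to show the call.
sample = "The cat sat on the mat. The dog sat too."
metrics = complexity_metrics(sample)
print(metrics)
```

Because the Type-Token Ratio depends on text length, it is only comparable across stimuli of similar size, which the 100–150 word range above is meant to provide.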

Data Collected
- gaze data: fixation durations, number of regressions, saccade amplitude, areas of
interest (AOIs), reading tempo;
- audio: speech rate, pauses, false starts, self-corrections;
- subjective data: post-task questionnaire (e.g., NASA-TLX for cognitive load
assessment).
4.2. Experimental Procedure.
Stage 1: Briefing and Calibration
- procedure explained to participants;
- eye-tracker calibration;
- interface familiarization (without previewing texts).
Stage 2: Task Execution
Each participant:
- receives one text at a time on screen;
- translates aloud in real time, without pausing or stopping;
- duration: ~3–5 minutes per text;
- total task time: ~9–15 minutes per participant.
Data recorded:
- gaze data (which words they looked at, and for how long);
- audio of the translation.

Stage 3: Completion and Survey
- participants complete a short questionnaire about perceived difficulty, cognitive load,
and satisfaction;
- optional: brief semi-structured interview.



4.3. Preliminary Data Analysis.
Gaze-to-Speech Alignment
- Does gaze fixation precede speech onset? (speech delay after fixation)
- AOI size and distribution: is the interpreter focusing on key words?
- Regressions: does the interpreter return to the beginning of the sentence?
To validate our hypotheses, the experimental design will not only focus on individual
variables but will also employ a multifaceted analysis that examines the intricate interplay
between them. Specifically, our analysis will explore the relationship between the following
key factors:

- interpreter experience and text genre: we will compare how fixation patterns and
cognitive load metrics (e.g., prolonged fixations, regressions) differ for experienced versus
novice interpreters when presented with varied text genres (fiction, non-fiction, scientific).
This analysis aims to reveal if experienced interpreters maintain a consistent processing
strategy across different text types, or if their adaptive strategies are genre-specific. For
novices, we will investigate whether certain genres (e.g., scientific texts with high lexical
density) impose a disproportionately higher cognitive load;

- linguistic complexity and verbal output: we will correlate the pre-analysed linguistic
complexity of text segments (e.g., noun phrase density, passive constructions) with the
participants’ verbal output. We hypothesize that increased syntactic complexity will
correspond with higher rates of disfluencies, pauses, and self-corrections, providing
empirical support for the hypothesis that specific visual patterns correspond to problem-
solving strategies;

- gaze-to-speech alignment and experience level: our analysis will investigate whether
experienced interpreters demonstrate a longer eye-voice span and a more stable gaze-to-
speech synchronization than novices, particularly across different genres. This will help us
determine if a more automatized cognitive routine, which we hypothesize is a characteristic
of experienced interpreters, is reflected in their ability to maintain a consistent temporal lag
between visual input and verbal output, regardless of the text type.

By performing these specific cross-analyses, we will move beyond a simple
comparison of groups or text types. This approach will allow us to provide a richer, more
comprehensive understanding of how attention, processing strategies, and cognitive load are
dynamically managed in sight translation, thereby offering stronger empirical validation for
our hypotheses.
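
The eye-voice span referred to above can be illustrated with a small sketch. It assumes that the source text has been divided into segments, that the onset of the first fixation on each segment is known from the gaze recording, and that the onset of the corresponding spoken rendering has been annotated; the segment labels, timings, and the function name `eye_voice_spans` are all hypothetical.

```python
from statistics import mean


def eye_voice_spans(first_fixation_ms, speech_onset_ms):
    """Lag between first fixation on a segment and the start of its rendering."""
    spans = {}
    for segment, fixation_time in first_fixation_ms.items():
        # Segments that were never verbalised have no speech onset and are skipped.
        if segment in speech_onset_ms:
            spans[segment] = speech_onset_ms[segment] - fixation_time
    return spans


# Hypothetical timings in milliseconds from the start of the trial.
first_fixation = {"s1": 0, "s2": 1800, "s3": 4100}
speech_onset = {"s1": 900, "s2": 3000, "s3": 6200}

spans = eye_voice_spans(first_fixation, speech_onset)
mean_span = mean(spans.values())
print(spans, mean_span)
```

Comparing the mean and the variability of these spans between groups and across genres is one possible way to operationalise the "stability" of gaze-to-speech synchronization; aligning source segments with their target renderings is the non-trivial step that this sketch takes as given.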

Speech Analysis
- NLP tools: automatic transcription and tagging of pauses, corrections, redundant
elements;
- correlation with fixations/regressions (e.g., extended pauses preceding lexically or
syntactically complex items).
The verbal output of each participant will be transcribed and further analysed for
various linguistic and paralinguistic features relevant to cognitive processing. Crucially,
pauses and disfluencies (e.g., fillers, repetitions, self-corrections) will be tagged
automatically via a pipeline involving a speech-to-text tool (e.g., Whisper) and a linguistic
processing library (e.g., spaCy). This automated tagging will then be aligned with eye-
tracking data and AOIs using specialized multimodal annotation software such as ELAN for
comprehensive multimodal correlation. This integration will enable us to precisely analyse
the timing of cognitive processing (indicated by gaze) relative to verbal output, identify
moments of cognitive tension, and detect problem-solving strategies, thereby providing
empirical insights into the hypothesized role of NLP in analysing interpreter performance.
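
A simplified sketch of the tagging step is given below. It does not call Whisper or spaCy; instead it assumes that word-level timestamps have already been produced by some speech-to-text tool and are available as `(token, start_s, end_s)` tuples. The filler list, the 0.5-second pause threshold, and the function name `tag_speech` are assumptions chosen for illustration, not parameters of the study.

```python
FILLERS = {"uh", "um", "er", "erm"}   # illustrative list, not exhaustive
PAUSE_THRESHOLD_S = 0.5               # assumed cut-off for a silent pause


def tag_speech(words):
    """Tag pauses, fillers and immediate repetitions in a timed word list.

    `words` is a list of (token, start_s, end_s) tuples in temporal order.
    Returns a list of (label, start_s, end_s) events.
    """
    events = []
    prev_token, prev_end = None, None
    for token, start, end in words:
        norm = token.lower().strip(".,!?")
        # Silent gap between the previous word and this one.
        if prev_end is not None and start - prev_end >= PAUSE_THRESHOLD_S:
            events.append(("pause", prev_end, start))
        if norm in FILLERS:
            events.append(("filler", start, end))
        elif norm == prev_token:
            events.append(("repetition", start, end))
        prev_token, prev_end = norm, end
    return events


# Hypothetical timed transcript fragment.
timed_words = [
    ("the", 0.0, 0.2), ("the", 0.3, 0.5),
    ("um", 1.4, 1.6), ("report", 1.7, 2.1),
]
events = tag_speech(timed_words)
print(events)
```

Each event carries start and end times, so it could in principle be exported as an annotation tier and inspected alongside the gaze data in a tool such as ELAN. Self-corrections and hedging are harder to detect by rule and would likely need manual checking.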



Cognitive Load
- match between fixation length and sentence complexity;
- compare subjective reports (NASA-TLX) with objective measures (long fixations +
frequent regressions).
The primary motivation of this paper is to propose a comprehensive methodological
framework for eye-tracking studies in sight translation, rather than to report the final,
generalizable findings of a full-scale study. While we acknowledge the limitations of our
preliminary sample – 15 participants and a limited set of three texts – this pilot study serves
as a crucial proof of concept to demonstrate the viability and analytical potential of the
proposed framework. The results from this initial phase are not intended for broad
generalization but are used to refine our methodology and confirm that the tools and
procedures are effective.

For a successful, full-scale realization of this framework, a significantly larger and
more balanced experimental design is required. Based on established practices in cognitive
and psycholinguistic research, we recommend the following:

- participants: a minimum of 30 to 40 participants per group (e.g., experienced and
novice interpreters) to ensure sufficient statistical power and the ability to draw meaningful,
generalizable conclusions about the effects of expertise;

- stimuli: a larger corpus of at least 10–15 texts per genre (fiction, non-fiction,
scientific) to mitigate the effects of individual text characteristics and to enable a robust,
multifaceted analysis of how different text types influence cognitive processing.
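
For orientation, the recommended group size can be related to a standard power calculation. The sketch below uses the normal approximation for a two-sided comparison of two group means, n = 2((z_{1-alpha/2} + z_{power}) / d)^2 per group. The effect sizes passed in are hypothetical, since no expected effect size is stated here, and an exact t-based calculation gives slightly larger values.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants per group for a two-sample comparison.

    effect_size is Cohen's d (an assumed value, not taken from the study).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance criterion
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)
```

Under this approximation, an assumed effect size of d = 0.7 gives 33 participants per group at 80% power, which falls within the range recommended above; smaller expected effects would call for larger groups.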

By clearly distinguishing between the pilot study as a test of the framework and the
framework itself as the main contribution, this paper provides a valuable blueprint for future,
large-scale research in this domain.

5. Results and Discussion.
The proposed experimental design aims to yield empirical insights into the intricate
interplay between visual attention and cognitive processes during sight translation, directly
addressing the key questions outlined in our introduction regarding the impact of gaze on
translation performance, the potential for interpreter cognitive profiling, and the supportive
role of NLP. Observations derived from eye-tracking data collected during sight translation
tasks can provide a more refined understanding of the mental operations underlying this
demanding form of interlingual transfer.

Specifically, the analysis can reveal how gaze patterns influence both the pace and
quality of sight translation. The study is expected to reveal consistent fixation delays
before verbalization across participants, suggesting a preparatory cognitive phase in which
visual input is consolidated before oral delivery. Furthermore, anticipatory visual scanning
of upcoming sentence segments can be studied as a feature typical of experienced
interpreters and examined for its correlation with smoother verbal output and fewer
disfluencies. Conversely, frequent regressions can be examined as indicators of processing
difficulty, often preceding self-corrections or prolonged pauses, which would confirm their
utility as markers of cognitive effort or comprehension challenges. Such outcomes would
lend empirical support to the hypothesis that observing visual attention yields objective,
real-time indicators of underlying cognitive activity.
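
The fixation delay before verbalization can be expressed as a simple difference between two onsets per AOI. The sketch below is one possible operationalization, assuming that the first-fixation onset and the onset of the corresponding spoken rendition have already been extracted for each AOI and share a common time base; the function and argument names are hypothetical.

```python
from statistics import mean


def fixation_to_speech_delays(first_fixation_onset, speech_onset):
    """Map each AOI id to speech onset minus first-fixation onset (seconds).

    AOIs without a matching spoken rendition are skipped.
    """
    return {
        aoi: speech_onset[aoi] - fixation_time
        for aoi, fixation_time in first_fixation_onset.items()
        if aoi in speech_onset
    }


def mean_delay(delays):
    """Average delay across AOIs, or None when no AOI could be matched."""
    return mean(delays.values()) if delays else None
```

A positive value means that the segment was fixated before it was verbalized; comparing the distribution of these delays across participants is one way to test whether the delay is consistent.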

These findings will allow for the construction of more nuanced cognitive profiles of
interpreters, in particular distinguishing those who adopt a “chunk-based” reading strategy,
processing larger segments of text before verbalizing, from “linear” readers, whose gaze
aligns more closely with their concurrent speech. For instance, more experienced
interpreters are presumed to exhibit broader fixation spans and more anticipatory saccades,
indicative of a more developed chunking strategy. This differentiation has direct
implications for interpreter training: by monitoring gaze flow, educators could identify
early on whether trainees are developing automatized processing or experiencing cognitive
overload, allowing for tailored interventions. For example, excessive regressions or a purely
linear reading pattern might indicate a need for targeted training in text comprehension or
anticipatory processing; regressions have been found to be more frequent in novices than in
professional interpreters (He and Wang, 2021).
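
One way to quantify regressions for such profiling is the share of gaze transitions that move back to an earlier AOI. The sketch below assumes that fixations have already been mapped to AOI indices in reading order; the function name and this definition of a regression are illustrative assumptions.

```python
def regression_rate(aoi_sequence):
    """Share of fixation-to-fixation transitions that land on an earlier AOI.

    aoi_sequence lists the AOI index of each fixation in temporal order,
    with indices increasing in reading order.
    """
    transitions = list(zip(aoi_sequence, aoi_sequence[1:]))
    if not transitions:
        return 0.0
    regressions = sum(1 for current, following in transitions if following < current)
    return regressions / len(transitions)
```

Comparing this rate between novice and experienced participants would give a per-participant measure that can be set against the pattern reported by He and Wang (2021).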

Furthermore, the integration of NLP will support our analysis, particularly by enabling
precise gaze-to-speech alignment. This will allow the timing of cognitive processing (as
indicated by visual attention) to be analysed relative to verbal output, providing empirical
validation for phases within sight translation models. Pauses, self-corrections, and hedging
detected automatically in the verbal output will serve as markers of cognitive tension and
decision-making complexity, offering quantifiable evidence that complements and enriches
the eye-tracking data. This highlights the potential of combining visual and linguistic data
streams for a holistic understanding of interpreter cognition.

The empirical exploration of the interaction between visual attention and translation
processes is a crucial step towards a deeper understanding of the mental operations
underlying translation. By analysing visual strategies through objective eye-tracking data, we
will be able to:

- refine existing cognitive models of translation, offering empirically grounded insights
into the dynamic allocation of attention and resources during sight translation;

- identify distinct attention patterns characteristic of both experienced and novice
interpreters, providing a clearer roadmap for pedagogical interventions in interpreter
education;

- formulate robust hypotheses regarding the specific influence of text type, language
pair, or format (e.g., source text modality) on cognitive load, paving the way for targeted
future research;

- substantiate concrete directions for improving the interfaces of CAT systems. The
findings may suggest that adaptive CAT tools could integrate real-time gaze data to
anticipate user needs, highlight potential difficulties, or provide just-in-time assistance,
thereby better aligning with actual user cognitive strategies and fostering more human-
centred design.

The significance of this approach thus extends beyond merely gaining a deeper
understanding of translational thinking. It encompasses a substantial practical potential for
applying these findings in the development of truly adaptive CAT systems – interpreter
support tools that can be dynamically responsive to users’ cognitive patterns. Consequently,
eye-tracking emerges not merely as an abstract research tool but as a foundational element
for an applied shift towards more efficient and ergonomically optimized human-computer
interaction in translation.

Within the scope of this study, we propose a combination of related fields – cognitive
translation studies, psychology, engineering (specifically human-computer interaction), and,
to a significant extent, NLP. This interdisciplinary lens is instrumental in identifying areas of
intersection, relevant methodologies, and the unresolved questions that further empirical
work aims to address. Based on these and forthcoming findings, we will be able to offer
concrete empirically-driven hypotheses regarding the precise contribution of eye-tracking to
future theoretical advancements in the field of translation studies and its tangible impact on
the development of next-generation, human-centric translation technologies.


6. Conclusions.
The experimental design presented in this paper underscores how eye-tracking provides
a powerful tool for uncovering hidden cognitive processes during translation, offering
empirical insights into how interpreters allocate attention, manage processing effort, and
handle ambiguity or syntactic complexity. Its application in translation studies allows
researchers to move beyond product-oriented analyses and toward a more nuanced
understanding of the translation process itself. With growing interdisciplinary integration, the
methodology outlined herein positions eye-tracking to contribute significantly to the
modelling of interpreter expertise, comparisons across modalities (written, sight, and
simultaneous translation), and even the evaluation of the impact of technological tools such
as machine translation or post-editing environments on human cognition. It is anticipated that
the results of future empirical developments stemming from this design could be applied in
interpreter training through the integration of eye-tracking technologies into training
environments equipped with gaze-based feedback systems. Such applications may entail the
design of adaptive interfaces that prevent cognitive overload, the development of adaptive
language support systems (e.g., prompts, text simplification), and enhancements to CAT
tools and related technologies.

This research, by focusing on visual attention patterns through eye-tracking technology
and proposing a robust experimental framework, opens important avenues for advancing our
understanding of the cognitive mechanisms underlying sight translation. It highlights the need
to conceptualize sight translation not merely as a linguistic activity, but as a complex
multimodal cognitive process shaped by real-time visual input, working memory, and
processing constraints. The theoretical perspective combined with the detailed experimental
design serves as a solid foundation for future empirical studies that can explore how different
linguistic structures, layout formats, or text types correlate with eye movement behaviour,
potentially leading to more effective training protocols and performance assessments for
interpreters.

7. Limitations and Further Work.
While this paper presents a comprehensive experimental design for investigating visual
attention in sight translation, it is important to acknowledge certain inherent limitations that
may impact the generalizability and scope of future empirical findings. Firstly, the proposed
study, like many initial eye-tracking investigations in controlled environments, may involve
a relatively small sample size (e.g., 10–15 participants). While sufficient for in-depth
qualitative analysis and hypothesis generation, a smaller sample may limit the statistical
generalizability of quantitative findings across broader populations of interpreters. Future
research should aim for larger, more diverse participant groups.

Secondly, despite advancements in multimodal data alignment, the process of gaze-to-
speech alignment inherently carries a potential for error. While sophisticated tools like
ELAN and NLP-based tagging (e.g., via spaCy + Whisper output) are proposed to minimize
discrepancies, the perfect synchronization of highly dynamic processes remains a
methodological challenge that necessitates careful interpretation of correlation data.
Researchers must remain vigilant regarding potential offsets and biases in this complex
alignment.
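
A constant offset between the gaze and audio clocks can be estimated from events recorded in both streams. The sketch below assumes that such shared synchronization events exist (for example, a cue logged by the eye tracker and audible in the recording); this is an assumption of the example rather than a stated part of the design, and the function names are hypothetical.

```python
from statistics import median


def estimate_offset(gaze_sync_times, audio_sync_times):
    """Estimate audio-minus-gaze clock offset from paired sync events.

    Returns (offset, max_residual); a large residual suggests drift or a
    mismatched pair and calls for manual inspection.
    """
    if not gaze_sync_times or len(gaze_sync_times) != len(audio_sync_times):
        raise ValueError("need the same non-zero number of sync events per stream")
    diffs = [audio - gaze for gaze, audio in zip(gaze_sync_times, audio_sync_times)]
    offset = median(diffs)  # the median is robust to a single bad pair
    max_residual = max(abs(d - offset) for d in diffs)
    return offset, max_residual


def to_audio_time(gaze_time, offset):
    """Convert a gaze timestamp to the audio time base."""
    return gaze_time + offset
```

Reporting the residual alongside the offset gives readers a concrete bound on the alignment error that correlation results inherit.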

Furthermore, while the experimental design proposes pre-analysis of texts for linguistic
complexity across different types (fiction, non-fiction, scientific), inherent stylistic and
pragmatic differences between genres may still correlate with cognitive processing in ways
not fully captured by readability indices alone. Future work could explore more nuanced
qualitative analyses of the texts, or incorporate participants’ subjective perceptions of text
difficulty to complement objective measures. The current design also anticipates mixing
students and professional interpreters; while this allows for valuable comparative insights
into expertise development, it introduces variability that might complicate direct
performance comparisons or require careful subgroup analyses. Future studies could focus
exclusively on one group or employ a more controlled longitudinal design to track
development.

Building upon these limitations, several avenues for further work emerge. Longitudinal
studies tracking the development of visual strategies in trainee interpreters would offer
insights into expertise acquisition. Comparative research on specific text features (e.g.,
idiomatic expressions, syntactic ambiguities) could refine our understanding of how
linguistic properties correlate with gaze patterns. Moreover, developing real-time gaze-based
feedback systems for interpreter training, as well as refining adaptive CAT tools based on
these cognitive insights, represents a significant and practical direction for applied
research. Finally, future research should incorporate cross-linguistic comparisons and
explore whether findings generalize across typologically diverse language pairs.

8. Ethical Considerations.
The study design and all procedures involving human participants should be rigorously
reviewed and approved by the appropriate ethics committee or institutional review board.
Prior to their participation, all individuals will receive comprehensive information
regarding the study’s purpose, procedures, data collection methods (including eye-tracking,
video, and audio recording), the anticipated duration of their involvement, and the nature of
the data to be collected. All participants will provide informed consent by signing a written
consent form, confirming their voluntary participation and understanding of the study’s
terms. They will be explicitly informed of their right to withdraw from the study at any point
without penalty.

Measures will be taken to ensure the anonymity and confidentiality of the collected
data. Participant data will be anonymized or pseudonymized where applicable, and all
recordings and personal information will be stored securely on password-protected devices
accessible only to the research team, in accordance with applicable data protection
regulations. Only aggregated and anonymized data will be used for analysis and
dissemination.
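
Pseudonymization of participant identifiers can be sketched with a keyed hash. The example below is a minimal illustration, assuming that a secret key is stored separately from the data and is accessible only to the research team; the function name, the "P-" prefix, and the code length are hypothetical choices.

```python
import hashlib
import hmac


def pseudonymize(participant_id, secret_key, length=10):
    """Derive a stable pseudonym from a participant id with HMAC-SHA256.

    The same id and key always give the same code, so recordings, gaze
    logs and questionnaires can be linked without storing the real id.
    """
    digest = hmac.new(secret_key, participant_id.encode("utf-8"), hashlib.sha256)
    return "P-" + digest.hexdigest()[:length]
```

Without the key, the codes cannot be recomputed from a list of names, which a plain unsalted hash would not guarantee.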

R e f e r e n c e s

Bell, R.T. (1991). Translation and Translating: Theory and Practice. London: Longman.
Chmiel, A., & Mazur, I. (2010). Eye tracking sight translation performed by trainee interpreters. Tracks
and treks in translation studies, 189–205. John Benjamins Publishing Company. doi:
https://doi.org/10.1075/btl.108.10chm

Carl, M., Dragsted, B., Elming, J., Hardt, D., & Lykke Jakobsen, A. (2011). The process of post-editing: a
pilot study. In: Proceedings of the 8th International Natural Language Processing and Cognitive Science
Workshop, 131–142.

Daems, J., Vandepitte, S., Hartsuiker, R. J., & Macken, L. (2017). Identifying the machine translation error
types with the greatest impact on post-editing effort. Frontiers in Psychology, 8, 1–15.

Dragsted, B., & Carl, M. (2013). Towards a classification of translator profiles based on eye-tracking and
keylogging data. Journal of Writing Research, 5(1), 133–158. doi: https://doi.org/10.17239/jowr-2013.05.01.6

Ehrensberger, M., & Perrin, D. (2009). Capturing translation processes to access metalinguistic awareness.
Across Languages and Cultures, 10, 275–288. doi: https://doi.org/10.1556/Acr.10.2009.2.6

Fang, J., Zhang, X., & Kotze, H. (2023). The effects of training on reading behaviour and performance in
sight translation: a longitudinal study using eye-tracking. Studies in Translation Theory and Practice, 4, 655–
671. doi: https://doi.org/10.1080/0907676X.2022.2030372


Gile, D. (2021). The Effort Models of Interpreting as a Didactic Construct. Muñoz Martín, R., Sun, S.,
Li, D. (eds.). Advances in Cognitive Translation Studies. New Frontiers in Translation Studies, 139–160.
Springer, Singapore. doi: https://doi.org/10.1007/978-981-16-2070-6_7

He, Y., & Wang, J. (2021). Eye tracking uncertainty management in sight translation: Differences between
professional and novice interpreters. In: Muñoz Martín, R., Sun, S., & Li, D. (eds.). Advances in Cognitive
Translation Studies. New Frontiers in Translation Studies, 181–200. Springer, Singapore. doi:
https://doi.org/10.1007/978-981-16-2070-6_9

Jakobsen, A. L., & Jensen, K. T. H. (2008). Eye movement behaviour across four different types of reading
task. In: Göpferich, S., Jakobsen, A. L., & Mees, I. M. (eds.). Looking at eyes: eye-tracking studies of reading
and translation processing, 78–98.

Kasperavičienė, R., Motiejūnienė, J., & Patašienė, I. (2020). Quality assessment of machine translation
output: cognitive evaluation approach in an eye tracking experiment. Texto livre: linguagem e tecnologia,
13(2), 1–16. doi: https://doi.org/10.35699/1983-3652.2020.24399

Kornacki, M. (2019). The application of eye-tracking in translator training. New Insights into Translator
Training. Retrieved October 22, 2025, from https://www.intralinea.org/specials/article/2421

Korpal, P. (2012). On language-pair specificity in sight translation: An eye-tracking study. Übersetzen in
die Zukunft: Tagungsband der 2. Internationalen Fachkonferenz des Bundesverbandes der Dolmetscher und
Übersetzer eV (BDÜ), 522–530. Retrieved October 22, 2025, from https://surl.lu/mubgux

Li, Y., & Zhong, Z. (2024). Visual insights into translation: demystifying trends of adopting eye-tracking
techniques in translation studies. Frontiers in Psychology, 15, 152–168. doi:
https://doi.org/10.3389/fpsyg.2024.1522168

Lörscher, W. (1991). Translation Performance, Translation Process, and Translation Strategies:
A Psycholinguistic Investigation. Tübingen: Gunter Narr.

Ma, X., Li, D., Tsai, J.-L., & Hsu, Y.-Y. (2022). An eye-tracking based investigation into on-line reading
during Chinese-English sight translation: effect of word order asymmetry. Translation & Interpreting: The
International Journal of Translation and Interpreting Research, 14(1), 66–83. doi:
https://dx.doi.org/10.12807/ti.114201.2022.a04

Ma, X. (2021). Coping with syntactic complexity in English-Chinese sight translation by translation and
interpreting students. An eye-tracking investigation. Across Languages and Cultures, 22(2), 192–213. doi:
https://doi.org/10.1556/084.2021.00014

Moorkens, J. (2018). Eye tracking as a measure of cognitive effort for post-editing of machine translation.
In: Walker, C., & Federici, F. M. (eds.). Eye tracking and multidisciplinary studies on translation, 55–69.

Munday, J. (2012). Evaluation in Translation. Critical points of translator decision-making. London,
Routledge. doi: https://doi.org/10.4324/9780203117743

Nitzke, J. (2016). Monolingual post-editing: An exploratory study on research behavior and target text
quality. In: Hansen-Schirra, S., & Grucza, S. (eds.). Eyetracking and applied linguistics, Berlin: Language
Science Press, 83–109.

O’Brien, Sh. (2007). Eye-tracking and translation memory matches. Perspectives: Studies in Translatology,
14. https://doi.org/10.1080/09076760708669037

Pavlović, N., & Jensen, K.T.H. (2009). Eye tracking translation directionality. Translation research
projects, 2, 93–109. Retrieved October 22, 2025, from
https://www.intercultural.urv.cat/media/upload/domain_317/arxius/TP2/jensenpavlovic.pdf

Schaeffer, M., Nitzke, J., Tardel, A., Oster, K., Gutermuth, S., & Hansen-Schirra, S. (2019). Eye-tracking
revision processes of translation students and professional translators. Perspectives, 4, 589–603. doi:
https://doi.org/10.1080/0907676x.2019.1597138

Shreve, G. M., Lacruz, I., & Angelone, E. (2011). Sight translation and speech disfluency performance
analysis as a window to cognitive translation processes. In: Alvstad, C., Hild, A., & Tiselius, E. (eds.).
Methods and strategies of process research: Integrative approaches in translation studies, 93–120.
Amsterdam: John Benjamins.

Sjørup, A.C. (2013). Cognitive effort in metaphor translation: An eye-tracking and key-logging study.
Copenhagen Business School, Frederiksberg. Retrieved October 22, 2025, from
https://www.econstor.eu/handle/10419/208853

Su, W. (2020). Eye-Tracking Processes and Styles in Sight Translation. Springer Singapore. doi:
https://doi.org/10.1007/978-981-15-5675-3


Su, W. (2023). Eye-voice span in sight interpreting: an eye-tracking investigation. Studies in Translation
Theory and Practice, 5, 969–985. doi: https://doi.org/10.1080/0907676X.2023.2171800

Su, W., & Li, D. (2019). Identifying translation problems in English-Chinese sight translation: An
eye-tracking experiment. Translation and Interpreting Studies, 14(1), 110–134. doi:
https://doi.org/10.1075/tis.00033.su

Acknowledgments.
This article is based upon work from COST Action MultiplEYE (CA21131), supported by COST
(European Cooperation in Science and Technology).

Bibliographic Description:
Matvieieva, S., Kasperė, R. (2025). Eye-Tracking Experimental Design for
Investigating Visual Attention in Sight Translation. Scientific Journal of Mykhailo
Dragomanov State University of Ukraine. Series 9. Current Trends in Language Development,
30, 48–62. https://doi.org/10.31392/UDUnc.series9.2025.30.05

A b s t r a c t
Understanding the cognitive processes that underlie sight translation is a significant
challenge in translation studies. This article addresses this issue by proposing a new, detailed
experimental design for eye-tracking studies. The aim is to systematically examine the role of visual
attention during real-time sight translation performance.

Using eye-tracking as a powerful tool, the proposed design offers a unique opportunity to
analyse the complex cognitive mechanisms involved in this process. The methodology provides a
framework for exploring key aspects, in particular how the interpreter’s gaze affects the speed and
quality of translation, as well as the potential for building cognitive profiles (for example,
identifying “chunk-based” and “linear” readers). The study also integrates natural language
processing methods to analyse the alignment of gaze movements and speech and to identify
processing bottlenecks.

This experimental approach is important for several reasons. First, it will make it possible to
refine cognitive models of translation using objective indicators of mental activity. Second, it will
help to identify characteristic attention patterns in experienced and novice interpreters. Third, the
results may contribute to the creation of more human-centred computer-assisted translation systems.
Our study promises to provide a deeper understanding of the link between the interpreter’s visual
behaviour and cognitive processes, which paves the way for advancing both the theory and the
practice of translation.

Keywords: sight translation, eye-tracking, cognitive processes, visual attention, natural
language processing, cognitive models, computer-assisted translation.