Session Report

Discover BEAM X1
Actual duration: 4:07 min · Scenario: platform-demo v1.0
2 April 2026, 13:17 · Completed
During 247 seconds, the following was captured: 245 camera frames, 122 blendshape records (52 muscle values), 49 presence checks, 3 voice comments, and 5 gestures. Face detected in 100% of frames.
This is a demo version. Data accuracy depends on camera quality, face distance, ambient lighting, and noise. For professional results, we recommend a controlled environment with good face and eye visibility.
Device: Chrome 146.0.0.0 · Windows 10/11
Display: 1536 × 864 · 20 cores · 8 GB
Locale: Europe/Prague · cs-CZ
Battery: 100% (charging)
Connection: 4G
Duration: 4:07
Video steps: 10
Comments: 3
Attention (gaze): 69%
Iris tracking (exp.): 3%
Fatigue: 47%
Gestures: 5
Ø Distance: 84 cm

🧠 Session Analysis

The participant completed the entire scenario without interruption and with consistent concentration. The face was detected in 100% of frames, meaning the participant stayed facing the screen for the whole session. The dominant emotion Focused (97.6%), combined with average gaze attention of 69% and a stable screen distance, indicates active and engaged viewing. Interactions via voice tips and a Thumbs Up gesture confirm the participant was not passive but actively engaged with the scenario.
Engagement: 80
🎯 Attention per Segment

Highest gaze attention was measured at Pillar 1: Biometrics (76.2%) and Introduction (74.2%). The lowest was at Pillar 2: Hardware (60.9%). This suggests the biometrics topic engaged the participant significantly more than the technical hardware description.

Introduction · 74.2%
Pillar 1: Biometrics · 76.2%
Tip: data points · 73.9%
Answer: points · 68.9%
Pillar 2: Hardware · 60.9%
Tip: devices · 63.1%
Answer: devices · 67.7%
Pillar 3: Adaptivity · 68.3%
Choice · 68.4%
Conclusion · 64.8%
😴 Fatigue Progression

Fatigue gradually increased from 35% at the start to 55% at Pillar 2: Hardware, the longest video segment after the introduction. Interestingly, at Pillar 3: Adaptivity, fatigue dropped slightly to 48.6%, suggesting this topic re-energized the participant. The session minimum (15%) appeared as a brief fluctuation around the 182nd second.

[Fatigue chart: Introduction · Pillar 1 · Pillar 2 · Pillar 3 · Conclusion]
😊 Emotional Map

Emotions Happy (4 frames) and Neutral (2 frames) appeared exclusively during Pillar 1: Biometrics (41–69s). This precisely correlates with the Thumbs Up gesture captured at 58–62s. This segment triggered the strongest positive reaction of the entire session.

The rest of the scenario ran in Focused mode — stable concentration without significant emotional fluctuations. This is consistent with the information-dense content of the remaining pillars.

🎙 Interactions and Reactions

The participant answered both tip questions by voice:

Tip 1 (data points/s): answer "22", reaction time 4,977 ms
Tip 2 (device types): answer "50", reaction time 4,755 ms

Very similar reaction times (~5 s) indicate consistent, active thinking before answering rather than random guessing. For the final choice, the participant said "chci do reportu" ("I want to go to the report"), a natural formulation instead of a bare "no", indicating understanding of the scenario context.

📏 Physical Behavior

Screen distance ranged from 81–92 cm with an average of 84 cm. A slight distancing trend (from 83 cm at start to 92 cm at end) is a natural manifestation of gradual relaxation during a longer session.
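This drift can be checked with a simple least-squares fit over the per-frame distance samples. A minimal Python sketch, where synthetic, perfectly linear samples stand in for the real 245-frame series:

```python
def distance_drift(t, d):
    """Least-squares slope of distance vs time, scaled to total session drift (cm)."""
    n = len(t)
    mt, md = sum(t) / n, sum(d) / n
    slope = (sum((ti - mt) * (di - md) for ti, di in zip(t, d))
             / sum((ti - mt) ** 2 for ti in t))
    return slope * (t[-1] - t[0])

# synthetic stand-in: one sample per second, drifting from 83 cm to 92 cm
t = list(range(248))                      # 0..247 s
d = [83 + 9 * ti / 247 for ti in t]
print(round(distance_drift(t, d), 1))     # → 9.0
```

On real, noisy samples the fitted drift and the raw start/end difference can diverge, which is why a trend fit is the more robust measure.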

Head movement was minimal — averaging ±3° pitch and ±4° yaw. Combined with low blink rate (5/min, norm 15–20/min), this indicates high visual fixation on the screen. Lower blinking may partly be a detection artifact, but also corresponds to a state of concentration.

📊 Overall Participant Profile

Type: Focused, calm, analytical observer. Watched content without distraction, responded to interactive elements thoughtfully, showed no signs of impatience or loss of interest.

Highlight: Biometrics (Pillar 1) was clearly the most interesting topic — highest attention, only occurrence of positive emotions, approval gesture. Hardware (Pillar 2) was conversely the least engaging segment with highest fatigue and lowest attention.

Content recommendation: Consider shortening or revitalizing the hardware segment. The biometrics section serves as the main "hook" of the scenario.

🔬 Advanced Analysis — Contextual Metrics

👁 Cognitive Load (blink suppression)

Measured blink rate 5/min is significantly below the physiological norm of ~20 blinks/min. Research (Magliacano et al., 2020; Frontiers in Human Neuroscience, 2017) shows that lower blink frequency during visual tasks correlates with higher cognitive load — the brain suppresses blinking to minimize loss of visual input.

Of 8 detected blinks, 5 occurred during Pillar 1 (53–66s), followed by silence until Pillar 3 (182s, 186s) and the conclusion (222s). This clustering matches the known timing of spontaneous blinks, which naturally concentrate in moments of lower cognitive load or at transitions between segments.
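The clustering can be made explicit by grouping blink timestamps whose gaps fall under a threshold. A sketch with illustrative timestamps consistent with the narrative above (the 10 s gap is an assumed parameter, not the platform's):

```python
def blink_clusters(times, max_gap=10.0):
    """Group sorted blink timestamps into clusters separated by > max_gap seconds."""
    clusters, current = [], [times[0]]
    for t in times[1:]:
        if t - current[-1] <= max_gap:
            current.append(t)
        else:
            clusters.append(current)
            current = [t]
    clusters.append(current)
    return clusters

# illustrative timestamps: 5 blinks in Pillar 1, 2 in Pillar 3, 1 at the conclusion
blinks = [53, 56, 59, 62, 66, 182, 186, 222]
print([len(c) for c in blink_clusters(blinks)])  # → [5, 2, 1]
```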

Ref: Magliacano et al. (2020), Neuroscience Letters; Frontiers in Human Neuroscience (2017), doi:10.3389/fnhum.2017.00620

💪 Facial Muscle Analysis (FACS)

Of 52 ARKit blendshapes, consistently elevated values show:

browDown L/R (AU4): avg 0.54 / 0.58 — lowered eyebrows, typical concentration marker
eyeSquint L/R (AU7): avg 0.32 / 0.33 — narrowed eyes, focused gaze
mouthPress L/R (AU24): avg 0.12 / 0.11 — pressed lips, tension during thinking
mouthSmile L/R (AU12): avg 0.016, max 0.891 — one brief but intense smile

The AU4 + AU7 combination is a classic FACS (Facial Action Coding System) pattern for "concentrated examination". High max mouthSmile (0.89) confirms a moment of joy during Pillar 1.
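Per-channel averages and maxima of this kind come straight from the 122 blendshape records. A minimal sketch (records and key names are illustrative; real records carry 52 channels each):

```python
def blendshape_stats(records, key):
    """Average and maximum of one blendshape channel across all records."""
    values = [r[key] for r in records]
    return sum(values) / len(values), max(values)

# illustrative records: mostly flat with one strong smile frame
records = [
    {"mouthSmile_L": 0.01}, {"mouthSmile_L": 0.89},
    {"mouthSmile_L": 0.02}, {"mouthSmile_L": 0.00},
]
avg, peak = blendshape_stats(records, "mouthSmile_L")
print(round(avg, 3), peak)  # → 0.23 0.89
```

The same average/max split used in the list above is what separates a sustained marker (high average, e.g. browDown) from a momentary event (low average, high max, e.g. mouthSmile).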

📈 Attention Variability (gaze std)

Standard deviation of gaze attention reveals engagement quality — higher std = more dynamic attention shifting = more active content processing:

Pillar 1: Biometrics: std = 8.2 (range 59–91) — most dynamic, active processing
Answer 1: std = 11.0 (range 51–85) — highest variability, processing new information
Pillar 2: Hardware: std = 5.2 (range 51–84) — lowest variability, monotonous watching
Tip 1 (question): std = 5.0 — stable focus during answer deliberation

Low variability at Pillar 2 supports the conclusion that the hardware topic did not trigger active cognitive processing unlike biometrics.
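The per-segment std values can be reproduced by slicing the gaze samples by segment time window. A sketch using Python's statistics module with synthetic samples (real segments span the timeline windows listed later in this report):

```python
from statistics import pstdev

def segment_std(samples, segments):
    """Population standard deviation of gaze attention per named segment.

    samples:  list of (t_seconds, attention) pairs
    segments: dict of name -> (start_s, end_s)
    """
    return {
        name: pstdev([a for t, a in samples if lo <= t < hi])
        for name, (lo, hi) in segments.items()
    }

# synthetic samples: a dynamic segment vs a flat one
samples = [(0, 70), (1, 80), (2, 90), (5, 60), (6, 60)]
segments = {"dynamic": (0, 5), "flat": (5, 10)}
stds = segment_std(samples, segments)
print({k: round(v, 1) for k, v in stds.items()})  # → {'dynamic': 8.2, 'flat': 0.0}
```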

🔄 Significant Head Movements

For most of the session, the head moved within ±1–2°. However, two notable deviations reveal interesting moments:

t = 92.9s: yaw spike to +22° — sudden head turn to the right. Occurred at the transition from Pillar 1 to Pillar 2. Possibly a reaction to room sound or physical position adjustment.
t = 154.9–155.9s: yaw +8° — shorter turn at the end of Pillar 2 reveal, again at segment transition.

Average movement per segment: Introduction 0.2° (nearly motionless) vs Answer 1: 2.17° (10× more) — processing the answer triggered a physical reaction.
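A per-segment movement average of this kind reduces to the mean absolute frame-to-frame yaw change within the segment's time window. A sketch with a synthetic yaw trace:

```python
def mean_abs_motion(samples, lo, hi):
    """Mean absolute frame-to-frame yaw change (degrees) within [lo, hi) seconds."""
    seg = [(t, y) for t, y in samples if lo <= t < hi]
    deltas = [abs(y1 - y0) for (_, y0), (_, y1) in zip(seg, seg[1:])]
    return sum(deltas) / len(deltas)

# synthetic yaw trace: a near-motionless segment vs an active one
samples = [(0, 0.0), (1, 0.2), (2, 0.0), (10, 0.0), (11, 2.0), (12, -2.0)]
print(mean_abs_motion(samples, 0, 5), mean_abs_motion(samples, 10, 15))  # → 0.2 3.0
```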

📊 Correlation: fatigue ↔ attention

Pearson correlation between fatigue and gaze attention: r = −0.448 (n = 245). This is a medium negative correlation — as fatigue increases, attention decreases, which is an expected physiological pattern.

This means both indicators measure a consistent phenomenon and mutually validate each other. At the same time, the correlation is not too high (−0.45 vs −1.0), showing that fatigue is not the only factor affecting attention — content attractiveness also plays a role (see differences between pillars).
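The r = −0.448 figure follows from a plain Pearson computation over the two 245-sample series. A minimal sketch (a synthetic five-point series for illustration; the real session uses all 245 frames):

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# synthetic: attention falls as fatigue rises, with a little noise
fatigue   = [35, 40, 45, 50, 55]
attention = [75, 70, 71, 63, 61]
print(round(pearson_r(fatigue, attention), 2))  # → -0.95
```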

Distraction and Recovery Detection

The algorithm identified 9 attention drops > 15 points and 10 recovery spikes within the session. Most interesting patterns:

t = 92.9s: 32-point drop (largest) — correlates with sudden head turn (yaw +22°)
t = 126.9s and 136.9s: 23-point drops — both in Pillar 2: Hardware, confirming lower engagement
t = 218.9s: 22-point drop → immediate +27-point recovery at t = 219s — momentary distraction with rapid self-regulation

Rapid attention recoveries after drops indicate high self-regulation ability — the participant always quickly returned to the content.
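The drop/recovery detection reduces to thresholding frame-to-frame attention deltas. A sketch of one plausible version of such an algorithm (the 15-point threshold comes from the text above; the platform's exact logic may differ):

```python
def find_events(series, threshold=15):
    """Flag frame-to-frame attention drops and recoveries larger than threshold.

    series: list of (t_seconds, attention_percent) pairs, time-ordered.
    Returns (drops, recoveries) as lists of (t, delta).
    """
    drops, recoveries = [], []
    for (t0, a0), (t1, a1) in zip(series, series[1:]):
        delta = a1 - a0
        if delta <= -threshold:
            drops.append((t1, delta))
        elif delta >= threshold:
            recoveries.append((t1, delta))
    return drops, recoveries

# synthetic trace: a 22-point drop followed by an immediate 27-point recovery
trace = [(217.9, 70), (218.9, 48), (219.9, 75), (220.9, 74)]
drops, recov = find_events(trace)
print(drops, recov)  # → [(218.9, -22)] [(219.9, 27)]
```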

📐 Ergonomics and Distance

Average distance 84 cm is almost exactly at the Resting Point of Accommodation (~80 cm per CCOHS), which is the distance at which eye muscles require no effort to focus. OSHA ergonomic recommendation states an ideal range of 50–100 cm.

Session drift: +1.6 cm (from 84 to 85.6 cm) — minimal distancing is a natural manifestation of relaxation, not loss of interest. At session end (from 236s), distance increased to 87–92 cm, correlating with the final video phase and transition to report.

Ref: CCOHS Monitor Positioning Guidelines; OSHA eTools Computer Workstations

🎤 Voice Activity and Ambient

Of 246 microphone samples, speech was detected in only one window (t = 79–80s, during the answer "22"), while non-zero volume appeared 37 times. This indicates a quiet environment with occasional ambient noise.

Dominant frequency during speech: 750 Hz — within the typical male speech band (fundamental ~85–180 Hz plus harmonics). RMS during speech (124) versus ambient (5–6) shows clear signal-to-noise separation.

Notably, 3 voice comments were recognized by speech-to-text (confidence 93%), but the speaking detector captured only the first — an indication that the speaking-detection threshold needs calibration.
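A speaking detector of this kind is often a simple energy gate on RMS. A hypothetical sketch showing why the reported values (ambient RMS 5–6 vs speech RMS 124) separate cleanly; the noise floor and ratio here are assumed parameters, not the platform's:

```python
def is_speaking(rms, noise_floor=6.0, ratio=4.0):
    """Simple energy gate: flag speech when RMS exceeds ratio x noise floor."""
    return rms > noise_floor * ratio

# ambient samples sit at RMS 5-6; the spoken answer peaked at RMS 124
print([is_speaking(r) for r in [5, 6, 124, 30]])  # → [False, False, True, True]
```

If the real gate is set too high, quieter utterances pass speech-to-text but never trip the detector, which would explain the 1-of-3 capture rate.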

This analysis is derived exclusively from collected biometric data and scenario context. This is not AI-generated content — it is a deterministic evaluation of BEAM X1 platform metrics. The advanced section references published research: Magliacano et al. (2020) — blink rate and cognitive load; Frontiers in Human Neuroscience (2017) — blink rate variability; Ekman & Friesen FACS — facial action coding; CCOHS & OSHA — ergonomic standards for monitor distance. Interpretation is indicative and depends on captured data quality.

Session Overview

Sensor Status

Camera — active from start
Microphone — active from start
Speaker — volume 90%
MediaPipe — started in 0.9s
Face detection — 100% (245/245)
Presence checks — 49× (always 1 person)

Biometric Summary

🎯 Attention (gaze): 69% (range 51–91%)
👁 Iris tracking (exp.): 3% (range 0–56%)
📏 Distance: 84 cm (range 81–92 cm)
👀 Blink rate: 5/min
😴 Fatigue: 47% (range 15–55%)
🔄 Head movement: pitch ±3°, yaw ±4°

✋ Detected Gestures (5)

👍 Thumbs Up 57.9s – 61.9s

Video timeline

5.0s · Introduction (video) · 36s · welcome-intro
41.1s · Pillar 1: Biometrics (video) · 28s · pillar-1-biometrics
68.8s · Tip: data points/s (voice-tip) · 14s · pillar-1-question
  🎙 Tip: "22" · Reaction time: 4,977 ms
82.4s · Answer: data points (video) · 16s · pillar-1-reveal
98.8s · Pillar 2: Hardware (video) · 32s · pillar-2-hardware
131.0s · Tip: device types (voice-tip) · 13s · pillar-2-question
  🎙 Tip: "50" · Reaction time: 4,755 ms
143.7s · Answer: device types (video) · 17s · pillar-2-reveal
160.5s · Pillar 3: Adaptivity (video) · 42s · pillar-3-adaptivity
203.0s · Choice: want to know more? (voice-choice) · 22s · closing-question
  🗣 Choice: NO (0× warnings)
  🎙 "chci do reportu"
224.7s · Transition to report (video) · 20s · closing-to-report

Voice Comments (3)

80.5s · "22" · 93%
142.3s · "50" · 93%
223.8s · "chci do reportu" · 93%

Emotion Distribution

🔵 Focused: 97.6%
😊 Happy: 1.6%
😐 Neutral: 0.8%
Emotions are derived from facial blendshapes (facial muscle values), not from an AI model.
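A deterministic blendshape-to-emotion mapping of this kind can be as simple as thresholding a few channels. A hypothetical sketch (key names and thresholds are illustrative, not the platform's actual rules):

```python
def classify_emotion(bs):
    """Rule-based emotion from blendshape activations (0..1). No ML model.

    Thresholds and key names are illustrative only.
    """
    smile = (bs.get("mouthSmile_L", 0) + bs.get("mouthSmile_R", 0)) / 2
    brow = (bs.get("browDown_L", 0) + bs.get("browDown_R", 0)) / 2
    if smile > 0.5:
        return "Happy"
    if brow > 0.4:
        return "Focused"
    return "Neutral"

print(classify_emotion({"mouthSmile_L": 0.9, "mouthSmile_R": 0.85}))  # → Happy
print(classify_emotion({"browDown_L": 0.54, "browDown_R": 0.58}))     # → Focused
print(classify_emotion({}))                                           # → Neutral
```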

Biometric Charts

[Chart: Attention (gaze) and Fatigue over time, 0–247s, 0–100%]
[Chart: Screen distance, 0–247s, 75–95 cm]
[Chart: Blink rate (cumulative/min), 0–247s, 0–6/min]
[Chart: Head movement, pitch (nodding) / yaw (turning), 0–247s, ±25°]

Session Log (39 events)

Presence Detection (49 checks)

Every ~5s, a check of the number of people in the frame was performed. In all 49 checks, exactly 1 person was detected.

Collected Data Volume

📸 Camera frames: 245 (~1/s)
😀 Emotion records: 245 (~1/s)
🎯 Attention records: 245 (~1/s)
📏 Distance records: 245 (~1/s)
😴 Fatigue records: 245 (~1/s)
💪 Blendshape records: 122 (52 values/record)
📍 Landmark records: 122 (478 points/record)
🎤 Microphone records: 246 (vol/rms/freq/zcr)
📷 Periodic photos: 122 (~1 per 2 s)
🖼 Face photos: 12 (at key moments)