HNR-Fixed Full-Hybrid RAFT-Edit Audio Demo

Cohort-stratified examples from the post-HNR-fix full-hybrid model. Each row uses one selected source utterance and matched single-axis slider edits.

Age and perceived gender presentation are model-estimated pseudo-labels, not demographic ground truth.
Pitch uses median log-F0; voice quality uses the corrected HNR extraction that removes invalid Praat floor values.
Examples were selected within each cohort for bidirectional response, identity retention, and low off-target movement.
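
The per-edit numbers reported below (value, Δ, cos) map onto these selection criteria. The following is a minimal sketch of how one row could be checked, not the procedure actually used for selection, which is not described here. The helper name `summarize_row` is hypothetical, and reading `cos` as similarity to the source speaker's identity embedding is an assumption. The data are the 20s M age row.

```python
def summarize_row(values, cos):
    """Summarize one demo row.

    values, cos: dicts keyed by slider position (-2.0 ... +2.0).
    Returns (deltas, bidirectional, min_edit_cos).
    """
    base = values[0.0]  # slider 0.0 is the unedited reconstruction
    deltas = {s: round(v - base, 3) for s, v in values.items()}
    # Bidirectional response: negative sliders lower the value,
    # positive sliders raise it.
    bidirectional = (
        all(d < 0 for s, d in deltas.items() if s < 0)
        and all(d > 0 for s, d in deltas.items() if s > 0)
    )
    # Identity retention proxy (assumed): worst cosine among edited outputs.
    min_edit_cos = min(c for s, c in cos.items() if s != 0.0)
    return deltas, bidirectional, min_edit_cos


# 20s M, age manipulation (numbers copied from the age table).
age_values = {-2.0: 0.212, -1.0: 0.276, 0.0: 0.424, 1.0: 0.658, 2.0: 0.737}
age_cos = {-2.0: 0.62, -1.0: 0.66, 0.0: 0.96, 1.0: 0.71, 2.0: 0.65}

deltas, bidirectional, min_edit_cos = summarize_row(age_values, age_cos)
print(deltas, bidirectional, min_edit_cos)
```

Deltas recomputed from the rounded values can differ from the listed Δ in the last digit (for example 0.234 here versus the listed +0.235), presumably because the page computes Δ before rounding.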

Age manipulation

Age pseudo-label response from the external speech age predictor.

| Cohort | Source / reconstruction | -2.0 | -1.0 | 0.0 | +1.0 | +2.0 |
| --- | --- | --- | --- | --- | --- | --- |
| 20s M | value 0.424 · cos 0.96; GLOBE · GLOBE::train::S_020558::00123229_000000.v2.vad | value 0.212 · Δ -0.212 · cos 0.62 | value 0.276 · Δ -0.148 · cos 0.66 | value 0.424 · Δ +0.000 · cos 0.96 | value 0.658 · Δ +0.235 · cos 0.71 | value 0.737 · Δ +0.313 · cos 0.65 |
| 20s F | value 0.468 · cos 0.96; LibriTTS · LibriTTS::7177::7177_258965_000015_000001 | value 0.238 · Δ -0.230 · cos 0.57 | value 0.313 · Δ -0.155 · cos 0.63 | value 0.468 · Δ +0.000 · cos 0.96 | value 0.633 · Δ +0.165 · cos 0.70 | value 0.745 · Δ +0.277 · cos 0.63 |
| 40s M | value 0.583 · cos 0.99; GLOBE · GLOBE::train::S_002060::00057675_000001.v2.vad | value 0.195 · Δ -0.388 · cos 0.70 | value 0.470 · Δ -0.113 · cos 0.81 | value 0.583 · Δ +0.000 · cos 0.99 | value 0.690 · Δ +0.107 · cos 0.73 | value 0.726 · Δ +0.143 · cos 0.67 |
| 40s F | value 0.458 · cos 0.93; GLOBE · GLOBE::train::S_000020::00454691_000017.v2.vad | value 0.224 · Δ -0.234 · cos 0.49 | value 0.420 · Δ -0.038 · cos 0.59 | value 0.458 · Δ +0.000 · cos 0.93 | value 0.594 · Δ +0.136 · cos 0.74 | value 0.759 · Δ +0.301 · cos 0.59 |
| 60s M | value 0.597 · cos 0.98; GLOBE · GLOBE::train::S_002429::00306976_000012.v2.vad | value 0.242 · Δ -0.355 · cos 0.59 | value 0.456 · Δ -0.141 · cos 0.78 | value 0.597 · Δ +0.000 · cos 0.98 | value 0.739 · Δ +0.142 · cos 0.67 | value 0.737 · Δ +0.140 · cos 0.59 |
| 60s F | value 0.592 · cos 0.90; LibriTTS · LibriTTS::8778::8778_246974_000024_000009 | value 0.193 · Δ -0.399 · cos 0.46 | value 0.505 · Δ -0.087 · cos 0.65 | value 0.592 · Δ +0.000 · cos 0.90 | value 0.768 · Δ +0.176 · cos 0.61 | value 0.742 · Δ +0.150 · cos 0.56 |

Perceived gender presentation manipulation

Model-predicted male-presentation probability. This is a pseudo-label, not demographic ground truth.

| Cohort | Source / reconstruction | -2.0 | -1.0 | 0.0 | +1.0 | +2.0 |
| --- | --- | --- | --- | --- | --- | --- |
| 20s M | value 0.860 · cos 1.00; GLOBE · GLOBE::train::S_011895::00338957_000002.v2.vad | value 0.004 · Δ -0.856 · cos 0.81 | value 0.049 · Δ -0.810 · cos 0.62 | value 0.860 · Δ +0.000 · cos 1.00 | value 0.994 · Δ +0.135 · cos 0.59 | value 0.995 · Δ +0.136 · cos 0.61 |
| 20s F | value 0.005 · cos 0.90; VoxCeleb1 · VoxCeleb1::train::id10258::23dSOm3axoU::00003 | value 0.003 · Δ -0.002 · cos 0.72 | value 0.003 · Δ -0.002 · cos 0.66 | value 0.005 · Δ +0.000 · cos 0.90 | value 0.019 · Δ +0.014 · cos 0.58 | value 0.988 · Δ +0.983 · cos 0.70 |
| 40s M | value 0.975 · cos 0.95; AESRC · AESRC::Indian_G1757::G1757S2334 | value 0.005 · Δ -0.970 · cos 0.84 | value 0.865 · Δ -0.110 · cos 0.58 | value 0.975 · Δ +0.000 · cos 0.95 | value 0.995 · Δ +0.020 · cos 0.57 | value 0.996 · Δ +0.021 · cos 0.57 |
| 40s F | value 0.094 · cos 0.99; VoxCeleb1 · VoxCeleb1::train::id10968::t3z1N9QWI_8::00005 | value 0.002 · Δ -0.092 · cos 0.63 | value 0.002 · Δ -0.092 · cos 0.67 | value 0.094 · Δ +0.000 · cos 0.99 | value 0.981 · Δ +0.887 · cos 0.74 | value 0.978 · Δ +0.884 · cos 0.65 |
| 60s M | value 0.983 · cos 1.00; LibriTTS · LibriTTS::4179::4179_25937_000039_000002 | value 0.000 · Δ -0.983 · cos 0.66 | value 0.207 · Δ -0.776 · cos 0.80 | value 0.983 · Δ +0.000 · cos 1.00 | value 0.994 · Δ +0.011 · cos 0.75 | value 0.989 · Δ +0.006 · cos 0.69 |
| 60s F | value 0.001 · cos 1.00; GLOBE · GLOBE::train::S_013904::00115359_000001.v2.vad | value 0.001 · Δ -0.001 · cos 0.64 | value 0.001 · Δ -0.001 · cos 0.65 | value 0.001 · Δ +0.000 · cos 1.00 | value 0.065 · Δ +0.063 · cos 0.65 | value 0.970 · Δ +0.969 · cos 0.65 |

Pitch manipulation

Habitual pitch response measured with median log-F0.
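
As a rough illustration of the metric, the sketch below computes a median log-F0 from a per-frame F0 track. It assumes F0 in Hz, the natural logarithm, and that unvoiced frames are marked as 0 or NaN; none of these details is stated on this page, although reconstruction values near 4.9 are consistent with ln of roughly 130 Hz. The function name `median_log_f0` is hypothetical.

```python
import math
import statistics


def median_log_f0(f0_hz):
    """Median of ln(F0) over voiced frames.

    f0_hz: iterable of per-frame F0 values in Hz; frames that are
    0, negative, or NaN are treated as unvoiced (an assumption).
    """
    voiced = [f for f in f0_hz if math.isfinite(f) and f > 0]
    if not voiced:
        return math.nan  # no voiced frames, metric undefined
    return statistics.median(math.log(f) for f in voiced)


# Toy track with two unvoiced frames and one missing value.
track = [0.0, 120.0, 130.0, 0.0, 140.0, float("nan")]
print(median_log_f0(track))  # ln(130), the middle voiced frame
```

Under the natural-log assumption, a Δ in this table converts to a frequency ratio as exp(Δ), so a Δ of ln 2 ≈ 0.693 corresponds to one octave.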

| Cohort | Source / reconstruction | -2.0 | -1.0 | 0.0 | +1.0 | +2.0 |
| --- | --- | --- | --- | --- | --- | --- |
| 20s M | value 4.888 · cos 0.96; GLOBE · GLOBE::train::S_020558::00123229_000000.v2.vad | value 4.428 · Δ -0.461 · cos 0.70 | value 4.516 · Δ -0.373 · cos 0.70 | value 4.888 · Δ +0.000 · cos 0.96 | value 5.538 · Δ +0.649 · cos 0.64 | value 5.617 · Δ +0.729 · cos 0.65 |
| 20s F | value 5.033 · cos 0.99; LibriTTS · LibriTTS::3889::3889_9915_000007_000001 | value 4.418 · Δ -0.614 · cos 0.77 | value 4.479 · Δ -0.553 · cos 0.82 | value 5.033 · Δ +0.000 · cos 0.99 | value 5.466 · Δ +0.433 · cos 0.65 | value 5.637 · Δ +0.604 · cos 0.73 |
| 40s M | value 4.890 · cos 0.99; NaturalVoices · NaturalVoices::MSP-PODCAST_3298::MSP-PODCAST_3298_211 | value 4.001 · Δ -0.888 · cos 0.73 | value 4.496 · Δ -0.393 · cos 0.67 | value 4.890 · Δ +0.000 · cos 0.99 | value 5.308 · Δ +0.418 · cos 0.70 | value 5.478 · Δ +0.588 · cos 0.66 |
| 40s F | value 4.787 · cos 0.89; NaturalVoices · NaturalVoices::MSP-PODCAST_0478::MSP-PODCAST_0478_1 | value 4.358 · Δ -0.429 · cos 0.68 | value 4.459 · Δ -0.328 · cos 0.67 | value 4.787 · Δ +0.000 · cos 0.89 | value 5.010 · Δ +0.223 · cos 0.55 | value 5.569 · Δ +0.783 · cos 0.73 |
| 60s M | value 4.791 · cos 0.96; LibriTTS · LibriTTS::2660::2660_173260_000014_000001 | value 4.292 · Δ -0.499 · cos 0.67 | value 4.336 · Δ -0.455 · cos 0.65 | value 4.791 · Δ +0.000 · cos 0.96 | value 5.203 · Δ +0.411 · cos 0.65 | value 5.421 · Δ +0.630 · cos 0.73 |
| 60s F | value 5.133 · cos 0.87; VoxCeleb1 · VoxCeleb1::train::id10693::EW4Cxe52kL4::00006 | value 4.391 · Δ -0.742 · cos 0.68 | value 4.463 · Δ -0.670 · cos 0.70 | value 5.133 · Δ +0.000 · cos 0.87 | value 5.560 · Δ +0.427 · cos 0.63 | value 5.690 · Δ +0.557 · cos 0.63 |

HNR / voice-quality manipulation

Voice-quality response measured with the corrected median HNR estimator.
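
To illustrate why invalid floor values matter for a median HNR, here is a minimal sketch under stated assumptions. Praat's Harmonicity analysis reports -200 dB for frames where no harmonicity can be measured; the sketch assumes that this is the "floor" the correction removes and that HNR is in dB. The exact filter used for this demo is not described here, and the names `median_hnr_db` and `PRAAT_FLOOR_DB` are hypothetical.

```python
import math
import statistics

PRAAT_FLOOR_DB = -200.0  # Praat's placeholder for frames without measurable HNR


def median_hnr_db(frame_hnr_db, floor_db=PRAAT_FLOOR_DB):
    """Median HNR over frames with a valid measurement.

    Frames at or below floor_db (and non-finite frames) are dropped.
    Genuinely negative HNR values such as -1.7 dB are kept.
    """
    valid = [h for h in frame_hnr_db if math.isfinite(h) and h > floor_db]
    if not valid:
        return math.nan  # nothing measurable in this utterance
    return statistics.median(valid)


# Toy utterance: more than half of the frames carry the floor value.
frames = [-200.0, 8.0, -200.0, 12.0, 10.0, -200.0, -200.0]
naive = statistics.median(frames)   # dominated by the floor placeholder
corrected = median_hnr_db(frames)   # median of 8, 10, 12
print(naive, corrected)
```

In this toy case the naive median is the floor placeholder itself, while the filtered median reflects only the measured frames. The threshold is applied strictly at the placeholder so that low but real HNR values, including negative ones like those that appear in the table, are not discarded.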

| Cohort | Source / reconstruction | -2.0 | -1.0 | 0.0 | +1.0 | +2.0 |
| --- | --- | --- | --- | --- | --- | --- |
| 20s M | value 5.838 · cos 0.91; GLOBE · GLOBE::train::S_007523::00435966_000002.v2.vad | value 2.415 · Δ -3.422 · cos 0.83 | value 1.476 · Δ -4.362 · cos 0.67 | value 5.838 · Δ +0.000 · cos 0.91 | value 9.970 · Δ +4.132 · cos 0.68 | value 9.867 · Δ +4.029 · cos 0.58 |
| 20s F | value 5.616 · cos 0.97; GLOBE · GLOBE::train::S_004523::00062671_000000.v2.vad | value 3.988 · Δ -1.628 · cos 0.73 | value 3.515 · Δ -2.101 · cos 0.55 | value 5.616 · Δ +0.000 · cos 0.97 | value 9.981 · Δ +4.366 · cos 0.78 | value 15.009 · Δ +9.393 · cos 0.68 |
| 40s M | value 2.295 · cos 0.99; GLOBE · GLOBE::train::S_002060::00057675_000001.v2.vad | value 1.479 · Δ -0.816 · cos 0.71 | value 1.377 · Δ -0.918 · cos 0.69 | value 2.295 · Δ +0.000 · cos 0.99 | value 5.167 · Δ +2.872 · cos 0.71 | value 6.843 · Δ +4.548 · cos 0.64 |
| 40s F | value 4.715 · cos 0.90; LibriTTS · LibriTTS::6782::6782_61316_000007_000010 | value 4.168 · Δ -0.547 · cos 0.83 | value 3.704 · Δ -1.010 · cos 0.64 | value 4.715 · Δ +0.000 · cos 0.90 | value 9.576 · Δ +4.861 · cos 0.75 | value 19.644 · Δ +14.930 · cos 0.62 |
| 60s M | value 3.917 · cos 0.98; GLOBE · GLOBE::val::S_010055::00002970_000003.v2.vad | value 1.424 · Δ -2.493 · cos 0.84 | value 0.857 · Δ -3.060 · cos 0.70 | value 3.917 · Δ +0.000 · cos 0.98 | value 5.800 · Δ +1.883 · cos 0.83 | value 11.297 · Δ +7.380 · cos 0.72 |
| 60s F | value 7.832 · cos 1.00; GLOBE · GLOBE::train::S_013904::00115359_000001.v2.vad | value 6.753 · Δ -1.079 · cos 0.68 | value -1.663 · Δ -9.495 · cos 0.67 | value 7.832 · Δ +0.000 · cos 1.00 | value 11.334 · Δ +3.502 · cos 0.80 | value 20.659 · Δ +12.827 · cos 0.64 |