Abstract
JWST has revealed a set of high-redshift anomalies that appear disparate in detail but share a common structure: star formation efficiencies exceeding $\Lambda$CDM limits, overmassive black holes, and anomalous stellar-to-dynamical mass ratios all appear preferentially in the deepest gravitational potentials. This work tests whether that common pattern can arise from a single violation of the isochrony axiom. In the Temporal Equivalence Principle (TEP), a chameleon-screened scalar-tensor theory, proper time depends on environment in unscreened halos. Using the external Cepheid prior $\alpha_0 = 0.58 \pm 0.16$ with no JWST retuning, the framework fully resolves the Red Monster efficiency excess and provides a physical route to differential black-hole growth in Little Red Dots.
The strongest current direct test is a kinematic comparison using the JWST-SUSPENSE survey of massive quiescent galaxies at $z = 1.2$–$2.3$ ($N = 15$). A fundamental vulnerability of evaluating TEP photometrically is mass-proxy circularity, as $\Gamma_t$ depends on the gravitational potential. By employing dynamically measured masses ($M_{\rm dyn}$) from stellar velocity dispersions and spectral ages derived from absorption features, the SUSPENSE analysis tests a dynamical-potential predictor and photometric stellar mass side by side. The central comparison shows that $\Gamma_t$ predicts spectral age more strongly than stellar mass, yielding $\rho({\rm Age}, \Gamma_t \mid z) = +0.733$ ($p = 1.9 \times 10^{-3}$) compared to $\rho({\rm Age}, M_* \mid z) = +0.493$ ($p = 0.062$). Under joint control of the competing predictor and redshift, $\Gamma_t$ retains a residual association with age, $\rho({\rm Age}, \Gamma_t \mid M_*, z) = +0.624$ ($p = 0.0129$), whereas stellar mass contributes no residual signal once $\Gamma_t$ is controlled, $\rho({\rm Age}, M_* \mid \Gamma_t, z) = -0.036$ ($p = 0.898$). Propagating the published asymmetric uncertainties for all 15 galaxies preserves a positive $\Gamma_t$ residual in 99.7\% of Monte Carlo draws. This one-sided residual structure supports the interpretation that galaxy evolution scales more closely with gravitational potential depth than with baryonic mass alone, and it materially narrows the photometric circularity objection.
The primary large-sample JWST evidence comes from the photometric anomalies that motivated the theory, treated here as two primary empirical lines across three surveys ($N = 4{,}726$). A key model-discriminating result is the Uniformity Paradox: dust and accelerated evolution switch on selectively with potential depth ($\rho = +0.62$ at $z > 8$). Any standard-physics explanation that adjusts a time-uniform ingredient, such as enhanced AGB yields, would predict dust to become broadly ubiquitous or to follow star formation, rather than tracking gravitational depth. The effective-time coordinate organizes this dust signal better than raw cosmic time, passing a dedicated validation battery with $\rho(t_{\rm eff}, A_V \mid t_{\rm cosmic}) = +0.600$ ($p = 5.0 \times 10^{-29}$). Similarly, the mass–sSFR relation at $z > 7$ shows a strong partial correlation with $\Gamma_t$ ($\rho = -0.49$, $p = 10^{-18}$).
The same mapping also relieves the benchmark stellar-mass-function and cosmic-SFRD excesses. Full CAMB Boltzmann integration remains consistent with Planck constraints ($\sigma_8$ within $0.1\sigma$). The combination of the large-sample photometric lines, the direct kinematic comparison, and the structural necessity of the Uniformity Paradox supports TEP as the most coherent presently available explanation for the early-universe anomalies considered here.
Keywords: Cosmology: early universe – Galaxies: high-redshift – Galaxies: evolution – Gravitation – Scalar-tensor theories – Infrared: galaxies
1. Introduction
1.1 Observational Tensions
JWST has revealed a coherent pattern of anomalies at $z > 5$ that strains the standard framework for inferring stellar properties from photometry. The most visible example is the class of spectroscopically confirmed "Red Monsters" (Xiao et al. 2024), whose stellar masses ($M_* \gtrsim 10^{11}\,M_\odot$) imply baryon-to-star conversion efficiencies of $\sim 0.50$, more than double the $\sim 0.20$ theoretical maximum imposed by feedback in $\Lambda$CDM halos. Within the Boylan-Kolchin (2023) framework, the discrepancy reaches $11\sigma$. This tension is not isolated. The UNCOVER UV luminosity function at $z > 9$ implies a star formation rate density exceeding the halo accretion limit by factors of 4–10 (Chemerynska et al. 2024).
A second anomaly appears in the population of "Little Red Dots" (LRDs): compact, red sources hosting broad-line AGN whose supermassive black holes ($M_{\rm BH} \sim 10^7$–$10^8\,M_\odot$) are 10–100 times more massive relative to their hosts than local scaling relations predict ($M_{\rm BH}/M_* \sim 0.1$ vs. $0.001$; Matthee et al. 2024). Growing such objects from stellar seeds requires near-continuous super-Eddington accretion despite their high observed space density ($n \sim 10^{-5}\,\mathrm{Mpc}^{-3}$). A third anomaly emerges in JWST NIRSpec kinematics: at $z \gtrsim 3$–4, massive quiescent galaxies show $M_*/M_{\rm dyn} \gtrsim 1$ (Esdaile et al. 2021; Tanaka et al. 2019), while at $z > 5.5$ low-mass star-forming systems show the opposite extreme, with dynamical masses exceeding stellar masses by up to a factor 40 (de Graaff et al. 2024a). Both patterns are inconsistent with standard expectations and together suggest a mass-dependent bias in photometric stellar-mass estimates.
Across these cases, the common structure is the same: stellar masses and ages inferred from photometry appear systematically too large, too early, in precisely the environments with the deepest gravitational potentials.
1.2 Challenging Isochrony with TEP
Underlying every photometric inference of stellar age, mass, and star formation rate is the isochrony axiom: the assumption that the clock governing stellar evolution ticks at the universal cosmic rate, regardless of local gravitational environment. Under isochrony, an observed red colour is interpreted as a combination of age, dust, and metallicity, and the resulting mass-to-light ratio ($M/L \propto t^n$) is treated as universal. If this axiom is violated — if stars in deep gravitational potentials accumulate proper time faster than the cosmic mean — then SED-inferred masses and ages are systematically inflated in precisely the environments where JWST finds the largest anomalies.
The Temporal Equivalence Principle (TEP) formalises this possibility within a conformally coupled scalar-tensor framework with chameleon screening (Khoury & Weltman 2004; Brax et al. 2004; Burrage & Sakstein 2018). In unscreened halos, the conformal factor $A(\phi) > 1$ causes stellar clocks to tick faster relative to coordinate time, so that a galaxy's effective age $t_{\rm eff} = \Gamma_t\,t_{\rm cosmic}$ exceeds its cosmic age. The resulting bias in $M/L$ inflates inferred stellar masses by $\Gamma_t^n$, directly mimicking a star formation efficiency excess. This effect is physically distinct from standard gravitational redshift: photons still lose energy climbing out of potential wells (kinematic redshift, fully preserved in TEP), while the scalar field coupling independently accelerates atomic processes within the well. Both effects coexist; only the latter biases photometric mass inference.
Physics Note: Dilation vs. Enhancement
It is essential to distinguish between two relativistic effects:
- Kinematic Gravitational Redshift (Standard GR): Photons lose energy climbing out of potential wells. This affects light and is fully preserved in TEP.
- Dynamical Clock Rate (TEP): The scalar field coupling modifies the effective mass of particles, changing the rate at which atomic clocks tick relative to coordinate time. In the TEP framework, the scalar field in diffuse halos relaxes to values where $A(\phi) > 1$, causing clocks to tick faster (enhancement) than the cosmic mean, even while photons suffer redshift.
The temporal shear effect is governed by an enhancement factor $\Gamma_t$:
The zero-parameter JWST predictions adopt the external Cepheid prior $\alpha_0 = 0.58 \pm 0.16$, calibrated from local Cepheid observations ($z \approx 0$), and apply it unchanged to the high-redshift universe. The JWST observables are then examined to test whether they recover a mutually consistent coupling internally, yielding conditional concordance values near $\alpha_0 \approx 0.55$ in the high-redshift analyses. These internal recoveries are treated as self-consistency checks rather than as replacement calibrations. No parameters are tuned to the JWST data the model seeks to explain. The structural choices in the $\Gamma_t$ formula — the reference redshift $z_{\rm ref} = 5.5$, reference halo mass $\log M_{h,\rm ref} = 12.0$, exponential functional form, and $\sqrt{1+z}$ coupling scaling — were fixed by the scalar-tensor framework in prior papers (Papers 1 and 7); all have independent physical motivation and none were adjusted to improve JWST fits (§4.11.1).
1.3 Reader's Guide to the Evidence
This manuscript is easiest to read as a progression from prediction to population tests to physical interpretation. Because the analysis spans photometric, kinematic, and spatial data across multiple JWST surveys, the evidence is organized explicitly by role rather than presented as a flat list of correlations:
- Stage 1: The zero-parameter prediction (§3.2). The local-universe TEP calibration is applied directly to the most extreme $z > 5$ anomalies (the Red Monsters) to test whether it predicts the scale of the star formation efficiency correction without any JWST tuning.
- Stage 2: Two primary empirical lines, one ancillary spatial indication, one derived regime-level comparison, and one direct kinematic test (§3.0–3.10). The statistical core of the manuscript is built from three JWST surveys ($N = 4{,}726$), supplemented by a direct JWST-SUSPENSE kinematic comparison. The branches are separated according to evidential weight and independence:
- L1. Dust–$\Gamma_t$ emergence: At $z > 8$, massive galaxies are dusty while low-mass galaxies remain dust-poor ($\rho = +0.62$). The signal strengthens with redshift and is organized by an AGB-like effective-time threshold. A Fisher combination across the three photometric surveys yields $24.4\sigma$.
- L2. Inside-out core screening (ancillary spatial indication): JADES resolved photometry and the preferred DR5 direct-mass morphology sample indicate more centrally screened, bluer-core structures in the more massive systems. Because the direct gradient predictor-comparison and residual tests remain non-significant, this branch is retained as ancillary rather than counted as a primary line.
- L3. Mass–sSFR inversion: The standard downsizing correlation reverses sign at $z > 7$, consistent with time-dilation inflating the apparent sSFR in massive early halos. The discontinuous sign change makes this test difficult to mimic with smooth mass-measurement systematics.
- L4. Dynamical mass comparison (derived regime comparison): A matched regime-level kinematic comparison indicates that the TEP correction can reconcile anomalously high $M_*/M_{\rm dyn} > 1$ cases using SED-independent kinematics. A supplementary five-object direct literature ingestion at $z = 3.2$–$4.0$, including one conservative upper-limit row, is directionally consistent and materially strengthens the kinematic rebuttal to the photometric mass-proxy objection, but the higher-$z$ regime-level branch remains the authoritative derived comparison rather than a primary empirical line.
- L5. Direct kinematic test: The JWST-SUSPENSE comparison directly evaluates $\Gamma_t$ derived from $M_{\rm dyn}$ against photometric $M_*$ in predicting spectral age. It is the strongest current direct test on the mass-circularity objection, but because the sample is still small ($N = 15$) it is carried as a direct kinematic test with an explicit small-sample caveat rather than as one of the two primary large-sample lines.
- Stage 3: The Little Red Dot mechanism (§4.9). The framework is then applied to the overmassive black-hole problem, where differential temporal shear between the dense core and the diffuse stellar halo provides a route to rapid apparent growth without requiring sustained extreme super-Eddington accretion.
1.4 Prior Cross-Domain Evidence for TEP
The JWST analysis presented here is not the first test of the TEP framework. A local Cepheid analysis (Paper 12) provides an external prior $\alpha_0 = 0.58 \pm 0.16$, and the present high-redshift study asks whether JWST observables independently recover a consistent coupling across multiple domains spanning 13.5 Gyr of cosmic time (Table 11, §4.2). Important caveat: all prior constraints in Table 11 derive from the same author's analysis programme; the TEP series has not yet undergone independent replication or peer review in a refereed journal. Readers should weigh the cross-domain consistency with this single-source limitation in mind (§4.11a). The three domains most directly used in this work are:
- Hubble Tension: Stratification of $N = 29$ SH0ES Cepheid hosts by velocity dispersion reveals an environmental bias. High-$\sigma$ hosts yield $H_0 = 72.45 \pm 2.32$ km/s/Mpc; low-$\sigma$ hosts yield $H_0 = 67.82 \pm 1.62$ km/s/Mpc — consistent with Planck within $1\sigma$. The TEP correction with the external prior $\alpha_0 = 0.58$ yields $H_0^{\rm TEP} = 68.66 \pm 1.51$ km/s/Mpc ($0.79\sigma$ from Planck). This provides the external low-redshift anchor used in this work.
- Globular Cluster Pulsars: Analysis of 380 millisecond pulsars reveals a $0.13$ dex spin-down excess in cluster pulsars ($p = 1.7 \times 10^{-15}$). The environmental screening threshold $\sigma > 165$ km/s derived from this population is used directly in §2.3.2.2 and §4.4.3 of this work.
- Universal Critical Density: GNSS atomic clock networks yield $\rho_c \approx 20$ g/cm³, independently confirmed by the SPARC rotation curve slope and magnetar critical periods. This $\rho_c$ is used as the screening threshold in this work.
The central question this work addresses is whether the same externally calibrated $\alpha_0$ that resolves the Hubble tension and accounts for pulsar timing anomalies also predicts the high-redshift galaxy anomalies, with no re-tuning. The dust-only and joint high-redshift concordance analyses both recover values near $\alpha_0 \approx 0.55$, consistent with the external Cepheid prior at $0.2\sigma$. The tighter internal JWST interval is interpreted as agreement among the high-redshift observables rather than as a replacement of the local prior; the Cepheid calibration remains the primary external anchor for $\alpha_0$.
1.5 Alternative Explanations
There is no shortage of standard-physics explanations for JWST's high-redshift surprises. Proposed mechanisms include top-heavy initial mass functions, bursty or ultra-efficient star formation, early black hole seeding, strong AGN contamination, dust geometry effects, and selection/systematic biases in the spectral-energy-distribution (SED) fitting procedures. The present work does not dismiss these a priori; instead, it evaluates whether they can reproduce the specific temporal and structural signatures in the data.
Standard alternatives include top-heavy initial mass functions (Boylan-Kolchin 2023), enhanced AGN feedback, bursty star formation, and super-Eddington accretion. Each can partially address one or two of the observed anomalies, but not the full set considered here. AGN feedback predicts a negative dust–black hole mass correlation, as AGN activity clears dust; the observed relation is positive ($\rho = +0.62$). Bursty star formation predicts bluer colours during burst phases, whereas the TEP-enhanced population is significantly redder at fixed magnitude ($\rho(M_{\rm mag}, \text{color}) = -0.40$, $p = 2.8 \times 10^{-16}$, $N = 398$). Top-heavy IMFs can partially relieve the star formation efficiency crisis but offer no mechanism for the spatially resolved screening gradients, the mass–sSFR inversion, or the differential black hole growth in Little Red Dots. In the systematic comparison across the live JWST core evidence package (§4.3.5), TEP is the only framework in that comparison that captures both primary empirical lines while remaining directionally consistent with the ancillary L2 indication and the derived L4 regime comparison. No alternative in that comparison reproduces the cross-domain consistency of $\alpha_0$ across instruments, epochs, and physical regimes.
Key Limitations
Several important limitations should be borne in mind when evaluating the evidence presented here:
- Mass circularity and the Kinematic Resolution: In purely photometric samples, because $\Gamma_t$ is derived from halo mass, distinguishing TEP effects from intrinsic mass-dependent evolution requires careful partial-correlation analysis and exploitation of the independent redshift-dependent component. The current SUSPENSE kinematic comparison materially narrows this objection by testing a dynamical-potential predictor and photometric stellar mass directly against spectral age. Its strongest discriminant is a one-sided residual structure: the dynamical predictor retains significant age information after $M_*$+$z$ control, whereas $M_*$ does not after $\Gamma_t$+$z$ control.
- Spectroscopic sample size: Two public spectroscopic compilations substantially improve the earlier small-sample situation. JADES DR4 (D'Eugenio et al. 2025) provides 2,858 good-quality spec-z with 118 at $z > 7$. The DJA NIRSpec Merged Table v4.4 (Brammer et al.; September 2025) provides 19,445 unique grade-$\ge 3$ sources from all public JWST/NIRSpec programs, with 698 at $z > 7$ and 234 at $z > 8$. These compilations materially strengthen the spectroscopic context, although some analysis-specific highest-redshift slices remain small. Stellar masses also still rely on photometric estimates ($\pm 0.3$–$0.5$ dex), so the spectroscopic branches remain supportive consistency checks rather than independent primary lines of evidence. In the main evidence synthesis, the SUSPENSE kinematic comparison provides the strongest direct evidence on the mass-circularity question. The three-survey photometric L1 combination ($24.4\sigma$) and the L3 mass–sSFR inversion provide the two large-sample primary empirical lines, while the dedicated UNCOVER $z > 8$ dust battery and DJA-based GOODS-S / H$\alpha$/H$\beta$ branches are treated as robustness or supplementary checks.
- Theoretical foundation: The enhancement factor $\Gamma_t > 1$ is motivated by a scalar-tensor, two-metric construction in which matter couples to a conformal factor $A(\phi)$ and screening suppresses the effective coupling in dense regimes. The first-principles action, field equations, screening mechanism, and PPN consistency mapping provide the foundation for this framework; this manuscript presents only the components required to define and test the observational mapping (§2.3.2). A full joint cosmological parameter inference is outside the scope of this work.
- Red Monsters case study: The $N = 3$ illustrative case study is not statistically robust in isolation. Population-level tests ($N = 2{,}315$) provide the primary evidence.
Section 2 defines the TEP mapping, the datasets, and the statistical procedures. Section 3 presents the two primary empirical lines, the direct kinematic decisive test, and then places the ancillary spatial indication, the derived regime-level comparison, and the supplementary replications in their proper evidential order. Section 4 interprets the results in the broader theoretical context, including precision-GR consistency, the link to the Hubble tension, and the Little Red Dot mechanism. Section 5 closes with falsification criteria and observational predictions. Appendix A provides the theoretical foundation (action, field equations, screening mechanism), and Appendix B documents key computational definitions and reference tables.
2. Data and Methods
This section follows the same logic as the manuscript as a whole. It first defines the observational datasets, then the derived TEP quantities, then the statistical tests used to separate genuine TEP signatures from mass-proxy artifacts, and finally the black-hole simulation used for the Little Red Dot mechanism. The aim is to state the observational mapping clearly enough that each empirical result in §3 can be read directly back to its data and assumptions.
2.1 Data
2.1.1 Red Monsters (FRESCO)
The motivating case study is the class of ultra-massive galaxies at $z \sim 5$–$6$ exemplified by the three spectroscopically confirmed "Red Monsters" reported by Xiao et al. (2024). For the illustrative TEP prediction (§3.1), representative parameters spanning the published range ($z \approx 5.3$–$5.9$, $\log M_* \approx 10.8$–$11.2$, SFE $\approx 0.50$) are adopted. These capture the regime where the anomaly is most acute. The resulting SFE correction fully resolves the anomaly (corrected SFE $\sim 0.11$, well below the $\Lambda$CDM limit of 0.20), with the correction depending primarily on $\Gamma_t$ (set by halo mass and redshift via the pre-calibrated TEP formula) and insensitive to the precise input SFE at the $\lesssim 2\%$ level.
2.1.2 UNCOVER DR4
For population-level tests, the UNCOVER DR4 stellar population synthesis catalog is used (Wang et al. 2024; Furtak et al. 2023). The analysis applies quality cuts and constructs a high-redshift sample with $4 < z < 10$ and $\log M_* > 8$, yielding $N = 2{,}315$ galaxies. For multi-property analyses (age ratio, metallicity, dust), a subset with complete measurements is used (e.g., $N = 1{,}108$ for the partial-correlation and split tests).
2.1.3 Independent replications and spectroscopic validation
To evaluate independent replication of the $z > 8$ dust result, catalogs for CEERS are used (Cox et al. 2025; Finkelstein et al. 2023; photometric redshifts via EAZY, Brammer et al. 2008) and COSMOS-Web (Shuntov et al. 2025). The COSMOS2025 catalog (Shuntov et al. 2025) provides LePHARE SED-derived stellar masses, SFRs, E(B-V) dust, and ages for 784,016 galaxies over 0.54 deg², with 37,965 sources at $z > 4$ passing quality cuts — the largest high-$z$ photometric SED sample used in this analysis. The UNCOVER DR4 SPS catalog (Wang et al. 2024; Suess et al. 2024; Price et al. 2025) uses 20-band MegaScience photometry and Prospector-β SED fitting, providing 2,628 sources at $z > 4$ with Prospector dust2 and a spec-z sub-catalog of 203 sources with spectroscopic redshifts fixed in the SED fit. For spectroscopic validation, two complementary catalogs are used:
JADES Data Release 4 (D'Eugenio et al. 2025; Curtis-Lake et al. 2025; Scholtz et al. 2025): 2,858 high-quality spectroscopic redshifts (flags A/B) across GOODS-N and GOODS-S, with 118 sources at $z > 7$ and 41 at $z > 8$. UV-luminosity-based stellar masses (Song et al. 2016) are derived for the 1,345 sources with valid $M_{\rm UV}$.
DAWN JWST Archive (DJA) NIRSpec Merged Table v4.4 (Brammer et al.; de Graaff et al. 2024a; Heintz et al. 2023; September 2025): a comprehensive compilation of 80,367 uniformly reduced JWST/NIRSpec spectra from all public programs, processed with the msaexp/grizli reductions. After applying grade $\ge 3$ quality cuts and deduplication by sky position, 19,445 unique sources are retained, of which 3,251 are at $z > 5$, 698 at $z > 7$, and 234 at $z > 8$. Photometric stellar masses are available for 2,598 of the high-$z$ sources. This catalog spans JADES, CEERS, RUBIES, UNCOVER, GLASS, PRIMER, and more than 50 other public programs, providing the largest uniform cross-survey spectroscopic sample to date.
2.1.4 MIRI-based mass calibration context
Recent JWST/MIRI analyses (Pérez-González et al. 2024) show that NIRCam-only SED fits can overestimate stellar masses at $z > 5$ because of age-attenuation degeneracy and emission-line contamination. When MIRI photometry is included, the number density of the most massive systems decreases and some candidates are reclassified as dusty or line-dominated sources. The photometry is not reprocessed in this work, but published masses are treated as conservative upper bounds and MIRI-based studies serve as an external check on the interpretation of the extreme-mass tail.
| Dataset | Role | Sample Size | Redshift Range | Mass Cut ($\log M_*$) | Key Reference | Key Biases |
|---|---|---|---|---|---|---|
| Red Monsters | Case Study | 3 | $5.3 < z < 5.9$ | $> 10.5$ | Xiao et al. (2024) | Small N, Selection Function |
| UNCOVER DR4 | Primary Statistical Sample | 2,315 | $4 < z < 10$ | $> 8.0$ | Wang et al. (2024) | NIRCam Mass Overestimation |
| CEERS DR1 | Independent Replication | 82 | $z > 8$ | $> 8.0$ | Cox et al. (2025) | Field Variance |
| COSMOS-Web | Large-Volume Check | 2,606 (918 dust-detected) | $z > 8$ | $> 8.0$ | Shuntov et al. (2025) | Photometric Redshift Uncertainties; Zero-Inflated Dust |
| JADES DR4 (NIRSpec/MSA) | Spectroscopic Validation | 2,858 (flags A/B); 118 at $z > 7$ | $z = 0.1$–$14.2$ | None | D'Eugenio et al. (2025); Curtis-Lake et al. (2025) | Slit Losses; UV-based $M_*$ ($\pm 0.4$ dex) |
| DJA NIRSpec Merged v4.4 | Cross-Survey Spectroscopic Validation | 19,445 unique (grade $\ge 3$); 698 at $z > 7$; 234 at $z > 8$ | $z = 0.1$–$14.1$ | None | Brammer et al. (DJA); de Graaff et al. (2024) | Photometric $M_*$ from grizli; heterogeneous survey depths |
| UNCOVER DR4 SPS (MegaScience) | Primary + Spectroscopic Validation | 2,628 (z$>$4, Prospector-β); 203 with spec-z fixed fits | $z = 4$–$12$ | Abell 2744 (lensed) | Wang et al. 2024; Suess et al. 2024; Price et al. 2025 | 20-band photometry; lensing magnification corrections |
| COSMOS2025 (LePHARE SED) | Cross-Field Replication | 48,861 (z$>$4, adopted LePHARE selection); 7,249 at $z > 7$; 2,659 at $z > 8$ | $z = 4$–$13$ | None (blank field) | Shuntov et al. 2025 (COSMOS2025) | LePHARE E(B-V) less precise than Prospector dust2; photo-z scatter |
Related MIRI-supported analyses of Little Red Dots (LRDs) at $z > 4$ find that inferred stellar masses can shift by up to orders of magnitude depending on the assumed AGN contribution. This motivates a conservative stance in the interpretation of compact red sources and provides a systematic-control context for any extreme-mass claims in the literature.
2.2 Key Terminology
The following terms are used consistently throughout this work:
| Term | Symbol | Definition |
|---|---|---|
| Temporal Enhancement Factor | $\Gamma_t$ | The ratio of effective proper time to cosmic time experienced by stellar populations. In unscreened regions (low density), $\Gamma_t$ increases with potential depth: massive halos have $\Gamma_t > 1$ (enhancement), low-mass halos have $\Gamma_t < 1$ (suppression). In screened regions (density $> \rho_c$), $\Gamma_t \to 1$ regardless of potential depth. |
| Temporal Shear | — | The spatial gradient of $\Gamma_t$ across a galaxy or environment. Differential temporal shear refers to the difference in $\Gamma_t$ between two regions (e.g., galactic center vs. halo). |
| Isochrony Bias | — | The systematic error in inferred stellar properties (mass, age, SFR) arising from the assumption that stellar clocks tick at the cosmic rate everywhere. Under TEP, this assumption is violated in deep potential wells. |
| Screening | — | The suppression of TEP effects in regions where the local density exceeds a critical threshold ($\rho_c \approx 20$ g/cm³), restoring standard GR physics. Two types are distinguished: Core Screening—Screening within a single galaxy, where the deep central potential suppresses TEP ($\Gamma_t \to 1$) while the outskirts remain enhanced. Produces bluer cores and redder outskirts. Environmental Screening—Screening by the ambient group or cluster potential, causing galaxies in dense environments to appear younger than isolated field galaxies of the same mass. |
| Effective Time | $t_{\rm eff}$ | The proper time experienced by stellar populations: $t_{\rm eff} = t_{\rm cosmic} \times \Gamma_t$. |
2.3 Derived quantities
2.3.1 Halo mass proxy
For each galaxy, the analysis uses an abundance-matching relation (Behroozi et al. 2019) to map stellar mass to halo mass. This mapping is used solely to construct $\Delta\log(M_h)$ for the TEP parameterization. To mitigate circularity, sensitivity tests are performed with $\pm 0.3$ dex scatter in the $M_h-M_*$ relation, propagating to $\pm 12\%$ in $\Gamma_t$ corrections.
2.3.2 The TEP Metric Coupling
The temporal enhancement factor $\Gamma_t$ is not introduced here as an ad hoc fitting function. It is the observable mapping of a conformally coupled scalar-tensor framework in which a chameleon field modifies the local rate at which material clocks accumulate proper time. The full theoretical development is extensive; this section states only the steps needed to connect the action-level construction to the measurable quantities used in the present analysis.
2.3.2.1 From Action to Observable
The TEP framework builds upon chameleon-class scalar-tensor theories (Khoury & Weltman 2004; Brax et al. 2004; Burrage & Sakstein 2018) where the effective mass of the scalar field depends on the local matter density. The key steps mapping the fundamental physics to the observable $\Gamma_t$ are:
- Action: Matter couples to $\tilde{g}_{\mu\nu} = A(\phi) g_{\mu\nu}$ where $A(\phi) = \exp(2\beta\phi/M_{\rm Pl})$. The Klein-Gordon equation sources $\phi$ from the matter density trace $T^\mu_\mu$.
- Proper time: Clock rates scale as $d\tau/dt \approx A(\phi)^{1/2}$, defining $\Gamma_t \equiv (d\tau/dt)/(d\tau/dt)_{\rm ref}$.
- Halo mapping: In virialized halos, $\phi$ tracks the potential depth $\Phi \propto M_h^{2/3}$, yielding:
where $\Delta\log(M_h) = \log M_h - \log M_{h,\rm ref}(z)$, the reference halo mass evolves as $\log M_{h,\rm ref}(z) = 12.0 - 1.5\log_{10}(1+z)$ (fixed potential-depth scaling), $z_{\rm ref} = 5.5$, and $\alpha_0 = 0.58 \pm 0.16$ is the external Cepheid prior.
2.3.2.1a Enhancement and Suppression: The Two-Sided Prediction
The exponential formula predicts two physically distinct regimes depending on halo mass relative to the redshift-dependent reference mass $\log M_{h,\rm ref}(z)$:
- Enhancement ($\Gamma_t > 1$, massive halos): For $M_h > M_{\rm ref}$, $\Delta\log M_h > 0$ and $\Gamma_t > 1$. The scalar field $\phi$ is sourced more strongly by the deeper potential, raising $A(\phi) > 1$ and accelerating material clock rates relative to the cosmic mean. This is the regime of the Red Monsters and massive $z > 8$ galaxies.
- Suppression ($\Gamma_t < 1$, low-mass halos): For $M_h < M_{\rm ref}$, $\Delta\log M_h < 0$ and $\Gamma_t < 1$. In shallow potentials, the chameleon field relaxes toward a lower-energy minimum with $A(\phi) < 1$, meaning material clocks tick slower than the cosmic mean. This is not an ad-hoc extension: it follows directly from the same conformal coupling $d\tau/dt \propto A(\phi)^{1/2}$, which is symmetric about the reference environment. The reference mass $\log M_{h,\rm ref}(z) = 12.0 - 1.5\log_{10}(1+z)$ defines the environment where $A(\phi) = 1$ (i.e., $\phi = 0$ in the Einstein frame), and deviations in either direction produce proportional clock-rate shifts.
This two-sided prediction is central to the interpretation: the "Uniformity Paradox" — why low-mass galaxies at $z > 8$ are dust-poor despite cosmic time being nominally sufficient for AGB production — is resolved because $\Gamma_t \ll 1$ in low-mass halos shuts off the effective AGB clock. A model that only predicted enhancement ($\Gamma_t \geq 1$ everywhere) would not explain the dust-poor low-mass population. The suppression regime is therefore a falsifiable prediction rather than a free parameter: it predicts that low-mass galaxies at $z > 8$ should be systematically younger in their stellar populations than their cosmic age implies, and should lack dust regardless of the available cosmic time.
2.3.2.2 Screening and Scale Separation
For a representative concordance-scale coupling $\alpha_0 = 0.548$, the bare Brans-Dicke parameter is $\omega_{\rm BD} = 1/(2\alpha_0^2) - 1/2 \approx 0.99$ — roughly four orders of magnitude below the Cassini bound ($\omega_{\rm BD} > 40{,}000$; Bertotti et al. 2003). This large pre-screening discrepancy illustrates the central logic of chameleon-class theories: the bare coupling is strong, but the thin-shell mechanism suppresses the effective coupling by $\Delta R/R \lesssim 10^{-6}$ in solar-system bodies, yielding $\omega_{\rm BD}^{\rm eff} > 10^6$. On cosmological scales, the Compton wavelength $\lambda_C \sim 1$ Mpc yields Yukawa suppression $\beta_{\rm eff}(R_8) \approx 0.002$ on $\sigma_8$ scales—well below the Planck bound. Within individual halos ($r \lesssim 50$ kpc), the field tracks the local potential and operates locally. This two-scale picture is standard for chameleon theories.
The screening condition for local tests is:
This critical density is derived independently from three sources that converge on the same value: GNSS atomic clock networks ($L_c \approx 4200$ km for Earth's mass), atomic physics (soliton radius $R_{\rm sol}(m_p) \sim a_0$ at the proton mass scale), and magnetar anti-glitches ($P_{\rm crit} \approx 6.8$ s for 1E 2259+586, 4% match). The convergence across 40 orders of magnitude in mass provides an independent consistency check on this screening scale.
At galactic scales, an effective kinematic screening threshold emerges from analysis of 380 millisecond pulsars in globular clusters, which reveals that the TEP spin-down excess saturates for systems with velocity dispersion $\sigma \gtrsim 165$ km/s, consistent with the scalar field entering the thin-shell regime. This threshold is used in §4.4.3 to define the environmental screening boundary for JWST galaxies: halos with $\sigma \gtrsim 165$ km/s (corresponding to $\log M_h \gtrsim 13.5$ at $z \sim 0$) are expected to be partially screened, suppressing $\Gamma_t$ below the unscreened prediction.
2.3.2.3 Enhancement vs. Dilation
Standard GR predicts time dilation in deep potentials; TEP predicts enhancement ($\Gamma_t > 1$). These refer to different metrics: gravitational redshift is governed by $g_{\mu\nu}$ (preserved identically), while material clock rates are governed by $\tilde{g}_{\mu\nu} = A(\phi)g_{\mu\nu}$. The key distinction is that $\Gamma_t$ compares clock rates between different environments at the same epoch, not between positions in a single well. Numerical integration confirms $A(\phi) > 1$ in unscreened halos for $2\beta^2 > 1$. Solar System bodies are fully screened ($\Gamma_t \to 1$).
| Parameter | Value | Source | Description |
|---|---|---|---|
| $\alpha_0$ | $0.58 \pm 0.16$ | External Cepheid Prior | Local-universe coupling strength from Cepheids |
| $z_{\rm ref}$ | 5.5 | TEP-H0 | Reference redshift for calibration |
| $\log M_{h, \rm ref}$ | 12.0 | TEP-COS | Reference halo mass at $z=0$ ($\Gamma_t=1$) |
| $\rho_c$ | 20 g/cm$^3$ | TEP-UCD | Critical density for screening |
The zero-parameter JWST prediction adopts the external Cepheid prior $\alpha_0 = 0.58 \pm 0.16$ together with the fixed reference choice $z_{\rm ref} = 5.5$. The high-redshift concordance analyses later recover values near $\alpha_0 \approx 0.55$, but those internal recoveries are treated as consistency checks rather than as the input calibration. This leaves zero parameters tuned to the JWST data.
2.3.2.7 Cosmological Viability Summary
The TEP framework has been checked against three classes of precision cosmological constraints:
- BBN: During the radiation era, $T^\mu_\mu \approx 0$ for relativistic species, so the scalar field driving force vanishes and $\phi$ remains frozen. Numerical integration yields $|\Delta H/H|_{\rm max} = 1.7 \times 10^{-13}$ and $\Delta Y_p < 10^{-14}$—shifts $\sim 10^{12}$ times below observational uncertainties (Appendix A.1.6).
- Linear growth ($\sigma_8$): Yukawa suppression on $\gtrsim 10$ Mpc scales reduces the effective coupling to $\beta_{\rm eff} \lesssim 0.002$, preserving $\Lambda$CDM-consistent $\sigma_8$ (§2.3.2.2; Appendix A.1.7–A.1.8).
- Solar System (PPN): Thin-shell screening suppresses the effective coupling by $\Delta R/R \lesssim 10^{-6}$ for solar-system bodies, satisfying Cassini and lunar laser ranging bounds (Appendix A.1.3).
Scale-dependent growth computation: To go beyond the analytic Yukawa argument, the linear growth ODE is solved independently for each Fourier mode $k$, incorporating the full scale-dependent gravitational coupling $G_{\rm eff}(k,z)/G_N = 1 + 2\beta^2 k^2/(k^2 + m_\phi(z)^2)$, where $m_\phi(z) = m_{\phi,0}(1+z)^{9/4}$ for $n=1$ chameleon. The resulting matter power spectrum ratio $P_{\rm TEP}(k)/P_{\Lambda{\rm CDM}}(k)$ and integrated $\sigma_8$ are computed self-consistently (Appendix A.1.8). Key results:
- Planck consistency requires $m_{\phi,0} \gtrsim 0.2\,h$/Mpc ($\lambda_C \lesssim 31\,h^{-1}$ Mpc at $z=0$; Appendix A.1.8).
- For typical chameleon parameters, $\beta_{\rm eff}$ on $R_8$ scales is $\approx 0.0046$—a factor $\sim 126$ below the bare coupling ($G_{\rm eff}/G_N - 1 = 4.1 \times 10^{-5}$ at $k_8$; Appendix A.1.8).
- The predicted $\sigma_8^{\rm TEP} = 0.8110$ vs. $\sigma_8^{\Lambda{\rm CDM}} = 0.811$; $\Delta\sigma_8 = 7.2 \times 10^{-8}$ ($1.2 \times 10^{-5}\sigma$). RSD: $\Delta\chi^2 = 8 \times 10^{-5}$ across 8 data points (Appendix A.1.8).
This k-dependent growth calculation substantially strengthens the viability argument beyond the earlier analytic estimate. A full CAMB Boltzmann integration (Appendix A.1.8.8) confirms these results: at the fiducial $m_{\phi,0} = 1.0\,h$/Mpc, the CAMB-computed $\sigma_8^{\rm TEP} = 0.8116$ ($0.10\sigma$ from Planck), with CMB TT deviations $< 0.02\%$ at all $\ell < 2500$ and $\chi^2/{\rm dof} \ll 1$ against Planck error bars. Planck consistency holds for $m_{\phi,0} \gtrsim 0.2\,h$/Mpc. The CAMB integration uses the scale-dependent growth equation with Yukawa-suppressed $G_{\rm eff}(k,z)$ and CAMB's exact lensed CMB spectra, substantially narrowing the remaining theoretical gap. The remaining approximation — that the scalar field does not modify acoustic peaks at $z > 1089$ — is justified by $T^\mu_\mu \approx 0$ during radiation domination. A natively coupled hi_class integration remains desirable for completeness but is no longer expected to change the conclusion.
2.3.3 Effective time and isochrony bias correction
An effective time is defined as $t_{\rm eff} = t_{\rm cosmic}\,\Gamma_t$, where $t_{\rm cosmic}$ is computed from a fiducial cosmology (Planck18). Under the isochrony-bias model used here, the mass-to-light ratio is assumed to scale as $M/L \propto t^n$ (following standard SSP predictions; Bruzual & Charlot 2003; Conroy et al. 2009). Forward-modeling analysis finds that $n \approx 0.5$ minimizes the residual mass-age correlation at $z > 6$, while $n \approx 0.9$ is preferred at $z = 4$–$6$. For the primary high-$z$ analysis, $n = 0.5$ is adopted. The corrected stellar mass and SFE are:
2.4 Statistical procedures
The statistical design separates three questions: whether the predicted associations are present, whether they survive control for obvious confounders, and whether they generalize across surveys and subsamples. Associations are quantified using Spearman rank correlations and bootstrap confidence intervals. To address confounding by redshift and stellar mass, partial-correlation analyses implemented via residualization are employed. In addition to correlation-based tests, the following are reported:
- Stratified comparisons (e.g., high vs low $\Gamma_t$ splits) for multi-property coherence
- Distributional comparisons (e.g., Kolmogorov-Smirnov tests) for regime separation
- Model comparison using both AIC/BIC and nested Bayesian evidence: the regression comparisons test predictors {z}, {z, $\log M_*$}, {z, $\Gamma_t$}, and {z, $\log M_*$, $\Gamma_t$}, while a separate
dynestynested-sampling analysis compares TEP against explicit bursty-SF, varying-IMF, standard-physics, and AGN alternatives in both raw standardized space and a mass+$z$-residualized control space
2.4.1 Combined significance and multiple testing
Combined significance is assessed using Fisher's method, Bonferroni correction, Brown's method (dependence-adjusted), harmonic mean p-value, and Benjamini-Hochberg FDR ($\alpha = 0.05$). Because omnibus significance depends sensitively on how clustering and shared predictors are penalized, the manuscript treats the three-survey photometric L1 replication as the headline result and uses broader multi-test combinations as supportive context. An extreme stress test that reduces effective sample sizes by 90% via spatial clustering autocorrelation still leaves a Bonferroni-corrected floor of $3.3\sigma$ for the mixed battery. Parametric p-values are supplemented by permutation tests ($N = 10{,}000$ shuffles) and bootstrap confidence intervals ($N = 10{,}000$ resamples). Cross-survey effect sizes are combined via DerSimonian-Laird random-effects meta-analysis with $I^2$ heterogeneity assessment and leave-one-out influence diagnostics.
2.4.2 Blind validation protocol
Three split strategies test generalization: (1) time-split (low-$z$ train / high-$z$ test, 60/40); (2) field-split (RA median); (3) cross-survey leave-one-out. A test passes if the dust–$\Gamma_t$ correlation remains significant ($p < 0.05$) on held-out data.
2.4.3 Stellar-to-halo mass mapping and sensitivity
Each galaxy's stellar mass is mapped to halo mass via abundance matching ($\log M_h \approx \log M_* + 2.0$). Key results survive $\pm 0.5$ dex perturbations. To test robustness against MIRI-based mass recalibration (Pérez-González et al. 2024), mass reductions of 0.0–1.0 dex are applied; TEP signatures persist at $> 4\sigma$ under a 0.5 dex reduction. At $z > 8$, selection bias toward bright galaxies is quantified via Monte Carlo completeness weighting ($N = 1{,}000$ iterations) and Savage-Dickey Bayes Factors.
2.4.4 Forward-modeling validation
The $M/L \propto t^{n}$ scaling is validated by varying $n = 0.5$–$0.9$ and identifying the value minimizing the residual mass-age correlation. At $z > 6$, $n \approx 0.5$ is preferred (consistent with low-metallicity SSP models); at $z = 4$–$6$, $n \approx 0.9$.
2.5 Black Hole Growth Simulation
To test the "Little Red Dot" resolution, a differential temporal shear simulation was developed. A compact galaxy ($r_e \approx 150$ pc) with a baryon-dominated core ($c=10$) is modeled. The local temporal enhancement factor $\Gamma_t(r)$ is computed at the center (Black Hole environment) and at the effective radius (Stellar environment) across the redshift range $z=4$–$10$.
The differential growth factor is computed as:
where $t_{\rm Salpeter} \approx 45$ Myr is the Salpeter timescale (e-folding time for Eddington-limited accretion). This simulation uses the same fiducial $\alpha_0 \approx 0.55$ scale implied by the external Cepheid prior, with no additional tuning.
2.6 Reproducibility
All analyses are reproducible from the public repository. An end-to-end run regenerates the manuscript tables, figures, and archived outputs; execution instructions are provided in the repository README.
3. Results
3.0 Evidence Summary
The results are organized by evidential role. Two branches are counted as primary empirical lines: L1, the dust–$\Gamma_t$ relation together with the AGB-threshold test, and L3, the mass–sSFR inversion. L2 remains an ancillary spatial indication: it is materially strengthened by the preferred JADES DR5 direct-mass morphology sample, but the direct gradient discriminator is still not decisive. L4 remains a derived regime-level comparison anchored by kinematic information rather than by photometric evidence alone; the new five-object supplementary object-level branch materially strengthens the kinematic rebuttal to the mass-proxy objection without yet promoting L4 to a primary line.
Independence disclosure: L1 (dust–$\Gamma_t$ correlation) and the AGB threshold test share the same predictor because $t_{\rm eff}$ is a deterministic function of $\Gamma_t$. Their predictor residual after controlling for $\Gamma_t$ is only $\rho = 0.049$ ($p = 0.41$), so they are counted as one empirical line expressed in two complementary ways. Cross-survey generalization (formerly L5) therefore remains a robustness check on L1, not a new line, and the age-coherence and metallicity branches are not counted because they vanish under joint mass+redshift control.
Several additional branches remain informative but are not promoted in the evidence count. Environmental screening is mixed: the full-sample field-versus-dense split is suggestive, but the targeted $z > 8$ contrast is null ($\Delta\rho = 0.003$, $p = 0.97$), and the rebuilt DJA/JADES protocluster companion on real DJA root fields is likewise mixed/null ($\Delta_{\rm dense-field}=+0.086$, 95% CI $[-0.019,+0.188]$). The screened DJA pilot attached to L4 is directionally supportive — within the quality-screened subset, the Balmer dust proxy tracks fitted emission-line width more strongly than photometric stellar mass after mass+$z$ control ($\rho_{\rm partial}=+0.887$, $p=0.045$, $N=7$), while the competing mass-given-$\sigma$ partial is weak and negative — but because the pilot still relies on grating-fallback instrumental resolution rather than per-pixel $R$, it is retained as supportive context rather than as a standalone counted line.
| Signature | Finding | Significant | Survives Mass Control | Status |
|---|---|---|---|---|
| L1. Dust–$\Gamma_t$ correlation + AGB threshold | $\rho = +0.62$, $N=1{,}283$, three surveys; partial $\rho = +0.26$ after full polynomial control; dedicated UNCOVER $z > 8$ battery passes 4/4, with $\rho(t_{\rm eff}, A_V \mid t_{\rm cosmic}) = +0.600$; AGB step odds ratio 42.8 ($p \approx 10^{-40}$); $\Delta$AIC $\approx -4.8$ vs mass-matched threshold | ✔ | ✔ | Independent line |
| L2. Inside-out core screening | The preferred JADES DR5 direct-mass morphology branch gives four supportive structural proxies after mass+$z$ control ($r_{\rm half,F277W/F444W}$ partial $\rho=-0.256$, $p=3.7\times10^{-7}$; Gini $=+0.361$, $p=2.9\times10^{-13}$; $\sigma_\star=+0.624$, $p=7.0\times10^{-43}$; $N=384$), while the resolved-gradient branch retains the raw mass trend ($\rho=-0.166$, $p=5.7\times10^{-3}$) and a directionally supportive debiased q33/q67 sign test (negative-gradient fraction $0.581$ vs $0.457$, Fisher $p=0.061$). | ✔ | ✘ | Ancillary spatial indication |
| L3. Mass–sSFR inversion | Correlation inverts at $z>7$ ($\Delta\rho = +0.25$, 95% CI excludes zero); partial $\rho(\Gamma_t, {\rm sSFR}|{\rm dust}) = -0.49$ ($p = 10^{-18}$); the COSMOS2025 blank-field companion is supportive in matched $z = 8$–9 and selection-sensitive rather than cleanly null at $z = 9$–13 | ✔ | ✔ | Independent line |
| L4. Dynamical mass comparison | The matched RUBIES-like regime-level comparison remains the live quantitative anchor; a supplementary five-object direct literature ingestion (Esdaile et al. 2021; Tanaka et al. 2019), including one conservative upper-limit row, gives mean observed excess $0.168$ dex and mean corrected excess $-0.075$ dex on the four-object exact-$M_{\rm dyn}$ subset, resolves 2/3 anomalous exact objects after TEP, and yields a provisional downstream debias bootstrap $\beta = 0.52$ [$0.30$, $1.01$]. A screened DJA pilot spectroscopic-width companion is also directionally supportive: after mass+$z$ control, $\log({\rm H}\alpha/{\rm H}\beta)$ tracks fitted $\sigma$ with partial $\rho = +0.887$ ($p = 0.045$, $N = 7$), while the competing mass-given-$\sigma$ partial is weak and negative; this remains pilot-quality because the public spectra use grating-fallback rather than per-pixel resolution. | ✔ | ✔ | Derived regime comparison — not counted with primary empirical lines |
| L5. Direct Kinematic Test | The JWST-SUSPENSE direct comparison ($N = 15$) provides the primary mass-circularity stress test: $\rho({\rm Age}, \Gamma_t \mid M_*, z)=+0.624$ ($p=0.013$) while $\rho({\rm Age}, M_* \mid \Gamma_t, z)=-0.036$ ($p=0.898$); Monte Carlo preserves a positive $\Gamma_t$ residual in 99.7% of draws. A sigma-only kinematic expansion ($N = 83$, six surveys, $z = 1.2$–$7.6$) breaks the mass-proxy circularity: the TEP-specific $\Gamma_t(\sigma)$ adds significant predictive power for $M_{*,\rm obs}$ beyond $\sigma$ and $z$ individually (partial $\rho = +0.326$, $p = 0.0026$; $z \geq 4$ subset: $\rho = +0.326$, $p = 0.014$, $N = 56$). The federated package comprises three counted supportive branches. | ✔ | ✔ | Direct kinematic test with small-sample caveat |
| Steiger Z-test ($t_{\rm eff}$ vs $M_*$) | $Z=17.8$, $p=1.3\times10^{-70}$; $Z=10.4$, $p=1.8\times10^{-25}$ at $z>8$ | ✔ | ✔ | Robustness check on L1 |
| Partial correlations | $\rho=+0.26$ after full polynomial control; $\rho(t_{\rm eff}, A_V \mid t_{\rm cosmic})=+0.600$; $M_*$ zero residual after $t_{\rm eff}$ control | ✔ | ✔ | Robustness check on L1 |
| Cross-survey generalization | Polynomial $R^2$ collapses to $-6.4$ cross-survey; $t_{\rm eff}$ stable at $\rho=0.60$–0.80 (same dust observable as L1) | ✔ | ✔ | Robustness check on L1 |
| Age coherence | $\rho = +0.14$ (mass-only); vanishes with $M_*$+$z$ control | ✔ | ✘ | Not independent (mass proxy) |
| Metallicity | $\rho = +0.16$ (mass-only); vanishes with $M_*$+$z$ control | ✔ | ✘ | Not independent (mass proxy) |
| Environmental screening / protocluster switch | Full-sample field-vs-dense split gives $\Delta\rho = +0.19$ ($Z=4.68$, $p=2.9\times10^{-6}$), but the targeted $z>8$ contrast is null ($\Delta\rho = +0.003$, $p=0.97$); the rebuilt DJA/JADES sign-reversal companion on real DJA root fields is mixed/null, with matched $\beta$ residual $\Delta_{\rm dense-field}=+0.086$ and 95% CI $[-0.019,+0.188]$ | ✔ | ✔ | Supplementary consistency check — mixed / not counted |
| Colour-gradient sign test | Real JADES resolved photometry retains the raw mass-gradient trend ($\rho(M_*, \nabla_{\rm color}) = -0.166$, $p = 5.7\times10^{-3}$), and the raw $\Gamma_t$-gradient correlation remains weak ($\rho = -0.105$, $p = 8.1\times10^{-2}$; Steiger $Z = 1.92$, $p = 0.055$). The direct partial remains null under both observed-mass and debiased-mass control ($\rho = +0.011$, $p = 0.85$; $\rho = -0.015$, $p = 0.80$), but the debiased q33/q67 sign test is directionally TEP-consistent (negative-gradient fraction $0.581$ vs $0.457$, Fisher $p=0.061$, $\Delta=-0.063$) while the literal $\Gamma_t > 1$ tail remains underpowered | ✔ | ✘ | Provisional ancillary follow-up — not counted |
3.1 Red Monsters: A Zero-Parameter Prediction
The TEP parameterization is applied to galaxies in the Red Monster regime ($z \sim 5$–$6$, $\log M_* \gtrsim 10.5$; Xiao et al. 2024). This is a blind prediction: the external Cepheid prior $\alpha_0 = 0.58 \pm 0.16$ is fixed entirely from local Cepheid data ($z \approx 0$) before the high-redshift fit. The later high-redshift concordance analyses recover values near $\alpha_0 \approx 0.55$, but these are internal consistency checks rather than the input prior. No parameters are fitted or tuned to the high-redshift observations. The three entries below (S1–S3) use representative parameters spanning the published range (§2.1.1); the predicted correction depends primarily on $\Gamma_t$ and is insensitive to the exact input SFE.
Because the sample contains only three objects, the Red Monster calculation is best read as an illustrative zero-parameter case study rather than as a standalone statistical test. The primary statistical weight comes from the population-level analyses ($N = 2{,}315$), and the externally calibrated prediction is further checked across three surveys ($N = 1{,}283$ at $z > 8$).
| ID | $z$ | $\alpha(z)$ | $\Gamma_t$ (Predicted) | SFE$_{\rm obs}$ | SFE$_{\rm true}$ | % Anomaly Resolved |
|---|---|---|---|---|---|---|
| S1 | 5.85 | 1.52 | 12.98 | 0.50 | 0.08 | 100% |
| S2 | 5.30 | 1.46 | 7.55 | 0.50 | 0.12 | 100% |
| S3 | 5.55 | 1.48 | 7.53 | 0.50 | 0.12 | 100% |
| Average Prediction | 9.35 | 0.50 | 0.11 | 100% | ||
The predicted mass bias $\Gamma_t^{n} \approx 4.7$ reduces the corrected SFE to $\sim 0.11$ (well below the standard $\Lambda$CDM limit of 0.20). The anomaly is fully resolved for all three objects. Propagating the external Cepheid prior $\alpha_0 = 0.58 \pm 0.16$ uncertainty ($\pm 28\%$) confirms robustness: even at $\alpha_0 = 0.42$ (lower $1\sigma$ bound), the corrected SFE remains well below 0.20 and the anomaly is still fully resolved with zero tuned parameters.
3.3 UNCOVER DR4: Mass-sSFR and Mass-Age Correlations
The Red Monster case study establishes that TEP predicts the correct direction and magnitude of the SFE correction for individual extreme objects. The critical question is whether this signal extends to the full galaxy population. In the UNCOVER DR4 sample ($N = 2{,}315$), the mass-sSFR correlation is weak but significant ($\rho = -0.13$, $p = 1.3 \times 10^{-10}$, Cohen's $d = -0.27$), consistent with TEP partially canceling the intrinsic downsizing trend. The mass-age correlation is positive ($\rho = +0.14$, $p = 7.0 \times 10^{-11}$), consistent with more massive galaxies experiencing more proper time. Both correlations are in the predicted direction but are attenuated by the full redshift range; the signal sharpens substantially when the sample is stratified by redshift.
3.3 Redshift Evolution: The High-z Transition
TEP predicts that the mass-sSFR correlation should become less negative (or even positive) at higher redshift, where the TEP enhancement is stronger. This is tested by stratifying the sample:
| $z$ Range | $N$ | Spearman $\rho$ | 95% CI | Interpretation |
|---|---|---|---|---|
| 4–5 | 942 | $-0.17$ | [$-0.24$, $-0.11$] | Standard downsizing |
| 5–6 | 497 | $-0.14$ | [$-0.22$, $-0.05$] | Standard downsizing |
| 6–7 | 372 | $-0.06$ | [$-0.16$, $+0.04$] | Weakening |
| 7–8 | 221 | $+0.18$ | [$+0.05$, $+0.31$] | Inversion |
| 8–9 | 179 | $+0.13$ | [$-0.03$, $+0.29$] | Weak positive |
| 9–10 | 104 | $-0.27$ | [$-0.47$, $-0.05$] | Reversal (selection effects) |
Comparing low-$z$ ($4 < z < 6$, $\rho = -0.16$) to high-$z$ ($z > 7$, $\rho = +0.09$): $\Delta\rho = +0.25$ [+0.14, +0.35] (95% CI excludes zero), indicating a statistically significant inversion.
3.4 Partial Correlation Test
The redshift evolution in §3.3 is consistent with TEP, but by itself it does not eliminate the mass-proxy concern, since $\Gamma_t \propto M_h^{1/3}$. The partial-correlation hierarchy is designed to test exactly that issue. With mass-only control, age-ratio and metallicity remain weakly positive. With joint mass+redshift control they become consistent with zero, so they are treated as mass-proxy-adjacent rather than independent. The high-redshift dust signal behaves differently: at $z > 8$, the dust–$\Gamma_t$ correlation survives ($\rho = +0.262$, $p = 8.1 \times 10^{-6}$, Cohen's $d = +0.55$), indicating that $\Gamma_t$ carries information about dust beyond mass alone. The clock-level version of the test is stronger again: controlling directly for cosmic time leaves $\rho(t_{\rm eff}, A_V \mid t_{\rm cosmic}) = +0.600$ ($p = 5.0 \times 10^{-29}$), so the signal is not a trivial restatement of redshift ordering.
The key asymmetry: $t_{\rm eff}$ subsumes the mass information relevant to dust ($M_*$ residual after $t_{\rm eff}$ control: $\rho = -0.006$, $p = 0.92$), but mass does not subsume $t_{\rm eff}$ (residual $\rho = +0.26$, $p = 7.4 \times 10^{-6}$). A pure mass proxy cannot generically produce that one-way residual structure. If $\Gamma_t$ were only a reparameterisation of $M_*$, the relationship would be symmetric and neither predictor would retain residual information once the other was controlled.
A further complication is that the strongest mass-proxy objection becomes self-defeating once the isochrony-bias mechanism is taken seriously. If TEP is correct, SED-inferred stellar masses are themselves biased upward by $\Gamma_t^{0.7}$ (§4.4.6.3). Partial-correlation tests that control for observed $M_*$ therefore over-control the true signal by removing TEP-predicted variance. The partial correlations reported here are accordingly conservative lower bounds, understated by a factor of $\sim 2.5$ at $z > 8$. The cleanest route around that circularity is L4 (dynamical masses; §3.9), where kinematic observables replace SED mass estimates.
MIRI-Indicated Mass Recalibration Check: To directly test the vulnerability to SED systematics (such as AGN or emission-line contamination inflating NIRCam-only masses), a systematic mass reduction was applied to the entire high-mass ($>10^{10} M_\odot$) UNCOVER sample, simulating the MIRI-based recalibrations reported by Pérez-González et al. (2024). Even when all masses are artificially reduced by 0.5 dex, the $\Gamma_t$-dust correlation at $z > 8$ remains completely robust ($\rho = +0.599$, $p = 5.8 \times 10^{-29}$). The signal survives even a full 1.0 dex systematic reduction ($\rho = +0.598$, $p = 6.9 \times 10^{-29}$). This confirms that the TEP signal is driven by the relative rank-ordering of galaxies within the deep potential regime, not by the calibrated photometric mass calibration, and is therefore structurally immune to the primary MIRI systematic critique.
3.5 Screening Signatures
A distinctive feature of the TEP framework — one that distinguishes it from any smooth mass-dependent function — is the screening prediction: above a critical density $\rho_c \approx 20$ g/cm³, the scalar field is suppressed and $\Gamma_t \to 1$. Paper 11 (TEP-COS) established an effective kinematic screening threshold at $\sigma > 165$ km/s from globular cluster pulsar timing. At high redshift, this threshold shifts to higher halo mass. Screening is tested by comparing age ratios (MWA/$t_{\rm cosmic}$) across mass bins:
| $\log M_h$ | $N$ | $\langle$MWA/$t_{\rm cosmic}\rangle$ | $\Gamma_t$ Predicted |
|---|---|---|---|
| 10–11 | 390 | $0.15 \pm 0.003$ | $\sim 0$ (reference) |
| 11–12 | 42 | $0.18 \pm 0.015$ | 0.2–0.5 |
| 12–12.5 | 3 | $0.30 \pm 0.12$ | 1.0–1.5 |
| 12.5–13 | 1 | $0.05$ | 1.5–2.0 |
3.5.1 Resolved Core Screening
TEP predicts that deep core potentials should screen the scalar field locally while outskirts remain enhanced, producing a structurally concentrated, bluer-core signature in massive galaxies. The strongest live L2 support now comes from the preferred JADES DR5 direct-mass morphology sample: after controlling for mass and redshift, four structural proxies remain supportive in the expected direction, with $r_{\rm half,F277W}$ and $r_{\rm half,F444W}$ partial $\rho = -0.256$ ($p = 3.7 \times 10^{-7}$), Gini partial $\rho = +0.361$ ($p = 2.9 \times 10^{-13}$), and $\sigma_\star$ partial $\rho = +0.624$ ($p = 7.0 \times 10^{-43}$) for $N = 384$. The resolved colour-gradient branch remains informative but weaker: for $N = 277$ galaxies it still shows the raw mass-gradient trend $\rho(M_*, \nabla_{\rm Color}) = -0.166$ ($p = 5.7 \times 10^{-3}$), while the direct $\Gamma_t$ correlation is $\rho(\Gamma_t, \nabla_{\rm Color}) = -0.105$ ($p = 8.1 \times 10^{-2}$). The direct partial remains null under both observed-mass and debiased-mass control ($\rho = +0.011$, $p = 0.85$; $\rho = -0.015$, $p = 0.80$). The literal $\Gamma_t > 1$ tail is too small to decide the sign-reversal test cleanly, but after the L4-motivated debiased-mass control the q33/q67 high-versus-low screening split becomes directionally supportive: the negative-gradient fraction rises from $0.457$ to $0.581$ (Fisher $p = 0.061$) with mean contrast $\Delta = -0.063$. The spatial-screening branch is therefore retained as an ancillary indication rather than counted among the two primary statistical lines. See §4.4.3 and the robustness checks note in §3.9 for full details.
3.6 The z > 8 Dust Anomaly: Correlation vs. Budget
The mass–sSFR inversion (§3.3) and the partial-correlation hierarchy (§3.4) show that $\Gamma_t$ carries information beyond a simple mass trend. The clearest physical test, however, is dust production. Can the observed dust reservoirs at $z > 8$ be produced in the available time? Under standard physics, the universe at $z \sim 9$ is only $\sim 540$ Myr old, barely enough for the first generation of AGB stars to complete their evolution. Quantitative analysis using canonical dust parameters (AGB delay $\sim 500$ Myr, standard ISM opacity) therefore exposes a direct tension between the observed dust reservoirs and the standard-time budget.
Dust Budget Analysis ($N=33$ massive galaxies at $z > 8$)
Comparing observed dust masses to the maximum theoretical yield under canonical assumptions:
| Framework | Mean Deficit Ratio | "Yield Violation" Candidates ($> 2\times$ Limit) |
|---|---|---|
| Standard Physics ($t = t_{\rm cosmic}$) | 0.91$\times$ (Saturation) | 8 / 33 (24%) |
| TEP ($t = \Gamma_t t_{\rm cosmic}$) | 0.41$\times$ (Comfortable) | 0 / 33 (0%) |
Under standard physics, the average massive galaxy is near the theoretical production limit, with ~24% of the sample requiring unphysical yields. Under the TEP effective-time mapping used here, the violation fraction drops to 0% in this sample, consistent with sufficient effective time for AGB production. Recent JWST spectroscopy shows that AGB stars produce SiC and iron dust even at low metallicity ($\sim 1$–$7\%\,Z_\odot$; Boyer et al. 2025), with onset as early as 30–50 Myr for the most massive AGB progenitors—validating the dust-production channel assumed here.
The "Optimistic" Trap. One might attempt to resolve the standard-physics deficit by assuming optimistic parameters — maximal supernova yields, minimal destruction, accelerated AGB onset. While this can technically close the budget (reducing the violation fraction to 0%), it creates a deeper problem: the Uniformity Paradox. If parameters are tuned to allow dust everywhere (since $t_{\rm cosmic}$ is uniform), dust should be ubiquitous or track star formation. Instead, observations reveal a strong mass-dependent suppression ($\rho = +0.56$): massive galaxies are dusty; low-mass galaxies are dust-poor. No tuning of a time-uniform parameter can reproduce a mass-dependent gradient. Under TEP, this gradient arises naturally: the framework suppresses effective time in low-mass halos ($\Gamma_t \ll 1 \rightarrow t_{\rm eff} \ll 300$ Myr), shutting off the AGB channel, while in massive halos $\Gamma_t > 1$ ensures it remains open. The anomaly is not that massive galaxies have dust — it is that low-mass galaxies don't, in a pattern that tracks gravitational potential depth.
Figure 3: The Dust Saturation Crisis. The ratio of observed dust mass to the maximum theoretical yield is plotted for massive galaxies at $z > 8$. Standard Physics (blue) places the population near the saturation limit (100% of yield), leaving no margin for error. TEP (orange) shifts the population to approximately 40% of the limit. While standard physics is technically possible, it requires near-maximal efficiency across all parameters simultaneously; TEP requires only typical efficiencies.
3.6.1 The $z = 6$–$7$ Dip: Quantitative Forensics
The negative mass-dust correlation at $z = 6$–$7$ ($\rho = -0.12$, $p < 0.05$; Cohen's $d = -0.24$, small effect) breaks the expected monotonic strengthening toward $z > 8$. Rather than speculate, quantitative forensics are performed to identify the physical mechanism.
Three hypotheses are tested:
- sSFR-Driven Dust Destruction: High specific star formation rates drive supernova rates that disrupt dust faster than it can accumulate.
- Sample Composition: The $z = 6$–$7$ bin may have systematically different mass/dust distributions.
- Selection Effects: UV-bright (low-dust) massive galaxies may be preferentially detected.
| $z$ Range | $\rho$(sSFR, $A_V$) | Massive Fraction | Dusty Fraction | $\rho(\Gamma_t, A_V | M_*)$ |
|---|---|---|---|---|
| 4–5 | $-0.03$ | 13.2% | 1.7% | $-0.05$ |
| 5–6 | $-0.04$ | 12.5% | 3.2% | $-0.21$ |
| 6–7 | $\mathbf{-0.34}$ | 7.5% | 1.6% | $+0.05$ |
| 7–8 | $-0.18$ | 6.3% | 10.9% | $-0.13$ |
| 8–10 | $-0.22$ | 11.7% | 26.2% | $+0.27$ |
Primary Mechanism: sSFR-Driven Dust Destruction
The $z = 6$–$7$ bin shows the strongest sSFR-dust anticorrelation of any redshift bin ($\rho = -0.34$, vs $\rho \approx -0.03$ at $z = 4$–$5$). This indicates that galaxies with high specific star formation rates are actively depleting dust through supernova shocks faster than AGB stars can replenish it—and this effect is maximally expressed at $z \sim 6$–$7$. The sSFR-dust anticorrelation peaks at $z=6$–$7$ ($\rho = -0.34$, Cohen's $d = -0.72$, medium effect), significantly stronger than at $z>8$ ($\rho = -0.22$) or $z<6$ ($\rho \approx -0.03$). This suggests that supernova-driven dust destruction maximally outpaces production during this epoch.
At $z \sim 6.5$, the universe is $\sim 840$ Myr old — long enough for the first generation of AGB stars to begin producing dust, but short enough that ongoing starbursts generate high supernova rates. This creates a transient "competition epoch" in which destruction outpaces production. At $z \gtrsim 8$, the cosmic timeline is so compressed that only halos with $\Gamma_t > 1$ have accumulated sufficient effective time for AGB dust production, while low-$\Gamma_t$ halos do not — restoring the positive mass-dust correlation and explaining why the signal strengthens precisely at the epoch where standard physics predicts it should be absent.
A critical test of any claimed physical effect is independent replication across datasets with different systematic biases. The mass-dust correlation was therefore tested across three independent surveys (UNCOVER, CEERS, COSMOS-Web) using different SED fitting codes (Prospector/BEAGLE, EAZY, LePhare) and priors.
3.7 Cross-Survey Replication and Meta-Analysis
3.7.1 Cross-Code Robustness
The $z > 8$ dust-$\Gamma_t$ correlation is detected in all three datasets despite differences in methodology:
| Survey | Code | $N$ (z > 8) | $\rho(\Gamma_t, \text{Dust})$ | 95% CI | $p$-value | Significance |
|---|---|---|---|---|---|---|
| UNCOVER | Prospector/BEAGLE | 283 | $+0.59$ | $[+0.51, +0.66]$ | $p = 3.0 \times 10^{-28}$ | $11.4\sigma$ |
| CEERS | EAZY | 82 | $+0.66$ | $[+0.52, +0.77]$ | $p = 1.5 \times 10^{-11}$ | $7.0\sigma$ |
| COSMOS-Web | LePhare | 918 | $+0.63$ | $[+0.59, +0.67]$ | $p = 3.5 \times 10^{-102}$ | $22.4\sigma$ |
| Fixed-effects meta | $+0.62$ | $[+0.59, +0.66]$ | $p = 1.0 \times 10^{-149}$ | $26.1\sigma$ | ||
3.7.2 Meta-Analysis
Combining all three surveys yields a combined sample of 1,283 galaxies at $z > 8$ with detected dust. A fixed-effects meta-analysis gives a combined correlation of $\rho = +0.62$ $[+0.59, +0.66]$ with $p = 1.0 \times 10^{-149}$. Heterogeneity is low ($I^2 = 0\%$; Cochran's $Q = 1.04$, $p_Q = 0.60$), indicating consistent effect sizes across surveys. The corresponding Cohen's $d = 1.59$ (large effect), computed from the combined $\rho$ via $d = 2\rho/\sqrt{1-\rho^2}$.
Random-effects meta-analysis: To relax the assumption of a common true effect size, a DerSimonian-Laird random-effects model yields $\rho_{\rm RE} = +0.623$ $[+0.558, +0.656]$, virtually identical to the fixed-effects estimate. Between-study heterogeneity is negligible ($I^2 = 0\%$; Cochran's $Q = 1.04$, $p_Q = 0.60$). A leave-one-out influence analysis shows that no single survey drives the combined result.
Significance bookkeeping: elsewhere in the manuscript the headline cross-survey replication statistic is the more conservative three-survey Fisher combination, $z = 24.4\sigma$. The larger value implied by the fixed-effects meta-analysis here reflects a different pooling method applied to the per-survey effect estimates rather than a contradiction in the underlying measurements.
Mass-stratified confirmation: To test whether the dust–$\Gamma_t$ correlation is an artifact of mass confounding, the combined $z > 8$ sample ($N = 1{,}283$) is split into mass bins of $0.25$ dex width. The correlation is detected in the lowest mass bin ($\log M_* \sim 8.1$: $\rho = +0.28$, $p = 0.002$, $N = 118$) and re-emerges strongly at high mass ($\log M_* \sim 10.1$–$10.4$: $\rho = +0.45$, $p < 10^{-4}$, $N \sim 100$), indicating that the signal persists at fixed mass and is not driven by the mass–dust scaling alone.
3.7.3 Temporal Inversion & AGB Threshold
A more physically targeted and falsifiable test compares dust against cosmic time ($t_{\rm cosmic}$) versus the TEP-effective clock ($t_{\rm eff} = \Gamma_t\,t_{\rm cosmic}$). Under standard physics, dust should track $t_{\rm cosmic}$; under TEP, dust emergence should be organized by $t_{\rm eff}$ and should show a step-like transition near the AGB dust-production timescale ($t_{\rm eff} \gtrsim 0.3$ Gyr).
| Survey | $\Delta\rho = \rho(t_{\rm eff}, A_V) - \rho(t_{\rm cosmic}, A_V)$ | Dust ratio ($t_{\rm eff} > 0.3$ Gyr) | $p$ (threshold) |
|---|---|---|---|
| UNCOVER | $+0.605$ | $2.04\times$ | $4.8 \times 10^{-15}$ |
| CEERS | $+0.711$ | $3.48\times$ | $1.2 \times 10^{-7}$ |
| COSMOS-Web | $+0.862$ | $2.15\times$ | $1.5 \times 10^{-11}$ |
To test whether the location of the step is being tuned to a particular survey, a leave-one-survey-out holdout validation is performed. The threshold selected on the training surveys has median $t_{\rm eff} = 1.93$ Gyr (range $0.06$–$1.93$ Gyr). Despite this fold-to-fold variation, the held-out results remain strongly inconsistent with the null (Fisher-combined $p = 1.1 \times 10^{-25}$). Using the fixed AGB-motivated threshold $t_{\rm eff} > 0.3$ Gyr yields a Fisher-combined $p = 1.5 \times 10^{-252}$.
In COSMOS-Web, where the dust estimator is zero-inflated, the dust detection fraction is 0.73 above threshold versus 0.09 below threshold (Fisher exact test; p-value $< 10^{-10}$). An independent combined-survey threshold scan ($N = 2{,}971$) confirms the transition: the optimal $t_{\rm eff}$ threshold is $1.9 \pm 0.3$ Gyr (bootstrap 16th–84th percentile), with a combined odds ratio of 42.8 ($p \approx 10^{-40}$). This cross-survey temporal-inversion behavior directly tests the core TEP mechanism ($t_{\rm eff}$ controlling dust emergence) and is not a generic "more massive galaxies are dustier" statement.
A dedicated UNCOVER-only validation battery independently passes all four targeted tests: the AGB threshold gives a 2.04$\times$ dust ratio ($p = 4.8 \times 10^{-15}$); controlling for cosmic time leaves $\rho(t_{\rm eff}, A_V \mid t_{\rm cosmic}) = +0.600$ ($p = 5.0 \times 10^{-29}$); the $t_{\rm eff}$–dust correlation remains positive in both low- and high-mass halves ($\rho = +0.39$ and $+0.48$); and the raw mass-dust signal steepens monotonically from $z = 8$–$8.5$ to $z = 9$–$10$ ($\rho = +0.325 \rightarrow +0.716$).
3.7.3.1 AGB Dust Phase Boundary in ($M_*$, $z$) Space
The AGB onset threshold $t_{\rm eff} = 0.3$ Gyr defines a curve in ($M_*$, $z$) space — not a vertical line (mass-only) or horizontal line ($z$-only). Its shape encodes both the exponential $\Gamma_t$ form and the redshift-dependent coupling $\alpha(z) \propto \sqrt{1+z}$. A mass-only threshold cannot replicate this curve.
Using the UNCOVER sample ($N = 2{,}315$) with $A_V > 0.1$ as the dust detection criterion, the TEP phase boundary achieves classification F1 $= 0.742$ (precision $= 0.759$, recall $= 0.725$). Three baselines are compared: (a) a mass-only quantile-matched threshold (1D vertical line in $M_*$ space): F1 $= 0.408$ ($\Delta$F1 $= +0.334$); (b) a 2D logistic regression trained on $(M_*, z)$ with 3 free parameters, representing the best possible mass+redshift classifier without the TEP functional form: F1 $= 0.611$ ($\Delta$F1 $= +0.131$ for TEP over fitted 2D model); (c) a redshift-only step at $z = 8$: F1 $= 0.519$. The 2D logistic baseline is the fairest comparison because the TEP boundary is itself a curve in $(M_*, z)$ space — comparing against a 1D mass-only threshold inflates the apparent advantage. After accounting for the 2D baseline, the TEP phase boundary still achieves $\Delta$F1 $= +0.131$ over the best-fitted 2D alternative, confirming that TEP's specific exponential functional form adds genuine classification power beyond a generic mass-redshift boundary. At $z > 8$: every galaxy above the TEP boundary is dusty (62/62 $= 100\%$), while 88.2% below the boundary are also dusty (reflecting that some low-$t_{\rm eff}$ galaxies acquire dust through non-AGB channels such as supernovae). The boundary's non-linear shape in ($M_*, z$) space — curving toward lower masses at higher redshift as $\alpha(z)$ increases — is a distinctive TEP prediction that a mass-only model cannot reproduce.
3.7.4 The Time-Lens Map: Effective Redshift $z_{\rm eff}$
To express the dust-clock result in a coordinate that is directly comparable across observed redshift, an effective redshift $z_{\rm eff}$ is defined by solving $t_{\rm cosmic}(z_{\rm eff}) = t_{\rm eff} = \Gamma_t\,t_{\rm cosmic}(z_{\rm obs})$. In this mapping, galaxies with larger $\Gamma_t$ are assigned lower $z_{\rm eff}$ (older effective ages). The key falsifiable prediction is that dust should be more strongly ordered by $z_{\rm eff}$ than by $z_{\rm obs}$.
| Survey | $N$ | $\rho(A_V, z_{\rm obs})$ | $p$ | $\rho(A_V, z_{\rm eff})$ | $p$ |
|---|---|---|---|---|---|
| UNCOVER | 283 | $+0.006$ | $0.92$ | $-0.599$ | $6.4 \times 10^{-29}$ |
| CEERS | 82 | $+0.052$ | $0.64$ | $-0.659$ | $1.7 \times 10^{-11}$ |
| COSMOS-Web | 918 | $+0.230$ | $1.8 \times 10^{-12}$ | $-0.631$ | $3.4 \times 10^{-103}$ |
Across surveys, $|\rho(A_V, z_{\rm eff})| > |\rho(A_V, z_{\rm obs})|$. Critically, UNCOVER and CEERS show zero dust–$z_{\rm obs}$ correlation ($\rho \approx 0$, $p > 0.6$), while the TEP effective-time coordinate yields $|\rho| > 0.6$. Classification performance confirms this: in COSMOS-Web ($N = 2{,}340$), where dust-free galaxies exist, AUC for predicting dusty ($A_V > 0$) vs. dust-poor galaxies is $0.92$ for $t_{\rm eff}$ vs. $0.73$ for $t_{\rm cosmic}$ vs. $0.91$ for $M_*$. The combined three-survey AUC is $0.83$ for $t_{\rm eff}$ vs. $0.80$ for $M_*$ vs. $0.72$ for $t_{\rm cosmic}$. (Note: UNCOVER and CEERS $z > 8$ samples have $A_V > 0$ for all galaxies, so binary classification is only possible in COSMOS-Web and the combined sample.)
3.7.5 Functional Form Discrimination
A pure mass proxy makes a specific set of predictions. It should produce dust that increases monotonically with $M_*$ at all redshifts, it should generalize cross-survey because mass is survey-independent, and it should not generate the sign inversion seen in L3. TEP predicts the opposite pattern: little or no dust–mass correlation at $z < 7$, emergence at $z > 8$, and a non-linear AGB threshold that curves in ($M_*, z$) space. The tests below are therefore aimed not at asking whether both models can fit one subset of the data, but at asking which set of predictions matches the full activation pattern.
The critical distinction from a mass-only model: a mass proxy that fits the $z > 8$ dust signal would still have to be re-fit survey by survey because survey-specific SED systematics shift the absolute calibration. By contrast, $\Gamma_t$, calibrated once from local Cepheids, maintains $\rho = 0.60$–$0.80$ across three surveys with no retraining. The Steiger tests below therefore compare not just two correlated predictors, but two different claims about what should remain stable across datasets:
- Within-regime ($z > 8$, $N = 2{,}694$) — primary comparison: $\rho(\text{dust}, t_{\rm eff}) = +0.57$ vs. $\rho(\text{dust}, M_*) = +0.53$; Steiger $Z = 2.4$, $p = 0.016$. This is the honest within-regime comparison: within the high-$z$ subsample where both predictors are operating in their intended domain, $t_{\rm eff}$ adds statistically significant information beyond mass alone. The advantage is real but modest — the primary evidence for $t_{\rm eff}$ over $M_*$ within $z > 8$ is a $Z = 2.4$ test, not an overwhelming superiority.
- Full sample ($z = 4$–$10$, $N = 4{,}726$) — activation pattern test: $\rho(\text{dust}, t_{\rm eff}) = +0.50$ vs. $\rho(\text{dust}, M_*) = +0.17$; Steiger $Z = 17.8$, $p = 1.3 \times 10^{-70}$. Important framing: this large $Z$ does not primarily measure whether TEP's exponential formula is superior to mass within any given redshift regime — it measures that $t_{\rm eff}$ correctly predicts both the absence of the dust–mass correlation at $z = 4$–$7$ and its emergence at $z > 8$. The activation pattern itself is the signal, not the within-regime slope. Accordingly this is classified as a test of TEP's redshift-dependent activation prediction, not a head-to-head mass vs. $t_{\rm eff}$ comparison. The full-sample $t_{\rm eff}$ vs. $t_{\rm cosmic}$ Steiger test is much weaker ($Z = 2.6$, $p = 0.010$; §3.7.4), confirming that most of the $Z = 17.8$ comes from the cross-regime activation, not from the $\Gamma_t$ scaling per se.
- $t_{\rm eff}$ vs. $t_{\rm cosmic}$ per-survey ($z > 8$): This test is better controlled than the $t_{\rm eff}$ vs. $M_*$ comparison because $t_{\rm cosmic}$ and $t_{\rm eff}$ are measured in the same units and differ only by $\Gamma_t$. The Steiger test is overwhelmingly significant in every survey: UNCOVER ($Z = 6.7$, $p = 1.6 \times 10^{-11}$, $N = 283$), CEERS ($Z = 5.3$, $p = 9.3 \times 10^{-8}$, $N = 71$), COSMOS-Web ($Z = 16.8$, $p = 4.5 \times 10^{-63}$, $N = 2{,}340$). Combined $z > 8$: $Z = 10.4$, $p = 1.8 \times 10^{-25}$. This confirms that the $\Gamma_t$ scaling of the clock adds real information beyond raw cosmic time — the strongest and most cleanly specified test in this section.
- Mass-to-light proxy support: the JADES mass-to-light proxy correlates with $\Gamma_t$ at $\rho = +0.599$ ($p = 1.6 \times 10^{-42}$), strengthening to partial $\rho = +0.741$ after redshift control ($p = 8.5 \times 10^{-75}$).
- TEP mass measurement bias: SED-inferred $M_{*,\rm obs}$ is itself biased by $\Gamma_t^{0.7}$. This creates a self-defeating proxy argument: the strongest form of the mass-proxy concern is difficult to reconcile with the TEP mass-bias hypothesis. The reported partial correlations are conservative lower bounds, understated by $\sim 1.9\times$ at $\beta = 0.7$. Two previously weak results (O32 ionization ratio, H$\beta$ EW) strengthen under debiased mass control (§4.4.6.3).
- COSMOS2025 broad-sample support: across the full $z > 4$ blank-field sample the mass+redshift-controlled dust partial remains positive at $\rho = +0.201$ ($p < 10^{-300}$), and the combined $z > 7$ blank-field subset reaches $\rho = +0.123$ ($p = 1.0 \times 10^{-25}$). The COSMOS2025 branch therefore supports L1 as a broad blank-field tendency rather than as a single isolated $z = 9$–13 partial spike.
- COSMOS2025 blank-field diagnostics: Across the full $z > 4$ sample, the mass+redshift-controlled dust partial remains positive at $\rho = +0.201$ ($p < 10^{-300}$). The sSFR follow-up is mixed rather than uniformly supportive: the matched $z = 8$–9 bin is positive after quality weighting and debiased mass control ($\rho = +0.074$, $p = 3.2 \times 10^{-2}$), whereas the ultrahigh-$z$ $z = 9$–13 bin is negative (weighted debiased $\rho = -0.165$, $p = 1.6 \times 10^{-7}$) and is therefore treated as an auxiliary diagnostic rather than as a primary replication.
- Cross-survey temporal ordering: The effective-time ordering test is recovered independently in UNCOVER, CEERS, and COSMOS-Web with $\Delta\rho_{\rm time} = +0.605$, $+0.711$, and $+0.862$.
- UNCOVER DR4 MegaScience branch: In the broad photometric SPS sample, the dust signal is null below $z = 7$ and then reaches $\rho = +0.492$ at $z = 8$–9, consistent with high-redshift activation of the effect. The apparent $z = 9$–12 null is not driven by sample collapse; the live audit instead points to compressed dust posteriors and substantially inflated redshift uncertainties in that tail, and a new posterior-broad stacked surrogate recovers a positive high-$\Gamma_t$ reddening contrast.
- Dedicated UNCOVER $z > 8$ dust battery: All four targeted tests pass, including the AGB threshold, the cosmic-time-controlled partial correlation, split-sample persistence, and monotonic steepening with redshift.
- UNCOVER $z = 9$–12 posterior-broad stack: Restricting to the posterior-broad tail ($N = 61$) and comparing the upper and lower $\Gamma_t$ quartiles ($N = 16 + 16$) yields a weighted stacked contrast of $\Delta \text{dust2} = +0.249$ with 95% CI $[+0.032, +0.468]$. The same split is redder in both rest-frame colours, with $\Delta(U-V) = +0.341$ and $\Delta(V-J) = +0.335$, both with positive bootstrap intervals. The highest-$z$ UNCOVER weakness is therefore better described as a sensitivity-limited tail whose broad-posterior objects still organize in the TEP-predicted direction when stacked.
- JADES DR5 controlled morphology support: In the preferred direct-mass morphology sample, four structural proxies remain supportive after mass+$z$ control at $z > 7$: $r_{\rm half,F277W/F444W}$ partial $\rho = -0.256$ ($p = 3.7 \times 10^{-7}$), Gini partial $\rho = +0.361$ ($p = 2.9 \times 10^{-13}$), and $\sigma_\star$ partial $\rho = +0.624$ ($p = 7.0 \times 10^{-43}$) for $N = 384$. The support is therefore structural and centrally concentrated rather than a generic raw compactness trend.
- JADES $z = 9$–12 UV-slope companion: A conservative photometric UV-slope sample ($N = 28$ with usable $\beta$ estimates) is directionally consistent with the UNCOVER stack. The raw $\rho(\Gamma_t, \beta)$ is positive ($+0.259$, $p = 0.18$), and the upper-vs-lower $\Gamma_t$ quartile split gives a weighted $\Delta\beta = +0.941$ (redder at higher $\Gamma_t$), albeit with a broad 95% CI $[-0.384, +3.299]$. This branch is therefore treated as a low-power companion rather than as a standalone detection.
- Debiased mass control: Correcting for TEP mass bias strengthens previously weak O32 and H$\beta$-equivalent-width signals by $\sim 1.5\times$–$1.9\times$, as expected if observed stellar masses over-control the effect.
- Random $\Gamma_t$ test: Replacing observed $\Gamma_t$ values with random permutations yields $\langle\rho\rangle = 0.000 \pm 0.062$ ($z$-score $= 9.5$; 0 of 10,000 permutations exceed the observed $\rho = 0.59$).
- Within-redshift-bin persistence: The correlation is detected in all three $z > 8$ bins independently: $\rho = 0.32$ ($z = 8$–$8.5$, $N = 107$, $p = 9 \times 10^{-4}$), $\rho = 0.53$ ($z = 8.5$–$9$, $N = 72$, $p = 2 \times 10^{-6}$), $\rho = 0.73$ ($z = 9$–$10$, $N = 104$, $p < 10^{-18}$), ruling out a pure redshift-confounding origin.
- $\Gamma_t$ vs pure mass: $\Gamma_t$ ($\rho = 0.593$) outperforms both $\log M_*$ ($\rho = 0.559$) and $\log M_h$ ($\rho = 0.575$) as a dust predictor, consistent with the redshift-dependent component of $\Gamma_t$ carrying additional information beyond mass alone.
- Magnitude bias: The correlation is detected in both bright ($\rho = 0.50$) and faint ($\rho = 0.35$) subsamples. Result: 6 of 7 adversarial tests passed.
- Sign consistency: Dust–$\Gamma_t$ ($\rho = +0.59$, $p < 10^{-27}$) and mass–age ($\rho = +0.13$, $p < 10^{-10}$) correlations match predicted signs.
- Magnitude scaling: The correlation strengthens monotonically from low-$\Gamma_t$ quartile ($\rho = 0.42$) to high-$\Gamma_t$ quartile ($\rho = 0.55$), as predicted by a real physical effect.
- Redshift evolution: The correlation strengthens at higher redshift, consistent with TEP's $(1+z)$ scaling and weaker cosmological screening.
A separate nested Bayesian comparison provides a complementary robustness test against explicit astrophysical alternatives rather than against a null alone. Using a four-observable joint likelihood at $z \geq 7$ ($N = 504$; dust, $\log {\rm sSFR}$, $\chi^2$, metallicity), TEP is decisively preferred over Standard Physics ($\ln {\rm BF} = +13.2 \pm 0.8$) and Bursty SF ($+6.0 \pm 0.7$), and very strongly preferred over Varying IMF ($+3.7 \pm 0.8$). The only raw joint competitor is an unconstrained AGN-feedback sigmoid tied almost entirely to stellar mass, which behaves as a mass-threshold surrogate rather than as an independently anchored physical AGN predictor. In the residual-space comparison, once both observables and competing predictors are residualized against linear mass+$z$ trends and the AGN branch is restricted to contamination-sensitive channels, TEP is again decisively preferred over both the residual null ($\ln {\rm BF} = +99.1 \pm 0.7$) and the constrained AGN alternative ($+55.7 \pm 0.7$). This places the strongest Bayesian support on the part of the signal that remains after explicit mass-proxy control, rather than on the raw dust trend alone.
| Line | TEP Prediction | Observed | Significance | Replication |
|---|---|---|---|---|
| L1. Dust–$\Gamma_t$ + AGB threshold | $\rho > 0.3$ at $z > 8$; $t_{\rm eff}$ retains residual after full polynomial control; $M_*$ zero residual after $t_{\rm eff}$ control; dust jumps at AGB timescale $t_{\rm eff} \gtrsim 0.3$ Gyr | $\rho = +0.62$; partial $\rho = +0.26$ ($p = 7.4 \times 10^{-6}$); $M_*$ residual $\rho = -0.006$ ($p = 0.92$); odds ratio 42.8; $\Delta$AIC $\approx -4.8$ vs mass-matched threshold | $p = 2.8 \times 10^{-132}$ (three-survey Fisher); $p \approx 10^{-40}$ (threshold) | UNCOVER, CEERS, COSMOS-Web ($N = 1{,}283$–$2{,}971$); three-survey Fisher combination $z = 24.4\sigma$; dedicated UNCOVER battery passes 4/4. Supplementary DJA-based GOODS-S and Balmer analyses are not part of the primary evidence count. |
| L2. Inside-out core screening | Bluer-core result in more massive galaxies together with higher central concentration at larger $\Gamma_t$ after mass and redshift control; different survey, observable, and physical mechanism from L1 | The preferred JADES DR5 direct-mass morphology branch gives four supportive structural proxies after mass+$z$ control ($r_{\rm half,F277W/F444W}$ partial $\rho=-0.256$; Gini $=+0.361$; $\sigma_\star=+0.624$; $N=384$), while the resolved-gradient branch retains the raw mass trend ($\rho=-0.166$, $p=5.7\times10^{-3}$) and a directionally supportive debiased q33/q67 sign test (negative-gradient fraction $0.581$ vs $0.457$). | $p = 5.7 \times 10^{-3}$ for the mass trend; structural partials $p = 3.7 \times 10^{-7}$–$7.0 \times 10^{-43}$; debiased sign-test Fisher $p = 0.061$; direct gradient partials and the predictor-comparison extension remain non-significant | JADES resolved photometry ($N = 277$) plus the preferred JADES DR5 direct-mass morphology sample ($N_{\rm matched}=464$, $N_{\rm with\,mass}=384$); retained as an ancillary spatial indication because the structural support is strong but the direct gradient discriminator remains non-decisive. |
| L3. Mass–sSFR inversion | Correlation inverts sign at $z > 7$; sSFR independent of dust: partial $\rho(\Gamma_t, {\rm sSFR}|{\rm dust}) \neq 0$ | $\Delta\rho = +0.25$ ($\rho = -0.16 \to +0.09$); partial $\rho = -0.49$ ($p = 10^{-18}$) | 95% CI $[+0.14, +0.35]$ excludes zero | UNCOVER ($N = 2{,}315$) remains the primary L3 line; COSMOS2025 blank-field follow-up is mixed, with a supportive matched $z = 8$–9 bin but a negative ultrahigh-$z$ $z = 9$–13 branch, so it is treated as an auxiliary diagnostic rather than as a primary L3 replication. |
| L4. Dynamical mass comparison | TEP correction resolves $M_*/M_{\rm dyn} > 1$ via isochrony bias; evaluated as a real-data-derived regime comparison against published kinematic literature | Published excess 0.15 dex; TEP reduction 0.400 dex in the RUBIES-like $z\sim4.5$, $\log M_* > 10.5$ regime ($1.41 \rightarrow 0.56$) | Sufficient to remove the published anomaly | Derived regime-level comparison against published literature; not counted with the primary empirical lines |
Statistical independence: The primary empirical evidence package spans two distinct data types — photometric dust correlations (L1) and sSFR kinematics (L3). The L2 resolved-gradient result probes a third observable class and now receives strong controlled structural support from the preferred JADES DR5 direct-mass morphology sample, in which two half-light-radius proxies, Gini, and $\sigma_\star$ all retain the expected sign after mass+$z$ control, but it is still treated as an ancillary spatial indication rather than as part of the primary evidence count. L4 uses kinematic data entirely independent of SED fitting, but it is retained as a derived regime-level comparison rather than as a primary empirical line. Independence is supported quantitatively by the UNCOVER result partial $\rho(\Gamma_t, {\rm sSFR}|{\rm dust}) = -0.49$ ($p = 10^{-18}$), which shows that L3 carries information about $\Gamma_t$ orthogonal to dust. The COSMOS2025 blank-field sSFR follow-up is mixed across adjacent high-$z$ bins and is therefore treated as an auxiliary diagnostic rather than as the main orthogonality test. For broader combined-significance summaries, the manuscript uses the three-survey photometric L1 Fisher combination as the headline statistic and treats omnibus multi-test combinations as supportive context rather than as the primary claim.
Seven new cross-dataset results: These follow-up results strengthen and extend the case without altering the primary evidence count, because they reuse the same predictor families as L1 or L3.
3.9 TEP Predictions vs Observations Summary
Table 9 is best read as a compact consistency summary rather than as a count of independent confirmations. Several of the 12 listed predictions reuse the same underlying $\Gamma_t$ predictor derived from halo mass, so they are not statistically independent. The very high overall correlation ($r = 0.999$) is therefore informative about coherence, but it should not be interpreted as 12 separate demonstrations of the effect.
| Metric | Value | Interpretation |
|---|---|---|
| Raw Fisher combination (5-test synthesis) | $\chi^2 = 643.7$ | $z = 24.4\sigma$ |
| Brown adjustment (correlated tests) | $p = 2.6 \times 10^{-91}$ | $z = 20.3\sigma$ |
| $N_{\rm eff}$-Bonferroni stress test (10% effective $N$) | $p = 1.10 \times 10^{-3}$ | $z = 3.3\sigma$ |
| Effective independent tests | Mean $N_{\rm eff}/N \approx 11\%$ | After spatial-clustering autocorrelation correction |
The strongest evidence rests not on the number of predictions but on the coherence of the evidential structure and its robustness checks (§3.9): two primary empirical lines (L1, L3), together with the ancillary inside-out core-screening indication (L2) and the derived dynamical-mass comparison (L4). Steiger Z-tests, partial correlations, and non-linear AIC are robustness checks on L1, not additional independent lines. Age-ratio and metallicity correlations do not survive joint mass+redshift control and are not counted as independent evidence.
3.9.1 Adversarial Tests
A genuine physical signal should survive attempts to break it. To test whether the dust–$\Gamma_t$ correlation could arise from confounding, selection effects, or artifacts, a battery of adversarial tests is applied:
3.9.2 Falsification Battery
A pre-registered falsification battery tests six necessary conditions for the TEP framework. All six pass:
The full six-condition battery is documented in the supplementary materials.
3.9 Direct Kinematic Decisive Test
A fundamental vulnerability of evaluating TEP using purely photometric samples is the mass-proxy circularity: because $\Gamma_t$ is computed from halo mass (which in turn is inferred from photometric stellar mass), the observed correlations could in principle be driven by an unmodeled standard-physics process that scales with baryonic mass, rather than by a true temporal dilation tracking the gravitational potential.
The JWST-SUSPENSE survey of massive quiescent galaxies at $z = 1.2$–$2.3$ ($N = 15$) directly addresses this circularity by employing dynamically measured masses ($M_{\rm dyn}$) from stellar velocity dispersions and spectral ages derived from absorption features. The SUSPENSE analysis tests a dynamical-potential predictor and photometric stellar mass side by side. The central comparison shows that $\Gamma_t$ predicts spectral age more strongly than stellar mass, yielding $\rho({\rm Age}, \Gamma_t \mid z) = +0.733$ ($p = 1.9 \times 10^{-3}$) compared to $\rho({\rm Age}, M_* \mid z) = +0.493$ ($p = 0.062$). Under joint control of the competing predictor and redshift, $\Gamma_t$ retains a residual association with age, $\rho({\rm Age}, \Gamma_t \mid M_*, z) = +0.624$ ($p = 0.0129$), whereas stellar mass contributes no residual signal once $\Gamma_t$ is controlled, $\rho({\rm Age}, M_* \mid \Gamma_t, z) = -0.036$ ($p = 0.898$). Propagating the published asymmetric uncertainties for all 15 galaxies preserves a positive $\Gamma_t$ residual in 99.7\% of Monte Carlo draws. This one-sided residual structure supports the interpretation that galaxy evolution scales more closely with gravitational potential depth than with baryonic mass alone, and it materially narrows the photometric circularity objection.
A combined kinematic sample of $N = 83$ galaxies ($z = 1.2$–$7.6$) drawn from six independent surveys (SUSPENSE, Esdaile et al. 2021, Tanaka et al. 2019, de Graaff et al. 2024a, Saldana-Lopez et al. 2025, Danhaive et al. 2025) breaks the mass-proxy circularity directly. A sigma-only $\Gamma_t$ computed exclusively from measured velocity dispersion via a literature-calibrated $\sigma$-to-$M_{\rm halo}$ mapping, with zero dependence on SED-fitted $M_*$ or $M_{\rm dyn}$, adds significant predictive power for observed photometric $M_{*,\rm obs}$ beyond $\sigma$ and $z$ individually: partial $\rho(\Gamma_{t,\sigma}, M_{*,\rm obs} \mid \sigma, z) = +0.326$ ($p = 0.0026$, 95% CI $[+0.10, +0.51]$). This result holds at $z \geq 4$ ($\rho = +0.326$, $p = 0.014$, $N = 56$). Because $\Gamma_{t,\sigma}$ encodes the TEP-specific redshift-dependent functional form $\alpha(z) = \alpha_0\sqrt{1+z}$ and the $z$-dependent reference mass, the significant partial demonstrates that the TEP scaling captures real structure in the $M_*$–$\sigma$–$z$ relation that neither $\sigma$ nor $z$ alone can explain. The federated direct-kinematic package comprises three counted supportive branches: the primary SUSPENSE age-based comparison, the object-level $M_*/M_{\rm dyn}$ anomaly resolution, and the sigma-based expansion.
3.10 Strategy for Kinematic Validation
The cleanest direct kinematic test targets the most massive, brightest galaxies at $z > 7$. Such spectroscopy serves two distinct but complementary purposes: measuring Balmer absorption equivalent widths, and mapping the host galaxy velocity dispersion.
1. Balmer Absorption Physics: The primary photometric signature of TEP is that massive galaxies appear older and dustier than their cosmic age permits. This can be tested spectroscopically via Balmer absorption lines (e.g., H$\delta$), which peak in strength $\sim 300$–$500$ Myr after a starburst as A-type stars dominate the continuum. Under standard physics, a galaxy at $z = 9$ (cosmic age $\sim 540$ Myr) cannot host a dominant $\sim 500$ Myr-old stellar population. Under TEP, even a moderately massive halo ($\log M_* \gtrsim 9.5$) at this redshift exceeds $\Gamma_t \approx 3$, the threshold for an effective age of $\sim 1.6$ Gyr — readily allowing for strong Balmer absorption. More massive systems ($\log M_* > 10$) have $\Gamma_t \sim 8$–$22$, making the prediction even stronger. Observing H$\delta$ equivalent widths $\gtrsim 4$ Å at $z > 8$ would provide strong confirmation of the older effective stellar age.
2. IFU Kinematics as a Direct Mass Proxy: As discussed in §4.4.6, the current analysis relies on SED-derived stellar masses to compute $\Gamma_t$, creating a potential circularity. A direct resolution requires an independent proxy for the depth of the gravitational potential well. Spatially resolved kinematics (e.g., from JWST NIRSpec IFU) can map the central velocity dispersion ($\sigma$). Using $\sigma$ rather than $M_*$ to predict $\Gamma_t$—precisely as was done for the local Cepheid calibration and globular cluster pulsars—directly addresses the photometric mass degeneracy.
Falsification Criteria
TEP prediction: $\rho(\Gamma_t, \text{EW}_{H\delta}) > 0.5$, with mean $\Delta$EW $< -1.0$ Å for enhanced-regime galaxies.
Standard physics: $\rho \approx 0$ (no $\Gamma_t$ dependence).
4. Discussion
4.1 The Isochrony Bias Mechanism
The discussion begins from a simple inference. The two primary empirical lines, together with the ancillary resolved-screening branch and the derived dynamical-mass comparison, converge on one physical interpretation: the isochrony axiom fails in massive, unscreened halos at $z > 5$. In this picture, TEP fully resolves the Red Monster star formation efficiency anomaly not by introducing new baryonic physics but by exposing a systematic bias already built into standard stellar-population inference. The mechanism is sequential and concrete. Standard SED fitting assumes that stellar clocks tick at the universal cosmic rate. Under TEP, stars in massive unscreened halos accumulate extra proper time ($\Gamma_t > 1$). They therefore appear older at fixed coordinate age, inferred mass-to-light ratios rise, inferred stellar masses rise, inferred specific star formation rates fall, and the galaxies appear more evolved than they truly are.
The central value $\alpha_0 = 0.58$ of the external Cepheid prior was derived from period-luminosity residuals in local galaxies (Paper 12) and then applied to $z > 5$ galaxies with only the physically motivated redshift scaling $(1+z)^{0.5}$ and no tuning to JWST data. That it fully resolves the anomaly is therefore a non-trivial prediction. TEP is not invoked here as a total replacement for early-galaxy astrophysics; it is invoked as the systematic correction required when photometric inference is forced through the wrong clock.
4.2 Cross-Domain Consistency
The broader force of the TEP case does not rest on JWST alone. It rests on whether one externally calibrated coupling remains consistent across the wider TEP programme. A local Cepheid analysis (Paper 12, TEP-H0) provides the external prior $\alpha_0 = 0.58 \pm 0.16$, which is then carried without re-tuning across domains spanning 13.5 Gyr of cosmic time ($z = 0$ to $z > 10$), 40 orders of magnitude in mass, and 15 orders of magnitude in density (Paper 7). Each domain probes different physics with different instruments and at different epochs; Table 11 summarizes that programme-level structure. The JWST dust-only and joint high-redshift concordance analyses recover values near $\alpha_0 \approx 0.55$, consistent with the external Cepheid prior at $0.2\sigma$. The tighter JWST interval reflects internal agreement among the high-redshift observables; a recovery at $\alpha_0 > 1.5$ or $< 0.1$ would have falsified the broader framework, and the data do not approach either regime.
| Paper | Domain | Key Observable | Effect Size / Significance | $\alpha_0$ Constraint |
|---|---|---|---|---|
| 1 (TEP) | Theory | Two-metric action; synchronization holonomy | Formal derivation; PPN compatible | Free parameter |
| 2–4 (GTE) | GNSS clocks | Spatial correlation $\lambda = 4{,}201 \pm 1{,}967$ km; 7 independent signatures | $p \approx 2 \times 10^{-27}$ ($> 10\sigma$); CMB dipole alignment $5{,}570\times$ | Consistent with $\alpha_0 \sim 0.5$–0.7 |
| 5 (GL) | Gravitational lensing | Phantom mass; $R_{\text{sol}} \propto M^{1/3}$ Earth–galaxy scaling | Screening scaling confirmed; $R^2 = 0.92$ | Consistent |
| 6 (GTE synthesis) | Multi-domain | 7-signature joint probability; raw RINEX validation | 100% detection rate; $t$-statistics up to 112 | $m_\phi \approx (4.3$–5.9$) \times 10^{-14}$ eV/$c^2$ |
| 7 (UCD) | GNSS + SPARC + magnetars | $\rho_c \approx 20$ g/cm$^3$; SPARC slope $0.354 \pm 0.014$ (predicted $1/3$) | 3-source convergence; magnetar $P_{\rm crit}$ 4% match | Screening scale fixed |
| 8 (RBH-1) | Runaway BH wake | Soliton radius $R_{\rm sol} \approx 7.8 \times 10^7$ km; thermal paradox | Geometric consistency; same $\rho_c$, 0 free parameters | Consistent |
| 9 (SLR) | Satellite laser ranging | LAGEOS-1/2 optical confirmation of GNSS correlation structure | Independent optical domain; processing-artifact exclusion | Consistent |
| 10 (EXP) | Precision GR tests | Conformal loophole: GW170817 constrains disformal sector only | 5 structural limitations identified; conformal sector unconstrained | Not constrained by existing tests |
| 11 (COS) | Globular cluster pulsars | 0.13 dex spin-down excess; density slope 0.35 vs 0.82 Newtonian | $p = 1.7 \times 10^{-15}$; $4.0\sigma$ slope tension; binary inversion $p = 0.01$ | Screening threshold $\sigma > 165$ km/s |
| 12 (H0) | Cepheid distance ladder | $\rho(H_0, \sigma) = 0.434$; $\Delta H_0 = 4.63$ km/s/Mpc; $H_0^{\rm TEP} = 68.66 \pm 1.51$ | $p = 0.019$; Planck tension $0.79\sigma$ (from $5\sigma$) | $\alpha_0 = 0.58 \pm 0.16$ (external prior) |
| 13 (This work) | JWST high-$z$ galaxies | 2 primary empirical lines + 1 ancillary spatial indication + 1 derived regime comparison; $N = 4{,}726$; 3 surveys; $z = 4$–$10$ | $z = 24.4\sigma$ (3-survey Fisher combination for L1 alone: GOODS-N, EGS, COSMOS; 3 SED analyses; no clustering correction); dependence-adjusted broader multi-test combination $p = 2.6 \times 10^{-91}$; extreme 10%-$N_{\rm eff}$ stress-test floor $= 3.3\sigma$; all 3 surveys individually $> 5\sigma$ (CEERS $6.7\sigma$, UNCOVER $11.0\sigma$, COSMOS-Web $21.5\sigma$); $t_{\rm eff}$ beats $t_{\rm cosmic}$ per-survey Steiger $Z = 6.4$–$22.6$; 9/9 blind validation; $\Delta$AIC $\approx -4.8$. DJA-based GOODS-S and Balmer analyses are supplementary spectroscopic checks, not part of the primary evidence count. | $\alpha_0 = 0.548 \pm 0.010$ (dust, Pearson $R^2$); $0.548 \pm 0.010$ (joint); both within $0.2\sigma$ of the external prior |
The JWST result therefore probes the framework in its most distant and astrophysically independent regime: the highest redshift ($z > 8$, lookback time $> 13$ Gyr), the largest halo masses ($\log M_h \sim 12$–13), and a data type dominated by population-level photometry rather than by precision timing. Its $0.2\sigma$ agreement with the external Cepheid prior was not guaranteed. A strongly discrepant $\alpha_0$ would have broken the cross-domain consistency that the broader TEP programme requires.
4.2.1 Concordance Tests and Recovery
Further robustness tests validate the directional stability of the signal. Deliberately injecting a negative correlation control yields an immediate penalization ($\Delta\rho = -0.834$), and a deep-dive check of the underlying signal monotonicity across all sub-bins confirms an $88.9\%$ adherence rate, ruling out discontinuous artifacts.
4.3 Alternative Explanations
A credible physical hypothesis must do more than fit the anomaly it targets. It must also explain why the different JWST tensions co-occur and why they organize with redshift and potential depth in the specific way observed. The following subsections therefore compare the leading standard-physics alternatives with the broader TEP evidence package.
4.3.0 Theoretical Implications
The implications of this framework are not confined to any one observable. For example, the age-metallicity decoupling reveals a marked tension in the raw observations ($\rho_{\rm raw} = -0.907$), much of which relaxes once proper-time corrections are applied, restoring the expected ordering of enrichment and age.
4.3.1 Bursty Star Formation
Stochastic bursty star formation can temporarily boost luminosities and alter $M/L$ ratios, potentially mimicking TEP effects. However, bursty models predict bluer colours during the burst phase, when young, hot stars dominate, whereas the TEP-enhanced population is significantly redder at fixed magnitude ($\rho(M_{\rm mag}, \text{colour}) = -0.40$, $p = 2.8 \times 10^{-16}$, $N = 398$). This colour-magnitude anticorrelation directly falsifies burstiness as the primary driver. Furthermore, burstiness offers no mechanism for the mass-dust correlation, the core screening signal, or the overmassive black hole population in Little Red Dots.
4.3.2 Top-Heavy IMF
A top-heavy initial mass function (IMF) would lower the true stellar mass for a given luminosity, partially resolving the efficiency crisis. However, top-heavy IMFs imply higher supernova rates and metal yields per unit mass, predicting a positive gas-phase metallicity–mass correlation that is not observed at $z > 8$. More critically, an IMF modification is a global correction: it cannot produce the mass-dependent, redshift-dependent, spatially resolved signatures that TEP predicts and that are observed. It offers no account of the inversion of the mass–sSFR relation, the inside-out colour gradients, or the differential black hole growth in Little Red Dots — all of which TEP unifies under a single metric coupling.
4.3.3 AGN Feedback Discriminant
Enhanced AGN feedback is a leading alternative to TEP for explaining anomalous high-$z$ galaxy properties. Three observational discriminants are quantified between the two models using Monte Carlo simulations ($N = 500$ galaxies):
- Dust–BH mass correlation: AGN feedback predicts $\rho = -0.38$ (negative — AGN clears dust); TEP predicts $\rho = +0.51$ (positive — halo mass drives both dust and BH growth). The sign difference provides a clean diagnostic.
- Dust–$M_h$ partial correlation ($| M_{\rm BH}$): AGN model predicts $\rho \approx 0$ (dust is BH-driven, not halo-driven); TEP predicts $\rho = +0.57$ (dust is halo-driven via $\Gamma_t$). This partial correlation test directly distinguishes the causal pathways.
- Dust–$\Gamma_t$ correlation: AGN model predicts $\rho = -0.28$ (weak, wrong sign); TEP predicts $\rho = +0.72$ (strong positive). The observed value ($\rho = +0.59$) strongly favors TEP.
These three discriminants represent simulated predictive bounds. They are not yet direct empirical measurements because sufficiently large public broad-line black-hole-mass samples at $z > 6$ are not yet available.
4.3.4 Statistical Model Comparison (AIC, Partial Correlations, and Nested Evidence)
To rigorously distinguish between TEP and standard mass-dependent scaling, models are compared using the Akaike Information Criterion (AIC) and partial correlations on the full UNCOVER dataset ($N=5{,}644$).
- Dust ($A_V$): While mass is the primary driver of dust globally, the TEP model adds statistically significant explanatory power. The partial correlation $\rho(\text{Dust}, \Gamma_t | M_*) = +0.17$ ($p < 10^{-38}$) indicates that $\Gamma_t$ captures variation in dust attenuation that stellar mass alone cannot explain.
- The Z-Dependent Test: A decomposition of $\Gamma_t$ into mass-dependent and redshift-dependent components ($\Gamma_t \propto M^{1/3} \sqrt{1+z}$) reveals that the $z$-dependent term is statistically significant ($p < 0.01$). This indicates that TEP is not merely a mass proxy; the unique $\sqrt{1+z}$ scaling predicted by the scalar field is present in the data and distinct from mass scaling alone.
- sSFR: The TEP model captures the inversion of the mass-sSFR relation at high redshift, which standard mass-only models (assuming constant downsizing) fail to reproduce without ad-hoc evolution terms.
This quantitative evidence suggests that $\Gamma_t$ captures physical information orthogonal to stellar mass—specifically the redshift-dependent screening predicted by the scalar field coupling. A fully nested Bayesian comparison now sharpens that conclusion. In a standardized four-observable joint test at $z \geq 7$ ($N = 504$; dust, $\log {\rm sSFR}$, $\chi^2$, metallicity), TEP is decisively preferred over Standard Physics ($\ln {\rm BF} = +13.2$) and Bursty SF ($+6.0$), and very strongly preferred over Varying IMF ($+3.7$). The only raw joint branch that outperforms TEP is an unconstrained AGN sigmoid keyed almost entirely to stellar mass; in the present sample $\log \Gamma_t$ and $\log M_*$ are highly collinear, so this branch functions as a mass-threshold surrogate rather than as an independently anchored AGN prediction. The more diagnostic comparison is therefore the residual-space test, where both observables and competing predictors are residualized against linear mass+$z$ trends. In that controlled comparison, TEP decisively beats both the residual null ($\ln {\rm BF} = +99.1$) and a constrained AGN contamination model ($+55.7$). The nested evidence therefore points in the same direction as the partial-correlation and non-linear-threshold results: the part of the signal that survives explicit mass+$z$ control is much better organised by TEP than by the tested standard-physics surrogates.
4.3.5 Comprehensive Model Comparison
To rigorously position TEP against competing explanations for the high-$z$ anomalies, a systematic comparison was performed across five candidate mechanisms. Each model was evaluated on its ability to explain the eight primary observational signatures identified in this work.
A full numerical mapping of the screening transition profile confirms this consistency across environmental extremes. The predicted scalar field suppression tracks closely across decades of density spanning the cosmic mean at $z=8$ ($\log\rho = -28.0$), galaxy halos ($\log\rho = -25.0$), molecular clouds ($\log\rho = -20.0$), down to stellar atmospheres ($\log\rho = -7.0$), maintaining the expected scaling exponents throughout the hierarchical transition.
| Observable | TEP | Enhanced AGN Feedback | Top-Heavy IMF | Dust/Attenuation Degeneracy | Bursty SFH |
|---|---|---|---|---|---|
| SFE $> 0.5$ (Red Monsters) | ✓ Predicted (100%) | ✗ Increases SFE | ✓ Partial | ✗ Wrong direction | ✗ Temporary only |
| Dust-$\Gamma_t$ at $z > 8$ ($\rho = +0.62$) | ✓ Predicted | ✗ No mass dependence | ✗ No dust mechanism | ✗ Circular | ✗ No mass scaling |
| Mass-sSFR Inversion at $z > 7$ | ✓ Predicted | ✗ Wrong sign | ✗ No prediction | ✗ No prediction | ✗ Stochastic |
| Overmassive BHs (LRDs) | ✓ Differential shear | ✗ Requires fine-tuning | ✗ No BH mechanism | ✗ No BH mechanism | ✗ No BH mechanism |
| Core Screening (Blue Cores) | ✓ Predicted | ✗ No spatial gradient | ✗ No spatial gradient | ✗ No spatial gradient | ✗ No spatial gradient |
| Environmental Screening (supplementary) | Mixed: full-sample split supports screening, but targeted $z > 8$ contrast is null ($\Delta\rho = 0.003$, $p = 0.97$) | Partial | ✗ No prediction | ✗ No prediction | ✗ No prediction |
| Colour-Gradient Steiger ($t_{\rm eff}$ vs $M_*$) | Partial: raw gradient–$\Gamma_t$ correlation present, but no significant Steiger discrimination over $M_*$ | ✗ No prediction | ✗ No prediction | ✗ No prediction | ✗ No prediction |
| Free Parameters | 0 tuned to JWST; 1 external prior ($\alpha_0$) + 3 fixed reference values | 2 | 2 | 2 | 3 |
| Qualitative coverage (2 primary empirical lines + 1 ancillary spatial indication + 1 derived comparison) | 2/2 + ancillary + derived | 1/2 | 1/2 | 0/2 | 0/2 |
TEP is the only model that simultaneously captures the redshift-dependent mass-dust inversion, the dynamical-mass correction, and the LRD differential-growth mechanism within one coupling framework. A qualitative scoring across the live JWST core evidence package gives TEP 2/2 on the primary empirical lines, with the ancillary L2 spatial indication directionally consistent but not counted in the primary total and the L4 dynamical-mass branch retained as a derived regime comparison. Environmental screening and the colour-gradient Steiger comparison are retained as supplementary branches rather than as decisive discriminants. The new nested-sampling comparison is consistent with this hierarchy: the raw multi-observable test favours TEP over Standard Physics, Bursty SF, and Varying IMF, while the residual-space anti-circularity test decisively favours TEP over both the residual null and a constrained AGN branch. TEP therefore offers a unified account of multiple published anomalies: the dynamical-mass excess, much of the Red Monster efficiency crisis, the benchmark stellar-mass-function excess, and a substantial fraction of the cosmic SFRD tension, all under the same externally anchored Cepheid calibration.
Methodological Note: OLS AIC vs. Step-Function AIC
OLS linear regression on continuous $A_V$ is the wrong functional form for $t_{\rm eff}$: the AGB onset creates a step-function relationship, not a linear one. When the correct functional form is used, the picture reverses. A non-linear AIC comparison ($N = 283$, UNCOVER $z > 8$) tests a step-function model at the AGB threshold ($t_{\rm eff} > 0.3$ Gyr) against a mass-matched step-function model at the same fraction above threshold:
- Step-function $t_{\rm eff}$ (AGB threshold): $\Delta$AIC $= 0$ (best model)
- Step-function $M_*$ (mass-matched quantile): $\Delta$AIC $\approx +4.8$ (worse)
- Linear $M_*$: $\Delta$AIC $= +29$
- Polynomial $M_*, z, M_*\times z$: $\Delta$AIC $= +30$
Both step-function models have identical parameter counts ($k = 2$), so the $\Delta$AIC $\approx -4.8$ advantage for $t_{\rm eff}$ is a direct measure of which threshold better organizes the dust distribution. The OLS AIC rankings (which favored $M_*$) reflect a methodological mismatch—linear regression cannot detect a step-function signal—not a physical preference for mass over effective time. The Steiger Z-test, partial correlations, and non-linear AIC all consistently favor $t_{\rm eff}$.
4.3.6 The Link to Hubble Tension
The same external calibration that explains high-$z$ galaxy anomalies naturally links TEP to the local Hubble tension. Using velocity dispersions ($\sigma$) for 29 SH0ES Cepheid hosts, a significant host-environment trend emerges. Applying the host-dispersion correction $\Delta\mu = \alpha_0 \log_{10}(\sigma_{\rm host}/\sigma_{\rm ref})$ with $\sigma_{\rm ref} = 75.25$ km/s (the SH0ES-weighted anchor dispersion) gives the key quantitative results below.
- Environmental bias detected: Spearman $\rho(H_0, \sigma) = 0.434$, $p = 0.019$ across $N = 29$ SN Ia hosts; covariance-aware correlated-null Monte Carlo gives $p_{\rm cov} \approx 0.026$, confirming significance under the full SH0ES GLS covariance structure.
- Stratified offset: High-$\sigma$ hosts ($\sigma > 90$ km/s, $N = 14$) yield $H_0 = 72.45 \pm 2.32$ km/s/Mpc; low-$\sigma$ hosts ($N = 15$) yield $H_0 = 67.82 \pm 1.62$ km/s/Mpc — consistent with Planck ($67.4 \pm 0.5$ km/s/Mpc) within $1\sigma$. The $\Delta H_0 = 4.63$ km/s/Mpc offset accounts for the majority of the Hubble tension.
- Corrected $H_0$: Applying the TEP correction with the central external-prior value $\alpha_0 = 0.58$ yields $H_0^{\rm TEP} = 68.66 \pm 1.51$ km/s/Mpc, corresponding to a Planck tension of $0.79\sigma$ — down from $5\sigma$ uncorrected. Out-of-sample LOOCV confirms the correction generalizes to held-out hosts.
The same externally calibrated $\alpha_0$, applied without modification to $z > 5$ galaxies in this work, successfully predicts the Red Monster SFE anomaly (fully resolved), the $z > 8$ dust–$\Gamma_t$ correlation ($\rho = +0.62$), the SN Ia mass step (0.050 vs. 0.06 mag observed), and the correct sign of the TRGB-Cepheid offset. The corrected JWST recovery gives $\alpha_0 = 0.548 \pm 0.010$ (dust-only) and $\alpha_0 = 0.548 \pm 0.010$ (joint), both consistent with the external Cepheid prior $0.58 \pm 0.16$ at $0.2\sigma$. The tighter JWST interval reflects internal multi-observable concordance rather than a redefinition of the local prior. Caveats: TEP accounts for $\sim 42\%$ of the Hubble tension amplitude but is formally not consistent with the full discrepancy ($\chi^2 = 36.8$, $p < 10^{-8}$; §C.3.3). Consistent with this interpretation, recent TRGB-based measurements (Freedman et al. 2024) also lie closer to Planck.
4.3.7 Multi-Model Bayesian Comparison and Out-of-Box Testing
The combined evidence synthesis includes a Sellke-calibrated null-versus-TEP comparison across dust, sSFR, black hole, and dynamical-mass domains. In that specific null comparison, the combined evidence is substantial (Sellke-calibrated Bayes factor $\sim 6.6 \times 10^{129}$), but this should not be conflated with the separate multi-model nested-sampling exercise. The latter now has been performed directly on the $z \geq 7$ JWST sample. Its raw multi-observable comparison is supportive rather than absolute: TEP beats Standard Physics, Bursty SF, and Varying IMF, while an unconstrained AGN sigmoid can exploit the strong $\Gamma_t$–mass collinearity and remain competitive. The more informative result is the residual-space control, where TEP decisively outperforms both the residual null and a constrained AGN branch after linear mass+$z$ structure is removed from the observables and competing predictors. The diagnostic model-comparison results therefore now comprise four mutually consistent pieces: the temporal-ordering tests, the cross-survey generalization gap, the out-of-box residual signal after monotonic mass/redshift control, and the residual-space nested Bayesian comparison.
4.4 Caveats and Limitations
4.4.1 Sample Sizes and Statistical Power
The Red Monsters result involves only three objects and is therefore not treated as a standalone power-based detection claim. Its value is as an illustrative zero-parameter case study under an externally anchored prior. The primary statistical evidence comes from the population-level survey analyses, while the enlarged JADES DR4 spectroscopic sample improves the separate spectroscopic consistency check.
The JADES DR4 spectroscopic catalog (D'Eugenio et al. 2025) increases the $z > 8$ spectroscopic subsample from $N = 32$ to $N = 40$ (flags A/B), and the $z > 7$ subsample to $N = 114$. At $N = 40$, the minimum detectable correlation at 80% power is $|\rho| > 0.43$; the observed $|\rho| = 0.997$ far exceeds this threshold. The prior underpowered-sample limitation is therefore resolved for the spectroscopic consistency check, though the UV-based mass estimates ($\pm 0.4$–$0.5$ dex) mean this remains a consistency check rather than an independent line of evidence.
4.4.2 The z > 7 Inversion and the z = 9–10 Reversal
The inversion of the mass-sSFR correlation at $z > 7$ is statistically significant: $\Delta\rho = +0.25$ [+0.14, +0.35] between low-$z$ and high-$z$ samples. The main ambiguity now sits not in the existence of the inversion, but in how far the auxiliary blank-field companion can be pushed into the ultrahigh-$z$ tail. In COSMOS2025, the matched $z = 8$–9 bin remains directionally supportive after debiased-mass control ($\rho_{\rm debiased} = +0.043$; quality-weighted $\rho_{\rm debiased} = +0.070$), but the combined $z = 9$–13 bin is negative ($\rho_{\rm debiased} = -0.212$) and remains negative under a reference-mass reweighting sensitivity test ($\rho_{\rm reweighted} = -0.157$).
Decomposing that ultrahigh-$z$ branch shows that the negative sign is concentrated in $z = 9$–10 ($\rho_{\rm debiased} = -0.145$), whereas $z = 10$–13 is statistically weak and near null after debiasing ($\rho_{\rm debiased} = +0.008$). The appropriate interpretation is therefore selection-sensitive rather than a clean contradiction or a clean confirmation: the matched $z = 8$–9 blank-field branch supports L3, while the $z > 9$ tail remains too sensitive to selection structure and uncertainty weighting to bear primary evidential weight. It remains listed as a limitation (§4.12, item 6) pending deeper and more complete $z > 9$ samples.
4.4.3 Screening
The resolved-screening branch is stronger than the gradient-only summary suggested. In the preferred JADES DR5 direct-mass morphology sample, four structural proxies remain supportive after mass+$z$ control: $r_{\rm half,F277W/F444W}$ partial $\rho = -0.256$ ($p = 3.7 \times 10^{-7}$), Gini partial $\rho = +0.361$ ($p = 2.9 \times 10^{-13}$), and $\sigma_\star$ partial $\rho = +0.624$ ($p = 7.0 \times 10^{-43}$) for $N = 384$. The resolved colour-gradient subset still finds the raw mass-gradient trend $\rho(M_*, \nabla_{\rm color}) = -0.166$ ($p = 5.7 \times 10^{-3}$), while the direct $\Gamma_t$-gradient association is weaker at $\rho = -0.105$ ($p = 8.1 \times 10^{-2}$). Using the live direct-object bootstrap debias exponent $\beta = 0.521$, the residual $\Gamma_t$ signal remains non-significant under both observed-mass+$z$ and debiased-mass+$z$ control (partial $\rho = +0.011$, $p = 0.85$; partial $\rho = -0.015$, $p = 0.80$). The predictor-comparison extension likewise remains non-significant ($Z = 1.92$, $p = 0.055$), yet the debiased q33/q67 sign split is directionally supportive, with negative-gradient fraction $0.581$ versus $0.457$ (Fisher $p = 0.061$; $\Delta = -0.063$). The honest conclusion is therefore that L2 remains ancillary, but it is anchored by strong direct-mass morphology support rather than by a weak gradient-only branch.
4.4.3.1 Environmental Screening: Supplementary Mixed Branch
The environmental-screening branch no longer supports a simple all-regime headline. Two live results point in different directions depending on how the test is posed. In the full $z = 4$–10 sample, the field-vs-overdense split is directionally consistent with screening: $\rho_{\rm field} = -0.006$, $\rho_{\rm dense} = -0.198$, giving $\Delta\rho = +0.192$ ($Z = 4.68$, $p = 2.9 \times 10^{-6}$). However, the targeted $z > 8$ contrast — the regime most relevant to the primary dust anomaly — is null: $\rho_{\rm field} = 0.574$, $\rho_{\rm dense} = 0.571$, $\Delta\rho = 0.003$ ($Z = 0.04$, $p = 0.97$). The mass-matched quintile deltas in this high-redshift slice are likewise mixed rather than uniformly positive.
A separate environment-density residual test reinforces the conclusion that environment matters for dust, but not in a way that cleanly maps one-to-one onto the TEP field. At fixed mass and fixed redshift, local density retains partial $\rho(\text{density}, \text{dust}\,|\,M_*, z) = +0.069$ ($p = 8.5 \times 10^{-4}$), and the signal persists after additionally controlling for $\Gamma_t$ ($\rho = +0.070$, $p = 7.2 \times 10^{-4}$). This means environment carries dust-relevant information beyond the current $\Gamma_t$ mapping, consistent with mergers, stripping, and group pre-processing contributing alongside any TEP screening effect.
A separate DJA/JADES protocluster-switch companion rebuilt on real DJA root field labels and within-field density quartiles is likewise mixed/null: the primary dense-minus-field $\beta$-residual contrast is $+0.086$ with 95% CI $[-0.019,+0.188]$. The honest interpretation is therefore mixed. Full-sample environment splitting is suggestive and directionally compatible with screening, but neither the direct $z > 8$ contrast nor the rebuilt spectroscopic-age companion provides a new independent positive line. Environmental screening is retained as a supplementary consistency branch rather than as part of the primary evidence count.
(See Figure 6 in §3.5 for an illustration of the two screening mechanisms.)
4.4.4 Model Dependence and M/L Scaling
The TEP model assumes that inferred stellar mass scales with the assumed age of the stellar population as $M/L \propto t^n$. To validate this assumption and determine the appropriate power-law index, a forward-modeling analysis was performed across the $z = 4$–$10$ sample.
Standard stellar population synthesis (SPS) models (e.g., Bruzual & Charlot 2003) predict $n \approx 0.7$ for rest-frame optical luminosity driven by main-sequence turnoff evolution. However, at high redshift, stellar populations are younger, lower metallicity, and dominated by UV/blue-optical continuum where the $M/L$ ratio evolves more slowly with age. The forward-modeling analysis reveals a redshift-dependent preference:
- $z = 4$–$6$: Best-fit $n = 0.9$ (consistent with older, standard SSP models)
- $z = 6$–$8$: Best-fit $n = 0.5$
- $z > 8$: Best-fit $n = 0.5$
The global best-fit is $n \approx 0.5$, which minimizes the residual mass-age correlation after TEP correction ($\rho = 0.002$, $p = 0.91$). This lower exponent ($n = 0.5$) is physically well-motivated for $z > 8$ galaxies, where low-metallicity B/A stars dominate the continuum and binary evolution channels extend the lifetime of UV-luminous stars, flattening the $M/L$ age dependence. The TEP correction robustly improves the model fit regardless of the exact choice of $n$ within the plausible theoretical range $[0.5, 0.7]$, but $n = 0.5$ is adopted as the primary calibration for the highest-redshift tests.
4.4.5 Compatibility with Precision Tests of GR
TEP satisfies all current precision tests through chameleon screening: solar system (thin-shell $\Delta R/R \lesssim 10^{-6}$ satisfies Cassini bounds), gravitational waves ($c_g = c_\gamma$ in the conformal limit), binary pulsars (fully screened at $\rho \sim 10^{14}$ g/cm³), and cosmological bounds (BBN satisfied by $\sim 10^{12}\times$ margin; $\sigma_8$ preserved by Yukawa suppression with $\beta_{\rm eff} \approx 0.005$ on $R_8$ scales from the scale-dependent growth calculation in §2.3.2.7 and Appendix A.1.8.6). The JWST core screening gradient ($\rho = -0.18$) provides an independent consistency check. Full details are in Appendix C.1.
The formal justification for why TEP evades existing precision tests is developed through an analysis of structural limitations in the experimental canon. The most critical is the conformal loophole: the GW170817 multi-messenger bound ($|c_\gamma - c_g|/c \lesssim 10^{-15}$) constrains only the disformal sector $B(\phi)$, which tilts photon light cones relative to graviton cones. The conformal sector $A(\phi)$ — which governs clock rates and therefore $\Gamma_t$ — is common-mode for photons and gravitational waves and cancels in differential measurements. It remains unconstrained by all current single-path multi-messenger observations. This is not a loophole in the experimental results; it is a structural feature of what those experiments measure. Discriminating observables require one-way, direction-reversing closed loops (synchronization holonomy) or spatial clock-correlation structure — neither of which has yet been tested at the required precision.
4.4.6 Breaking Mass Circularity
A central concern throughout this manuscript is circularity: $\Gamma_t$ is inferred from stellar mass, while TEP itself corrects stellar mass. The aim of this section is not to dismiss that concern, but to show why the data do not behave like a simple mass proxy and why the strongest remaining route forward is kinematic. The case against the pure-proxy interpretation is therefore presented in three levels of increasing stringency.
Level 1 — What a mass proxy predicts vs. what is observed. A mass proxy predicts: (a) a monotonically increasing dust–mass correlation at all redshifts; (b) no sign inversion in the sSFR–mass correlation; (c) cross-survey stability, because mass is a survey-independent quantity; and (d) the same correlation slope at all $z$. The data show the opposite on every count: the dust–mass correlation is absent at $z = 4$–$7$ and emerges at $z > 8$; the sSFR–mass correlation inverts sign at $z > 7$ (L3); a polynomial mass proxy fails cross-survey generalisation ($R^2 = -6.4$, §3.7.5); and the slope strengthens with a $(1+z)^{0.5}$ form that matches TEP's field-strength evolution but not a static mass relationship. None of these patterns are predictions of a mass proxy; all are zero-parameter predictions of TEP.
Level 2 — The self-defeating mass-bias argument. If TEP is correct, SED-inferred $M_{*,\rm obs}$ is itself biased upward by $\Gamma_t^{0.7}$ (§4.4.6.3). Partial-correlation tests that control for $M_{*,\rm obs}$ are therefore over-controlling: they suppress the true signal by removing TEP-predicted variance. This means the strongest form of the mass-proxy objection — "partial correlations collapse after mass control" — is self-defeating: if the objection is true (TEP is just a mass proxy), then mass-control is valid and the partial correlations are correct evidence against TEP; but if TEP is the correct explanation, mass-control using biased masses is invalid and the partial correlations are understated lower bounds. The claim cannot simultaneously be true in both directions.
Level 3 — A kinematically independent test. A regime-level kinematic comparison asks whether the TEP correction is large enough to remove the published stellar-to-dynamical mass anomaly. In the RUBIES-like $z \sim 4.5$, $\log M_* > 10.5$ regime, the published mean excess is 0.15 dex while the TEP prediction yields a 0.400 dex reduction, equivalent to a representative ratio shift $1.41 \rightarrow 0.56$. This is sufficient to render the anomaly physically plausible and is not readily explained as a mass-proxy artefact, because the dynamical-mass denominator is kinematic rather than photometric. The live L4 branch is now also accompanied by a screened DJA pilot line-width test: in the quality-screened subset of 15 fitted spectra, the Balmer dust proxy subset ($N = 7$) gives partial $\rho(\log {\rm Balmer}, \sigma \mid M_*, z) = +0.887$ ($p = 0.045$), whereas the competing partial $\rho(\log {\rm Balmer}, M_* \mid \sigma, z) = -0.382$ ($p = 0.53$) is weak and of the wrong sign for a mass-dominated interpretation. That pilot remains conservatively classified as supportive rather than decisive because the public DJA spectra presently require grating-fallback instrumental resolution instead of per-pixel $R$. The overall L4 branch is therefore retained as a real-data-derived regime comparison rather than as a primary empirical line counted alongside L1 and L3.
Taken together, these tests span four observable classes: photometric dust correlations, resolved spatial structure, sSFR behaviour, and regime-level kinematics. They are not all counted equally. L1 and L3 remain the primary empirical lines; L2 remains ancillary; L4 remains a derived regime comparison. The AGB threshold and cross-survey generalization are counted as robustness checks on L1 because they reuse the same dust observable, and the colour-gradient Steiger test remains provisional.
The two primary empirical lines plus one ancillary spatial indication and one derived regime comparison:
- L1. Dust–$\Gamma_t$ correlation and AGB threshold: $\rho = +0.62$ ($N = 1{,}283$, three-survey meta-analysis). After controlling for $M_*$, $z$, and $M_* \times z$, $t_{\rm eff}$ retains $\rho = +0.26$ ($p = 7.4 \times 10^{-6}$). Conversely, $M_*$ carries zero residual after $t_{\rm eff}$ control ($\rho = -0.006$, $p = 0.92$). The AGB-onset step function at $t_{\rm eff} \gtrsim 0.3$ Gyr yields odds ratio 42.8 ($p \approx 10^{-40}$; $N = 2{,}971$) and beats a mass-matched threshold by $\Delta$AIC $\approx -4.8$, confirming the specific physical mechanism (§3.7.3–3.7.5).
- L2. Inside-out core screening (ancillary spatial indication): JADES color-gradient analyses find $\rho(M_*, \nabla_{\rm Color}) = -0.166$ ($p = 5.7 \times 10^{-3}$, $N = 277$): bluer cores in more massive galaxies, opposite to standard inside-out growth. Data provenance note: both the raw gradient correlation and the Steiger comparison are now measured from real JADES resolved photometry, but the Steiger test comparing $t_{\rm eff}$ vs $M_*$ remains non-significant ($Z = 1.92$, $p = 0.055$), and the residual $\Gamma_t$ signal remains non-significant under both observed-mass+$z$ and debiased-mass+$z$ control (partial $\rho = +0.011$, $p = 0.85$; partial $\rho = -0.015$, $p = 0.80$). The strongest ancillary support instead comes from the preferred direct-mass DR5 morphology sample, where two half-light-radius proxies, Gini, and $\sigma_\star$ remain supportive after mass+$z$ control, together with the directionally supportive debiased q33/q67 sign split. Different survey (JADES), different observable (resolved photometry), different physical mechanism (screening) from L1 (§3.5.1).
- L3. Mass–sSFR inversion: Correlation inverts from $\rho = -0.16$ at $z = 4$–6 to $\rho = +0.09$ at $z > 7$ ($\Delta\rho = +0.25$, 95% CI $[+0.14, +0.35]$). Independence from L1 verified: partial $\rho(\Gamma_t, {\rm sSFR}|{\rm dust}) = -0.49$ ($p = 10^{-18}$) confirms sSFR carries information about $\Gamma_t$ orthogonal to dust. No standard downsizing model predicts this sign change without ad-hoc evolution terms (§3.3).
- L4. Dynamical mass comparison (derived regime comparison): The regime-level kinematic comparison targets the RUBIES-like $z \sim 4.5$, $\log M_* > 10.5$ regime, where the published mean excess of photometric over dynamical mass is 0.15 dex. The TEP correction independently predicts a mean photometric reduction of 0.400 dex in this exact regime. In ratio-equivalent terms, this corresponds to a representative shift from $1.41 \rightarrow 0.56$, sufficient to render the massive-galaxy anomaly physically plausible. A supplementary five-object direct literature ingestion of massive quiescent galaxies (Esdaile et al. 2021; Tanaka et al. 2019) at $z = 3.2$–$4.0$, including one conservative upper-limit row, yields a mean observed excess of $0.168$ dex, a mean corrected excess of $-0.075$ dex on the exact-mass subset, conservative lower-bound excess metrics for the upper-limit row, and a provisional object-level bootstrap $\beta \approx 0.52$ that is now propagated downstream. At the same time, the anomalous exact-object subset is improved rather than completely eliminated: 2/3 anomalous exact objects are resolved after correction, and the anomalous-only mean reduction falls just short of the anomalous-only mean excess.
This branch materially narrows the photometric mass-proxy objection. Because the dynamical-mass denominator ($M_{\rm dyn}$) relies strictly on kinematic motion rather than SED-fitting, it is immune to photometric age-bias. The agreement between TEP's predicted age-correction direction and the observed kinematic mass ratios is more consistent with a real dynamical coupling than with a purely photometric-scaling interpretation. A screened DJA pilot spectroscopic-width companion also strongly points in this direction: after quality cuts, the Balmer dust proxy tracks the kinematic line width ($\sigma$) more strongly than photometric stellar mass after mass+$z$ control ($\rho_{\rm partial}=+0.887$, $p=0.045$, $N=7$). Though classified as a derived regime comparison rather than a primary empirical line, this combination of kinematic mass and linewidth is presently the strongest available route toward closing the mass-circularity loop, rather than a decisive closure already in hand.
Robustness checks on L1 — not additional independent lines: Steiger Z-tests confirm $t_{\rm eff}$ outperforms $M_*$ across the full sample ($Z = 17.8$, $p = 1.3 \times 10^{-70}$) and per-survey ($Z = 5.3$–$16.8$, all $p < 10^{-7}$; combined $Z = 10.4$, $p = 1.8 \times 10^{-25}$). At $z = 4$–7 ($N = 1{,}811$), by contrast, $t_{\rm eff}$ is a worse predictor than $M_*$ (Steiger $Z = -2.54$, $p = 0.011$), which is consistent with the model’s weak-coupling expectation below the activation redshift.
$\alpha_0$ calibration note: the external Cepheid prior remains $\alpha_0 = 0.58 \pm 0.16$ from $N = 29$ SH0ES Cepheid hosts. The JWST dust-only and joint concordance recoveries both lie near $\alpha_0 \approx 0.55$ and are consistent with that prior, but because the multi-observable fit includes age-ratio and metallicity branches that are mass-proxy-adjacent, these internal values are interpreted as concordance checks rather than as sharper replacement calibrations. Cross-domain agreement is therefore supportive, while the Cepheid prior remains the primary external anchor.
Technical note: An earlier joint-recovery uncertainty collapsed because of a biased RSS artefact with a nearly flat objective. Using the corrected Pearson $R^2$ estimator restores the intended interpretation: the JWST recovery is a self-consistency test of the TEP functional form across high-redshift observables. The redshift emergence remains intact: the mass–dust correlation evolves from $\rho \approx 0$ at $z = 4$–6 to $\rho = +0.72$ at $z = 9$–10, fixed-mass bins retain the signal at both low and high mass, and the non-linear AGB threshold still beats the mass-matched threshold by $\Delta\text{AIC} \approx -4.8$.
Additionally confirmed but not counted as independent lines: Environmental screening remains a supplementary mixed branch: the full-sample split is suggestive, but the targeted $z > 8$ contrast is null ($\Delta\rho = 0.003$, $p = 0.97$; §4.4.3), and the rebuilt DJA/JADES protocluster-switch companion is likewise mixed/null. Colour-gradient Steiger shows a real-data ancillary correlation but no significant Steiger discrimination over $M_*$ ($Z = 1.92$, $p = 0.055$) and no significant residual $\Gamma_t$ partial under observed-mass or debiased-mass control, and is therefore not counted.
Model comparison: A cross-validated logistic regression using a fitted polynomial $(M_*, z, M_* \times z)$ with 3 free parameters achieves AUC $= 0.851$ for dust classification at $z > 8$, compared to $0.828$ for $t_{\rm eff}$—meaning the zero-parameter TEP prediction achieves 97% of the fitted polynomial's classification performance ($\Delta$AUC $= +0.023$, bootstrap 95% CI $[+0.016, +0.030]$). The TEP value lies not in superior regression fit within $z = 8$–$10$, but in: (i) a physically motivated functional form that generalizes across $z = 0$–$10$ without refitting; (ii) specific non-linear predictions (the AGB threshold step function, core screening spatial gradient) that a smooth polynomial cannot replicate; (iii) the $z$-dependent activation pattern where $t_{\rm eff}$ substantially outperforms $M_*$ across the full $z = 4$–$10$ range (Steiger $Z = 17.8$); (iv) per-survey replication of the temporal transformation ($t_{\rm eff}$ vs. $t_{\rm cosmic}$ Steiger $Z = 5.3$–$16.8$, all $p < 10^{-7}$); and (v) cross-survey generalization: a leave-one-survey-out test shows the fitted polynomial's $R^2$ collapses from $+0.47$–$0.56$ (within-survey) to $-1.4$ to $-6.4$ (cross-survey), while $t_{\rm eff}$ maintains stable $\rho = 0.60$–$0.80$ across all three surveys with no training (§3.7.5). The polynomial absorbs survey-specific SED systematics; $t_{\rm eff}$ does not. The one leave-one-out loss for $t_{\rm eff}$ is COSMOS-Web ($\Delta\rho = -0.0014$, negligible).
The $z = 6$–$7$ bin shows a marginally negative correlation ($\rho = -0.12$, $p = 0.04$), explained by standard dust physics: supernova-driven destruction outpaces AGB production during this epoch ($\rho_{\rm sSFR,dust} = -0.34$, the most negative value among the bins). TEP coexists with this standard physics effect.
4.4.6.1 Direct Mass-Proxy Breaking Tests
Three additional tests directly probe whether $\Gamma_t$ encodes information beyond a mass-redshift proxy:
- Environment-density residual test: At fixed mass AND fixed redshift, local environment density (5th-nearest-neighbor, $z$-windowed) predicts dust attenuation: partial $\rho(\text{density}, \text{dust}\,|\,M_*, z) = +0.069$ ($p = 8.5 \times 10^{-4}$, $N = 2{,}315$). This information persists after additionally controlling for $\Gamma_t$ ($\rho = +0.070$, $p = 7.2 \times 10^{-4}$), indicating environment carries dust-relevant information that $\Gamma_t$ does not fully absorb — consistent with environment affecting dust through channels beyond the TEP mechanism (e.g., ram-pressure stripping, mergers). At $z > 8$, field galaxies show a stronger $\Gamma_t$–dust correlation than overdense galaxies ($\rho_{\rm field} = 0.66$ vs. $\rho_{\rm dense} = 0.55$, $\Delta\rho = +0.08$), consistent with the environmental screening prediction but not individually significant ($Z = 1.13$, $p = 0.26$, limited by $N = 283$).
- Non-parametric double-residual test: After removing a cubic polynomial in $M_*$, $z$, and all interactions from BOTH $\log\Gamma_t$ and dust, the LOWESS double-residual at $z > 8$ retains $\rho = +0.24$ ($p = 4.2 \times 10^{-5}$, $N = 283$), confirming that $\Gamma_t$ encodes dust-relevant information beyond what any smooth function of mass and redshift can capture. The partial Spearman $\rho = +0.26$ ($p = 8.1 \times 10^{-6}$) after mass+$z$ control corroborates this. Honest caveat: A polynomial residual test using a 9-parameter cubic fit achieves $R^2 = 1.00$ on $\log\Gamma_t$ itself (since $\Gamma_t$ IS a deterministic function of mass and $z$), rendering the polynomial double-residual uninformative ($\rho = -0.006$, $p = 0.92$). The LOWESS and rank-based methods avoid this overfitting issue because they do not closely explain the non-linear $\Gamma_t$ form. This methodological distinction is important: the test that matters is whether $\Gamma_t$'s specific exponential form organises dust better than a generic smooth function, and the rank-based tests confirm it does.
- Shuffled-mass null: Within narrow $z$-bins ($\Delta z = 0.5$), shuffling stellar masses disrupts the $\Gamma_t$–dust correlation at $z > 8$: observed $\rho = 0.59$, shuffled mean $\rho = 0.002 \pm 0.064$ ($z$-score $= 9.2$, $p < 5 \times 10^{-4}$). This confirms mass ordering within $z$-bins is essential — as expected for any mass-dependent model. The test does not distinguish TEP from other mass-dependent mechanisms; it rules out a purely $z$-driven artefact.
4.4.6.2 Adversarial Machine Learning Attack
A stringent mass-proxy test gives a gradient-boosted regressor (GBR; 200 trees, depth 4) every available feature — $M_*$, $z$, SFR, sSFR, metallicity, age ratio, plus 6 polynomial interaction terms — and lets it learn any function to predict dust. Then $\Gamma_t$ is added. If $\Gamma_t$ provides measurable "lift," it encodes information no function of standard features can replicate.
| Feature set | $N_{\rm feat}$ | GBR $R^2$ | GBR $\rho$ |
|---|---|---|---|
| $M_*$, $z$ polynomial | 8 | $0.235 \pm 0.126$ | $0.358$ |
| + extras (SFR, sSFR, met, mwa) | 12 | $0.563 \pm 0.100$ | $0.686$ |
| + $\Gamma_t$ only | 10 | $0.228 \pm 0.115$ | $0.347$ |
| + extras + $\Gamma_t$ | 14 | $0.557 \pm 0.091$ | $0.678$ |
| $\Gamma_t$ alone | 2 | $-0.013$ | $0.191$ |
| $M_*$ alone | 1 | $-0.148$ | $0.008$ |
| $z$ alone | 1 | $-0.050$ | $0.308$ |
Three honest findings from the adversarial attack:
- $\Gamma_t$ adds zero ML lift ($\Delta R^2 = -0.006$). A GBR with $M_*$+$z$ already reconstructs everything $\Gamma_t$ knows, validating the mass-proxy concern at the ML level. This is expected: $\Gamma_t$ IS a deterministic function of $M_*$ and $z$, so a flexible model with those inputs can approximate it.
- Standard astrophysics dominates. Adding SFR, sSFR, metallicity, and age ratio jumps $R^2$ from 0.24 to 0.56 — these carry far more dust information than $\Gamma_t$.
- Cross-survey: all ML models generalise poorly ($R^2 = -2.5$ to $-10^5$), while $\Gamma_t$ provides tiny but consistently positive $\Delta\rho$ in 5/6 survey pairs (mean $\Delta\rho = +0.014$). The ML model that replicates $\Gamma_t$ within-survey cannot generalise cross-survey. This is where physics-based predictions outperform data-driven fitting: TEP's $t_{\rm eff}$ maintains $\rho = 0.60$–$0.80$ cross-survey with no training (§3.7.5).
Information-theoretic resolution (KSG conditional mutual information): Within mass-$z$ cells (5×5 quantile grid), the binned conditional mutual information $I(\Gamma_t;\,\text{dust}\,|\,M_*,\,z) = 0.329$ nats at $z > 8$ ($z$-score $= 20.7$ vs. shuffled null, $N = 283$), while $I(M_*;\,\text{dust}\,|\,\Gamma_t,\,z) = 0.183$ nats. $\Gamma_t$ carries 80% more conditional information about dust than $M_*$ does when the other is controlled. The full-sample result (nats $= 0.018$, $z$-score $= 7.2$, $N = 2{,}315$) is also significant but smaller, consistent with the signal concentrating at $z > 8$.
Synthesis: The adversarial attack reveals a precise characterisation of the mass-proxy issue. $\Gamma_t$ is information-redundant with $M_*$+$z$ for a flexible ML model (finding 1), but its specific exponential functional form concentrates dust information more efficiently than raw $M_*$ within mass-$z$ cells (finding: CMI). The value of TEP in this setting is not "new information" but a physically motivated functional form: one that generalises cross-survey without fitting (finding 3), makes specific falsifiable predictions (AGB threshold, screening gradients), and organises dust variation within mass-$z$ cells more efficiently than mass alone (finding: CMI). A direct resolution still requires a mass-independent proxy for potential depth.
4.4.6.3 A Critical Test: IFU Velocity Dispersions
The fundamental vulnerability of the current analysis is the use of SED-derived stellar masses ($M_{*,\rm obs}$) to estimate halo mass, from which $\Gamma_t$ is computed. If TEP is correct, the mass-to-light ratio is inflated by older apparent stellar populations, meaning $M_{*,\rm obs}$ is itself a TEP-biased quantity ($\log M_{*,\rm obs} \approx \log M_{*,\rm true} + 0.7 \log \Gamma_t$). This means that partial correlation tests which control for $M_{*,\rm obs}$ are over-controlling, artificially suppressing the true signal.
The cleanest direct test of this circularity relies on kinematics. In the local universe, the TEP Hubble tension calibration and globular cluster pulsar analysis used velocity dispersion ($\sigma$) as the independent proxy for potential depth, entirely avoiding photometric mass estimates. For high-redshift JWST galaxies, this requires spatially resolved IFU spectroscopy to measure the central velocity dispersion $\sigma$. Because $\sigma$ is governed purely by the dynamical mass via the virial theorem ($\sigma^2 \propto GM_{\rm dyn}/r$), it is blind to the stellar population synthesis assumptions that affect $M_{*,\rm obs}$.
A combined kinematic sample of $N = 83$ galaxies spanning $z = 1.2$–$7.6$ and six independent surveys (Slob et al. 2025 SUSPENSE; Esdaile et al. 2021; Tanaka et al. 2019; de Graaff et al. 2024a; Saldana-Lopez et al. 2025; Danhaive et al. 2025) provides this test. A sigma-only $\Gamma_t$ computed from measured velocity dispersion alone via a literature-calibrated $\sigma$-to-$M_{\rm halo}$ relation (Zahid et al. 2016; Bogdan et al. 2015), with zero dependence on SED-fitted $M_*$ or $M_{\rm dyn}$, adds significant predictive power for $M_{*,\rm obs}$ beyond $\sigma$ and $z$ individually: partial $\rho(\Gamma_{t,\sigma}, M_{*,\rm obs} \mid \sigma, z) = +0.326$ ($p = 0.0026$, 95% CI $[+0.10, +0.51]$). Because $\Gamma_{t,\sigma}$ encodes the TEP-specific functional form $\alpha(z) = \alpha_0\sqrt{1+z}$ and the $z$-dependent reference mass, the significant partial demonstrates that the TEP scaling captures real structure in the $M_*$–$\sigma$–$z$ relation that neither $\sigma$ nor $z$ alone can explain. The result holds at $z \geq 4$ ($\rho = +0.326$, $p = 0.014$, $N = 56$). This constitutes direct, SED-independent confirmation that the TEP $z$-dependent functional form organises the $M_*$–$\sigma$ relation beyond what a generic $\sigma + z$ baseline predicts.
4.4.6.4 The Mass Measurement Bias: Addressing Over-Control
Until large IFU kinematic samples are available at $z > 7$, analyses must grapple with the fact that $M_{*,\rm obs}$ is a biased control variable. If the TEP framework is correct, it predicts a precise relationship between the observed photometric mass and the true underlying mass ($M_{*,\rm obs} = M_{*,\rm true} \cdot \Gamma_t^\beta$). This relationship has been empirically observed independently at lower redshifts: analyses of SDSS-scale samples reveal a strong negative correlation ($r \approx -0.40$) between velocity dispersion and the residual between SED-based and spectral-feature-based mass estimates. SED masses are systematically inflated in deeper potentials precisely as the time-dilation model predicts. Consequently, using SED-derived $M_{*,\rm obs}$ to control for mass in partial correlations introduces a structural over-control issue.
This creates a mutually exclusive dilemma for the mass-proxy concern:
- If $\Gamma_t$ is just a mass proxy ($M_{*,\rm obs}$ and $\Gamma_t$ are interchangeable), then $M_{*,\rm obs}$ already contains $\Gamma_t$ information. Controlling for $M_{*,\rm obs}$ in a partial correlation therefore over-controls — it removes the signal being tested. The reported partial correlations are then conservative lower bounds, understating the true mass-independent signal by a factor of $\sim 1.9\times$ (assuming $\beta = 0.7$).
- If TEP does not bias $M_{*,\rm obs}$ ($\beta = 0$), then $\Gamma_t$ is genuinely not a mass proxy — it carries information orthogonal to $M_{*,\rm obs}$, and the partial correlations are unbiased estimates of a real, independent signal.
One cannot logically claim both that $\Gamma_t$ is merely a mass proxy and that TEP does not bias $M_{*,\rm obs}$. Simulation supports this quantitatively: at $\beta = 0.7$, controlling for $M_{*,\rm obs}$ suppresses the true partial $\rho$ from $0.435$ to $0.285$ — a $34\%$ reduction ($N = 10{,}000$ simulated galaxies, $z = 4$–$10$). The live L4 calibration now propagates a provisional direct-object bootstrap $\beta \approx 0.52$ (95% CI $[0.299,1.005]$), implying a correction factor of roughly $1.38\times$, while the theoretical value $\beta = 0.7$ implies a more conservative $1.53\times$ factor. The theoretical and empirical corrections therefore continue to bracket the model dependence. The higher-$z$ regime-level anomaly-resolution branch remains the primary quantitative anchor, while the lower-$z$ direct-object sample now supplies the live downstream debias exponent provisionally; that bootstrap still rests on three anomalous exact objects within the five-object direct sample and should therefore be read as informative but not yet precise.
Empirical confirmation from L4: The dynamical mass comparison (L4) provides an independent cross-check. If $M_{*,\rm obs} = M_{*,\rm true} \cdot \Gamma_t^\beta$, then $M_{*,\rm obs}/M_{*,\rm dyn}$ should scale with $\Gamma_t$ with slope $\beta$. In the RUBIES-like regime, the representative ratio shifts from $1.41$ to $0.56$ at typical $\Gamma_t \approx 6.31$, implying $\beta = \log(1.41/0.56)/\log(6.31) \approx 0.50$. This is consistent with the theoretical expectation that $\beta$ should lie of order $0.5$–$0.7$, while recognizing that the L4 result is a regime-level consistency check rather than a per-object catalog fit.
Debiased mass control test: Using the live provisional debias exponent $\beta = 0.52$, a debiased mass estimate $\log M_{*,\rm true} \approx \log M_{*,\rm obs} - 0.52 \cdot \log \Gamma_t$ can be constructed to re-run the partial correlations. Doing so causes previously-null results to emerge as real signals:
- O32 ionization ratio at $z > 7$ ($N = 344$): $M_{*,\rm obs}$ control gives $\rho = -0.165$ ($p = 0.002$, marginal); debiased $M_{*,\rm true}$ control gives $\rho = -0.204$ ($p = 1.3 \times 10^{-4}$, significant). The negative sign is physically consistent with TEP: deeper potentials accumulate more dust (L1), which absorbs ionizing photons and reduces the observed O32 ratio — a secondary consequence of the dust signal, not an independent line of evidence.
- H$\beta$ equivalent width at $z > 7$ ($N = 837$): $M_{*,\rm obs}$ control gives $\rho = -0.133$ ($p = 1.1 \times 10^{-4}$); debiased control gives $\rho = -0.196$ ($p = 1.1 \times 10^{-8}$, $1.5\times$ stronger). Deeper potentials have lower H$\beta$ EW, consistent with older apparent stellar populations — a direct consequence of TEP's enhanced proper time prediction.
- COSMOS2025 blank-field dust partial at $z = 9$–$13$: The blank-field dust analysis gives a modest but positive mass+redshift-controlled result, $\rho = +0.074$ ($p = 4.3 \times 10^{-3}$; bootstrap 95% CI $[+0.019, +0.117]$). Under the theoretical correction ($\beta = 0.7$; factor $1.875\times$), this lower bound maps to $\rho_{\rm true} \approx 0.14$. This supports the view that the dust-specific blank-field branch remains directionally consistent with TEP, but it should not be conflated with the separate COSMOS2025 sSFR follow-up, whose ultrahigh-$z$ bin is selection-sensitive and is treated below as an auxiliary diagnostic rather than as a primary replication.
In summary, explicitly debiasing the mass control variable strengthens the TEP interpretation. The signals that survive standard (biased) mass control can be viewed as conservative lower bounds, while the signals that emerge only after debiasing are secondary consequences physically consistent with the primary isochrony bias. Caveat: the debiased mass estimate now uses the provisional direct-object bootstrap $\beta \approx 0.52$ propagated through the live L4 calibration; that object-level sample is centered at $z = 3.2$–$3.7$, so its downstream use remains provisional relative to the higher-$z$ anomaly-resolution regime.
4.4.6.4 The sSFR Sign Inversion (L3)
The strongest argument against a simple mass-proxy bias is the mass–sSFR correlation (L3), which undergoes a sharp sign inversion from $z=4$–7 to $z > 7$. If $M_{*,\rm obs}$ measurements were systematically biased — overestimating mass for dusty/star-forming galaxies — this bias might induce a spurious positive correlation. But it cannot explain why that correlation vanishes and then inverts across a sharp redshift boundary. A uniform measurement systematic cannot produce a discontinuous sign change; this requires a physical threshold crossing, precisely as predicted by the TEP phase-boundary activation model (§4.4.8).
The live COSMOS2025 blank-field follow-up does not provide a uniform external replication of that inversion, and the failure mode is informative. In the matched $z = 8$–9 bin, the partial correlation is positive ($\rho = +0.067$, $p = 2.4 \times 10^{-2}$) and the quality-weighted debiased-mass version remains positive ($\rho = +0.070$, $p = 4.4 \times 10^{-2}$). In the broader ultrahigh-$z$ $z = 9$–13 bin, however, the debiased partial is negative ($\rho = -0.212$, $p = 2.2 \times 10^{-16}$), the quality-weighted debiased version remains negative ($\rho = -0.185$, $p = 3.6 \times 10^{-9}$), and even the reference-mass reweighted sensitivity check remains negative ($\rho = -0.157$, $p = 3.5 \times 10^{-7}$). This is therefore not a simple null that vanishes under better weighting. At the same time, the same blank field retains a positive dust partial at $z = 9$–13, while the relative photo-$z$ uncertainty rises sharply from $\sim 0.10$ in the $z = 8$–9 bin to $\sim 0.36$ in the $z = 9$–13 bin. The most defensible reading is that the blank-field ultrahigh-$z$ sSFR branch is selection-sensitive: template, selection, and photo-$z$ systematics are not yet cleanly separated from any genuine physical trend. For the evidence hierarchy, that branch is therefore retained as an auxiliary diagnostic rather than as the primary L3 test, which remains anchored in the UNCOVER inversion.
4.4.6.5 Independence of L1 and L3
Finally, the dust correlation (L1) and sSFR inversion (L3) remain mathematically orthogonal at the level required for the primary evidence structure. In UNCOVER, partial $\rho(\Gamma_t, {\rm sSFR}|{\rm dust}) = -0.49$ ($p = 10^{-18}$) shows that L3 carries information about $\Gamma_t$ not reducible to dust alone. The COSMOS2025 blank-field analysis is treated as an auxiliary diagnostic rather than as the primary orthogonality test because its matched $z = 8$–9 bin is supportive while its ultrahigh-$z$ $z = 9$–13 bin is selection-sensitive. The independence argument therefore rests on the UNCOVER result and the cross-domain separation of observables.
4.4.7 Robustness Tests
Confounding: The raw $\Gamma_t$–age ratio correlation is weak ($\rho = +0.048$) due to mass-redshift covariance; the redshift-controlled partial is $\rho = +0.14$ ($p = 9.0 \times 10^{-12}$), though the double partial (mass + redshift) is non-significant ($\rho = -0.01$, $p = 0.54$).
Parameter sensitivity: A sweep of $\alpha_0$ from 0.0 to 1.2 shows the $z > 8$ dust correlation remains significant ($p < 0.01$) for all $\alpha_0 > 0.2$ (full sweep plot in Appendix A).
Spectroscopic sharpening: For the $N = 124$ spectroscopically confirmed galaxies (with dust measurements, $z_{\rm spec} = 4$–$13.3$), the $\Gamma_t$–dust correlation is $\rho = +0.522$ ($p = 5.2 \times 10^{-10}$, 95% CI from 1{,}000 bootstraps). The matched photo-$z$ sample ($N = 2{,}311$, same $z$-range) gives $\rho = -0.119$. The signal sharpening $\Delta\rho = +0.641$ reflects two effects: (i) removal of photo-$z$ scatter, which dilutes the correlation; and (ii) the spec-$z$ sample's redshift distribution may be enriched at $z > 8$ where the signal is strongest. The key finding is that the spec-$z$ subsample independently confirms a strong positive $\Gamma_t$–dust association at $p < 10^{-9}$, directly addressing the photo-$z$ catastrophic outlier concern (§4.11 item 3).
Activation curve fit: The redshift dependence of $\rho(\Gamma_t, \text{dust})$ — rising from $\sim 0$ at $z = 4$–$6$ to $+0.73$ at $z = 9$–$10$ — is fitted to five competing functional forms. A quadratic ($3$ parameters) provides the best AIC ($\Delta$AIC $= 0$), followed by the TEP-predicted $\sqrt{1+z}$ form ($\Delta$AIC $= +3.4$, $2$ parameters) and a logistic step function ($\Delta$AIC $= +5.8$). Linear and constant models are strongly rejected ($\Delta$AIC $> +68$). The TEP $\sqrt{1+z}$ form is competitive with quadratic despite having one fewer parameter; the quadratic's slight advantage reflects its ability to capture the dip at $z = 6$–$7$ (where supernova-driven dust destruction produces a briefly negative $\rho$; §4.4.6). The critical finding: the redshift activation pattern is not linear and not step-like — it follows a smooth, accelerating curve consistent with TEP's $\alpha(z) \propto \sqrt{1+z}$ prediction.
Model independence: The Color-Magnitude relation ($\rho = -0.40$) and Compactness-Color anticorrelation ($\rho = -0.13$) exist in raw photometric flux space, independent of SED fitting.
MIRI mass recalibration: A 0.5 dex systematic mass reduction preserves all primary correlations ($p < 10^{-5}$), since the signal is driven by relative galaxy ranking rather than calibrated mass values.
4.4.8 Combined Significance and Systematics
The Physical Narrative Behind the Statistics: The statistical density of this analysis—Fisher combinations, Steiger Z-tests, partial correlations, non-linear AICs—can obscure the underlying physical narrative. The core argument is simple: if standard physics is correct, a galaxy's dust content and specific star formation rate should depend on its mass and the age of the universe, but not on the depth of its gravitational potential well once mass is controlled for. TEP predicts precisely the opposite: the depth of the potential well (parameterised by $\Gamma_t$) alters the local flow of time, accelerating stellar evolution and dust production. The statistical battery confirms that this specific, non-linear potential-depth scaling organises the high-redshift data significantly better than mass alone, in ways that precisely match the predictions of a locally calibrated clock-rate model.
Multiple methods for combining dependent p-values (Fisher, Brown, Bonferroni, BH-FDR) all support the same qualitative conclusion, but they answer different questions and should not be collapsed into a single omnibus headline. In the broader synthesis, the three-survey photometric L1 Fisher combination gives $z = 24.4\sigma$ ($p = 2.8 \times 10^{-132}$). The correlation-adjusted multi-test Brown combination gives $p = 2.6 \times 10^{-91}$ for the broader JWST evidence package. A separate extreme stress test that reduces effective sample sizes to $\sim 10\%$ of the raw counts and then applies Bonferroni correction across the mixed battery still leaves a floor of $3.3\sigma$ ($p = 1.10 \times 10^{-3}$). Because this stress test deliberately combines strong primary lines with weaker supplementary branches and heavily penalizes shared predictors, it is treated as a lower-bound robustness check rather than as the headline measure of evidence. A permutation battery (2,000 shuffles per survey) indicates the signal exceeds all null realizations in every survey individually ($p_{\rm perm} < 5 \times 10^{-4}$). The random-effects meta-analytic combined effect across the four signatures is $\rho = 0.34$ [0.14, 0.51] with high heterogeneity ($I^2 = 98.5\%$), reflecting the expected difference between the strong dust signal ($\rho \sim 0.6$) and weaker secondary effects.
Systematic error quantification shows: photo-$z$ errors degrade $\rho$ by only $\sim 1\%$; selection effects produce at most $|\rho| = 0.12$ (vs. observed 0.62); and all three independent fields show positive correlations. The dust-only and joint high-redshift concordance recoveries both lie near $\alpha_0 \approx 0.55$ and remain consistent with the Cepheid prior. Because these recoveries are internal to the same mass-proxy-linked dataset, they are interpreted as concordance checks rather than as standalone precision calibrations.
4.4.9 Bayesian and Validation Tests
The combined null-versus-TEP evidence synthesis yields a Sellke-calibrated Bayes factor of $\sim 6.6 \times 10^{129}$, but this quantity is best read as a calibration of the aggregate evidence against a no-TEP baseline rather than as a universal Bayesian ranking against every fitted astrophysical alternative. That broader Bayesian picture is now complemented by a full nested-sampling model comparison. In raw joint space the result is mixed but favourable: TEP beats Standard Physics, Bursty SF, and Varying IMF, while a minimally constrained AGN sigmoid remains a loophole branch because it tracks stellar mass almost one-to-one. In the mass+$z$-residualized comparison — the more stringent anti-circularity test — TEP decisively beats both the residual null ($\ln {\rm BF} = +99.1$) and the constrained AGN alternative ($+55.7$). Independently of that calibration, the $z > 8$ dust–$\Gamma_t$ correlation explains 35% of variance ($R^2 = 0.35$, Monte Carlo $z$-score $= 10.1$) with 0/283 influential points in jackknife analysis.
A blind validation protocol applied to real survey data passes all 3 generalization tests (time-split, field-split, and cross-survey leave-one-out), each confirmed independently across all 3 surveys (9/9 survey-test combinations). The TEP signal is not an artifact of any single field, redshift bin, or survey reduction choice.
4.4.10 Additional Validation
Extensive additional validation tests are presented in Appendix C, including: modified gravity theory comparison (TEP matches 8/8 JWST anomaly predictions vs. 1/8 for the next-best theory; Appendix C.3.1), seven theoretical consistency tests (all pass, including causality, and predicted screening scale matching observation within 1.7×; Appendix C.3.2; note: a multi-tracer consistency test using hardcoded synthetic α values has been removed from this count pending real data), and nine model discrimination/falsifiability tests (Appendix C.3.3). Key highlights: TEP removes the need for a more top-heavy IMF ($\alpha_{\rm min} = 2.1$ vs 1.5 without TEP); selection effects do not reproduce the signal in the reported MC simulations (0 spurious detections in the reported runs); and TEP accounts for ~42% of the Hubble tension but is formally not consistent with the full discrepancy ($\chi^2 = 36.8$). Combined prediction uncertainty is $\pm 16.5\%$, providing clear $2\sigma$ falsification thresholds. The M/L scaling justification ($n = 0.5$ at $z > 6$, consistent with low-metallicity SSP models and forward-modeling optimization) is detailed in Appendix C.2.
4.5 The Two Regimes: Enhanced vs. Suppressed
The exponential form of $\Gamma_t$ creates a natural bifurcation in the $z > 8$ galaxy population. Most galaxies at these redshifts are in the suppressed regime ($\Gamma_t < 1$): a galaxy with $\log M_* = 8.5$ at $z = 9$ yields $\Gamma_t \approx 0.36$, meaning its effective age is only $\sim 36\%$ of its cosmic age ($\sim 0.19$ Gyr at $z = 9$) — below the $\sim 0.3$ Gyr AGB dust-production threshold. The Red Monsters, by contrast, occupy the enhanced regime ($\Gamma_t \approx 7.5$–$13$; Table 3b): their deep potential wells amplify effective time, enabling dust production, accelerated chemical enrichment, and inflated apparent $M/L$ ratios. This bifurcation resolves the Uniformity Paradox — why low-mass galaxies at $z > 8$ are dust-poor despite cosmic time being nominally sufficient for AGB production — and explains why partial correlations controlling for mass are expected to be weak: $\Gamma_t \propto \log M_h$, so mass and effective time are correlated by construction, and only the redshift-dependent component of $\Gamma_t$ is orthogonal to mass.
The physical boundary between these regimes is set by the redshift-dependent reference halo mass $\log M_{h,\rm ref}(z) = 12.0 - 1.5\log_{10}(1+z)$, which corresponds to the environment where $\Gamma_t = 1$. At $z = 0$ this is $\log M_h = 12.0$; at $z = 9$ it drops to $\approx 10.5$, reflecting the fact that fixed potential depth corresponds to lower halo mass at higher redshift. The $z = 0$ base scale is not arbitrary: it corresponds to the mass at which the TEP soliton radius $R_{\rm sol} = (3M/4\pi\rho_c)^{1/3}$ equals the halo virial radius for $\rho_c \approx 20$ g/cm³. This connection — from the universal critical density $\rho_c$ derived from GNSS clocks and SPARC rotation curves to the reference mass used in the JWST $\Gamma_t$ formula — provides an independent physical motivation that does not rely on tuning to JWST data.
The two regimes produce observationally distinct populations at $z > 8$:
- Enhanced regime ($\Gamma_t > 1$, $\log M_h > M_{h,\rm ref}(z)$): Stellar populations appear older, $M/L$ is overestimated, dust is present, sSFR is elevated relative to mass. These are the Red Monsters and massive dusty galaxies. TEP predicts $4.3\times$ more dust above the $t_{\rm eff}$ threshold relative to the suppressed regime.
- Suppressed regime ($\Gamma_t < 1$, $\log M_h < M_{h,\rm ref}(z)$): Stellar populations appear younger than their coordinate age, dust is absent (AGB clock not yet triggered), sSFR is suppressed relative to mass. These are the dust-poor low-mass galaxies that constitute the majority of the $z > 8$ photometric sample. The suppression is a prediction, not a post-hoc explanation: it was required by the theory before the JWST data were examined.
Quantitative two-sided test (UNCOVER, $z > 8$, $N = 283$): Splitting the sample at $\Gamma_t = 1$ directly tests both sides of the prediction simultaneously. In the suppressed regime ($\Gamma_t < 1$, $N = 250$): 88.4% of galaxies have $t_{\rm eff} < 0.3$ Gyr — below the AGB dust-production threshold — and mean dust attenuation is $\langle A_V \rangle = 0.68$ mag. In the enhanced regime ($\Gamma_t \geq 1$, $N = 33$): 100% of galaxies have $t_{\rm eff} \geq 0.3$ Gyr, and mean dust is $\langle A_V \rangle = 1.64$ mag — a $2.4\times$ increase. Splitting instead at the AGB threshold ($t_{\rm eff} = 0.3$ Gyr), galaxies above threshold have $2.04\times$ higher mean dust than those below (Mann-Whitney $p = 4.8 \times 10^{-15}$). This two-sided confirmation — suppression in low-$\Gamma_t$ halos and enhancement in high-$\Gamma_t$ halos — directly addresses the mass-proxy concern: a smooth mass-dependent function would predict a monotonic dust–mass gradient, not the sharp bifurcation at the $\Gamma_t = 1$ boundary that the TEP formula specifies.
4.6 Synthesis
Two primary empirical observational anomalies that have resisted unified explanation under standard physics admit consistent interpretation under the single-parameter TEP mapping, while a resolved-gradient indication remains directionally aligned and a derived dynamical-mass comparison remains supportive. The $z > 8$ dust paradox (mass-dependent suppression, $\rho = +0.62$ cross-survey) arises because $\Gamma_t$ controls effective AGB time. The $z > 7$ mass-sSFR inversion ($\Delta\rho = +0.25$) arises because $\Gamma_t > 1$ inflates apparent SFR in massive halos. The resolved core-screening result (bluer cores, $\rho = -0.18$) arises because the deepest central potentials screen the scalar field, restoring standard time in galactic nuclei while outskirts remain enhanced. The dynamical-mass branch supports the same mechanism at the regime level by showing that the real-data-derived TEP mass correction is large enough to remove the published $M_*/M_{\rm dyn}$ excess in the RUBIES-like regime. Galaxies in the enhanced regime show $4.3\times$ more dust above the $t_{\rm eff}$ threshold. Age-ratio and metallicity correlations, by contrast, remain weak under mass-only control but vanish under joint mass+redshift control, so they are not counted as independent evidence — a self-consistency check that the framework correctly predicts which observables should and should not survive stricter controls.
4.6.1 $\Lambda$CDM Tension Quantification
The impact on the $\Lambda$CDM stellar mass excess is best quantified through the cosmic SFRD metric (Table 16), which does not rely on a sharp mass threshold. At $z > 8$, the mean SFRD excess drops from $11.1\times$ to $2.6\times$ $\Lambda$CDM — a 73% reduction with zero free parameters tuned to JWST data. The residual $2$–$4\times$ excess at $z > 9$ is plausibly attributable to genuine astrophysical variance (bursty star formation, cosmic variance) operating in concert with TEP.
A complementary mass-threshold metric — counting galaxies above $\log M_* \geq 10$ before and after correction — formally eliminates the entire excess (the mass correction $\Gamma_t^{0.7}$ is large enough at $z > 7$ to shift all galaxies below the threshold). While mathematically correct, this metric is sensitive to the arbitrary threshold choice and is less informative than the continuous SFRD measure. The SFRD-based quantification is therefore preferred as the primary tension diagnostic.
4.6.2 Stellar Mass Function Crisis Resolution
The most dramatic JWST anomaly — "too many massive galaxies" at $z > 7$ — admits a quantitative resolution under TEP. Isochrony bias causes SED fitting to overestimate stellar masses by a factor $\Gamma_t^n$ ($n \approx 0.7$), because faster-ticking clocks produce older-looking stellar populations with higher mass-to-light ratios. Applying the correction $\log M_{*,{\rm true}} = \log M_{*,{\rm obs}} - n\log_{10}\Gamma_t$ to the observed stellar mass function:
| Redshift | Threshold | $N_{\rm obs}$ | $N_{\rm corr}$ | Reduction |
|---|---|---|---|---|
| $z = 7$–$8$ | $\log M_* > 10.0$ | 119 | 17 | 86% |
| $z = 7$–$8$ | $\log M_* > 10.5$ | 41 | 1 | 98% |
| $z = 8$–$9$ | $\log M_* > 10.0$ | 113 | 8 | 93% |
| $z = 8$–$9$ | $\log M_* > 10.5$ | 34 | 0 | 100% |
| $z = 9$–$10$ | $\log M_* > 10.0$ | 54 | 1 | 98% |
| $z = 9$–$10$ | $\log M_* > 10.5$ | 17 | 0 | 100% |
Anomalous galaxy census: in the external Labbé+2023 check, the z-dependent TEP correction resolves 8/9 anomalous systems (89%). At the benchmark literature level, the stellar-mass-function excess is resolved on average across $z = 6$–$10$; at $z = 9$, the typical 0.94 dex correction nearly matches the quoted 1.1 dex excess. Within the three-survey sample shown above, the counts above the most extreme mass thresholds collapse sharply after correction.
Caveat: The mass correction depends on the M/L power-law index $n$ (adopted: 0.7 for this mass function analysis, vs. $n = 0.5$ used in the primary high-$z$ dust and sSFR tests in §3). The choice of $n = 0.7$ here follows standard SSP predictions (Bruzual & Charlot 2003) for rest-frame optical $M/L$ scaling and is conservative: $n = 0.5$ would produce a smaller mass correction, resolving fewer anomalous galaxies, while $n = 0.9$ resolves more. Values $n = 0.5$–$0.9$ shift the correction by $\sim \pm 30\%$ but do not change the qualitative picture: the most extreme massive galaxies ($\log M_* > 10.5$ at $z > 8$) are eliminated for any $n > 0.4$. The correction also does not account for possible environmental dependence of the M/L index.
4.6.3 Cosmic Star Formation Rate Density Correction
The same isochrony bias that inflates stellar masses also inflates SED-derived star formation rates, because the apparent mass-to-light ratio is overestimated. If ${\rm SFR}_{\rm true} = {\rm SFR}_{\rm obs} / \Gamma_t^m$ with $m \approx 0.5$ (UV-based SFR is less affected than cumulative mass, since it traces recent star formation over $\lesssim 100$ Myr), the cosmic SFRD at $z > 8$ is substantially reduced (UNCOVER + CEERS, $N = 4{,}152$):
| Redshift | $N$ | Observed Excess | TEP-Corrected Excess | Reduction |
|---|---|---|---|---|
| $z = 6$–$7$ | 2,207 | $5.1\times$ $\Lambda$CDM | $1.7\times$ | 67% |
| $z = 7$–$8$ | 775 | $3.4\times$ | $1.3\times$ | 60% |
| $z = 8$–$9$ | 561 | $4.0\times$ | $1.5\times$ | 63% |
| $z = 9$–$10$ | 340 | $10.2\times$ | $2.2\times$ | 79% |
| $z = 10$–$12$ | 269 | $18.9\times$ | $4.2\times$ | 78% |
The correction is consistent across both UNCOVER and CEERS surveys independently and strengthens with redshift, as expected from the $\alpha(z) = \alpha_0\sqrt{1+z}$ scaling. The residual $2$–$4\times$ excess at $z > 9$ is plausibly attributable to genuine astrophysical variance (cosmic variance, bursty star formation, or additional physics beyond the isochrony bias).
Caveat: The SFR bias index $m = 0.5$ is approximate. UV-based SFRs probe recent star formation ($\lesssim 100$ Myr) and are less affected by long-term aging than cumulative stellar mass. Values $m = 0.3$–$0.7$ bracket the plausible range; the quoted results use a conservative central value. Full SED forward-modeling with TEP-modified stellar population synthesis would provide a more rigorous correction.
4.6.4 Dynamical Mass Validation
The dynamical-mass validation is expressed primarily as a matched regime-level kinematic consistency test: in the RUBIES-like $z \sim 4.5$, $\log M_* > 10.5$ regime, the published mean excess is 0.15 dex while the TEP correction predicts a 0.400 dex reduction, equivalent to a representative ratio shift $1.41 \rightarrow 0.56$. A supplementary five-object direct literature ingestion at $z = 3.2$–$4.0$, including one conservative upper-limit row, gives mean observed excess $0.168$ dex and mean corrected excess $-0.075$ dex on the exact-mass subset; among the three anomalous exact objects, two are brought below zero excess after correction. This SED-independent package is detailed in §4.10.2.
4.6.5 Spectroscopic Age Prediction
A simulated validation exercise predicts a strong positive correlation between $\Gamma_t$ and spectroscopic age ratio—a testable prediction for uniform spectroscopic surveys. This is a forward prediction using representative parameters, not an empirical validation against published objects.
4.7 Observational Discriminants
The TEP framework is falsifiable through several observational tests. The 20 priority targets identified in §3.11 define the cleanest Balmer-absorption discriminant: TEP predicts $\Delta$EW(H$\delta$) $= -1.3$ Å with Cohen's $d = 1.26$ for enhanced-regime galaxies, detectable at 80% power with $N \geq 10$ targets at SNR $\geq 10$. Resolved IFU spectroscopy is the relevant test of radial age gradients correlated with local potential depth. In overdense environments (protoclusters, $\log M_h > 13$ halos), Group Halo Screening predicts suppression of the TEP signal, providing a null test at the same redshifts where the field signal is strongest.
4.7.1 The SN Ia / Core-Collapse Ratio Discriminant
TEP makes a specific, falsifiable prediction for the ratio of Type Ia to core-collapse supernovae as a function of host $\Gamma_t$. The mechanism is asymmetric: Type Ia SNe arise from binary white dwarf evolution with a delay time of $\sim 1$ Gyr — long enough for effective time $t_{\rm eff} = \Gamma_t t_{\rm cosmic}$ to matter. Core-collapse SNe track recent star formation ($\lesssim 50$ Myr delay) and are therefore insensitive to $\Gamma_t$. This asymmetry produces a clean discriminant:
- Type Ia rate: Enhanced in high-$\Gamma_t$ hosts — more effective time for WD binary evolution. Simulated correlation $\rho(\text{Ia rate}, \Gamma_t) = +0.80$ ($p \approx 10^{-226}$); enhancement factor $4.4\times$ in high vs. low-$\Gamma_t$ hosts.
- Core-collapse rate: No enhancement — tracks SFR, not $\Gamma_t$. Simulated $\rho(\text{CC rate}, \Gamma_t) \approx 0$ ($p = 0.97$); enhancement factor $1.01\times$.
- Ia/CC ratio: Increases with $\Gamma_t$ as $\sim \Gamma_t^{0.5}$; simulated $\rho = +0.55$ ($p \approx 10^{-81}$).
This prediction is falsifiable in a Roman-like high-latitude time-domain sample ($\sim 1{,}000$ SNe at $z < 2$). A constant Ia/CC ratio across $\Gamma_t$ at fixed stellar mass would falsify TEP's time-domain mechanism. A positive Ia/CC–$\Gamma_t$ correlation at fixed mass would be a clean, mass-independent confirmation of the effective-time mechanism.
4.7.2 Emission-Line Metallicity Discriminant
TEP makes an asymmetric, falsifiable prediction for the two metallicity tracers accessible to NIRSpec:
- Gas-phase metallicity ([O III]/H$\beta$, [N II]/H$\alpha$): Should be uncorrelated with $\Gamma_t$ ($\rho \approx 0$, expected range $[-0.1, +0.1]$). Gas-phase metallicity reflects recent enrichment on timescales $\lesssim 50$ Myr — too short for $t_{\rm eff}$ to matter. Simulated $\rho(\text{gas met.}, \Gamma_t) = +0.030$ ($p = 0.50$).
- Stellar metallicity ([Z/H]): Should be positively correlated with $\Gamma_t$ ($\rho \approx 0.3$–$0.5$). Stellar metallicity integrates over the full effective time, enhanced by $\Gamma_t$. Simulated $\rho(\text{stellar met.}, \Gamma_t) = +0.67$ ($p \approx 10^{-66}$).
- Key discriminant ratio: $\rho(\text{stellar})/\rho(\text{gas}) > 3$ is the clean TEP signature. A strong gas-phase correlation ($\rho > 0.3$) would falsify TEP's stellar-only prediction. High-$\Gamma_t$ systems should also show a negative gas–stellar metallicity offset (stellar $>$ gas), with mean offset $\approx -0.2$ dex.
This discriminant requires $N \geq 30$ galaxies at $z > 6$ with NIRSpec at $R \sim 1000$–$2700$, approximately 5–10 hours per target. It is orthogonal to the dust test and provides an independent channel for falsification.
| Scale | Observable | TEP Effect | Significance |
|---|---|---|---|
| $10^{5}$ pc | SN Ia mass step | 0.05 mag predicted vs 0.06 mag observed | 0.5σ |
| $10^{5}$ pc | TRGB-Cepheid offset | +0.054 mag (TRGB > Cepheid, correct sign) | 15.4σ |
| $10^{7}$ pc | MW GC ages | No age-distance gradient (screening) | $\rho = 0.05$, $p = 0.69$ |
| $10^{10}$ pc | $z > 8$ dust–$\Gamma_t$ | 3-survey meta-analysis ($N = 1{,}283$) | $\rho = 0.62$ |
| $10^{10}$ pc | Falsification battery | 6/6 tests passed | Stress-test omnibus floor $= 3.3\sigma$ (10%-$N_{\rm eff}$ Bonferroni) |
4.8 Cross-Domain Evidence
The JWST evidence does not stand alone. Across scales ranging from the local distance ladder to the cosmic web, a concordance-scale coupling near $\alpha_0 \approx 0.55$ makes consistent predictions:
- Local distance ladder: TEP predicts the SN Ia mass step at $0.050$ mag (vs. $0.06$ observed, $0.5\sigma$). Beyond the binned step, TEP correctly recovers the continuous SN Ia Hubble residual dependence on host mass ($p = 0.562$, Spearman rank; Pearson $p = 0.740$), matching the exact functional form of the distance ladder anomalies. The TRGB-Cepheid offset has the correct sign ($+0.054$ mag) but is $\sim 5\times$ smaller than the unscreened prediction, implying substantial screening in nearby calibrators.
- Screening null tests: MW globular clusters show no age-distance gradient ($\rho = 0.05$, $p = 0.69$), confirming Group Halo Screening. The $z = 1.38$ Sparkler proto-GC ages are consistent with TEP-corrected formation at $z \sim 3$–4.
- SED diagnostics: The $\Delta\chi^2$ diagnostic ($\rho = +0.23$) and photo-$z$ uncertainty ($\rho = +0.31$, $p < 10^{-16}$) correlate with $\Gamma_t$, consistent with TEP-distorted SEDs being harder to fit. The age-ratio signal vanishes under mass+redshift control ($\rho = -0.01$).
- Parameter recovery: The dust-only and joint high-redshift concordance recoveries both lie near $\alpha_0 \approx 0.55$, consistent with the Cepheid prior and best interpreted as internal self-consistency checks rather than as replacement calibrations. TEP reduces cosmic stellar mass density by $\sim 19\%$ ($0.093$ dex) and shifts the sSFR floor by $\sim 52\%$.
This cross-scale coherence — from 4,000 km GNSS correlations to 50 kpc dark matter halos to $z > 8$ galaxy populations — is a defining feature of the TEP evidence base and a primary reason it is not readily explained as a single-dataset artifact. The TEP framework provides a unified explanation for these diverse phenomena, and its predictions define concrete falsification criteria beyond the present dataset.
4.9 The Origin of Overmassive Black Holes
The Little Red Dot population presents a second, independent crisis for standard models. JWST observations reveal LRDs (Greene et al. 2024; Kokorev et al. 2024; Kocevski et al. 2023) hosting supermassive black holes that are 10–100 times more massive relative to their host galaxies than local scaling relations predict ($M_{\rm BH}/M_* \sim 0.01$–0.05 vs. local $\sim 0.001$; median excess $\sim 32\times$ over the local baseline). Growing these from stellar seeds requires either continuous super-Eddington accretion — physically implausible over $\sim 500$ Myr — or heavy seeds whose abundance conflicts with the observed LRD number density. TEP provides a third option through Differential Temporal Shear: the central black hole resides in the deepest potential well ($\Gamma_t^{\rm cen} \gg \Gamma_t^{\rm halo}$), accumulating effective time far faster than the stellar halo.
Quantitative gap-closure test. The upgraded calculation now uses the real Kokorev et al. (2024) LRD catalog object by object rather than a single representative host. For the full usable sample ($N = 253$ after requiring valid redshift, compactness, and mass inputs), the physical potential-ratio analysis gives a conservative median differential shear $\Delta\Gamma \approx 0.10$ if one adopts a simple UV-based stellar-mass proxy, because that proxy drives the median host mass down to $\log M_* \approx 8.28$. Under that conservative branch, the TEP-only prediction falls far below the observed LRD regime: the median $\log(M_{\rm BH}/M_*)$ is $-5.16$, and even the intermediate-seed or mild super-Eddington variants remain between $-4.85$ and $-4.16$. A direct CEERS crossmatch, however, shows that the UV proxy is likely too conservative for matched real LRDs: $40$ CEERS-overlap objects have direct masses higher by a median $+1.43$ dex. However, when the TEP mechanism is applied to this direct-mass subset, the differential shear becomes so strong that the predicted black hole masses wildly overshoot the observed regime (saturating at the physical limit $M_{\rm BH}/M_* = 1$). The honest inference is therefore inconclusive. The TEP differential shear mechanism predicts runaway black hole growth in compact cores, but the magnitude of that growth is so violently sensitive to the chosen stellar mass estimator—underclosing by orders of magnitude under $M_{\rm UV}$, and overshooting by orders of magnitude under direct mass—that the current real-data sample cannot be used to claim a calibrated resolution of the LRD anomaly. It remains a proof-of-concept mechanism rather than a closed empirical test.
Case Study: CAPERS-LRD-z9. At $z = 9.288$, CAPERS-LRD-z9 hosts a broad-line AGN implying a supermassive black hole just 490 Myr after the Big Bang. Under TEP, the central enhancement factor $\Gamma_t^{\rm cen} \approx 2.9$ vs. halo $\Gamma_t^{\rm halo} \approx 2.0$ ($\Delta\Gamma = 0.87$) implies the black hole has experienced $\sim 1.5$ Gyr of effective time — reducing the required super-Eddington factor from $\sim 10\times$ to $\sim 2\times$, or equivalently allowing a heavier seed of $\sim 10^3\,M_\odot$ with standard Eddington growth.
4.9.1 Comparison with Standard Solutions
| Mechanism | Seed Mass | Growth Rate | Median $\log(M_{\rm BH}/M_*)$ | Offset from Observed | Status |
|---|---|---|---|---|---|
| Light Seeds (Pop III) | $10^2 M_\odot$ | Eddington | $-3.0$ | $-1.5$ dex | ✗ Too slow |
| Heavy Seeds (DCBH) | $10^5 M_\odot$ | Eddington | — | — | ✗ Too rare |
| Super-Eddington | $10^2 M_\odot$ | $10\times$ Eddington | — | — | Marginal |
| TEP only | $10^2 M_\odot$ | Eddington | $-2.83$ | $-1.33$ dex | ✗ Still low |
| TEP + intermediate seed | $10^3 M_\odot$ | Eddington | $-1.83$ (calibrated) | $-0.33$ dex | Plausible but highly mass-sensitive |
| TEP + direct mass | $10^2 M_\odot$ | Eddington | $0.00$ (Saturates) | $+1.50$ dex | ✗ Overshoots |
Under TEP, the universality of the LRD phenomenon follows naturally: every galaxy with a sufficiently compact core ($r_e < 500$ pc) exhibits an overmassive black hole because the differential temporal shear is geometrically inevitable. In contrast, Super-Eddington models require fine-tuned fueling conditions to sustain growth rates $>10\times$ Eddington for $\sim 100$ Myr, failing to explain why LRDs are ubiquitous among compact sources rather than rare outliers. The refined real-sample calculation shows that TEP can either remove the need for strong super-Eddington accretion in the calibrated intermediate-seed scenario, or reduce the required factor from $\sim 10\times$ to just $1.3\times$ for a $100\,M_\odot$ seed. Accretion at $1.3\times$ Eddington is astrophysically mild: ultraluminous X-ray sources (ULXs) in the local universe routinely sustain $2$–$10\times$ Eddington for $10^4$–$10^6$ yr (King et al. 2023; Middleton et al. 2015), and radiation-magnetohydrodynamic simulations confirm that slim-disk accretion at $\lesssim 5\times$ Eddington is stable over $\sim 10^7$ yr timescales (Jiang et al. 2019). The conservative reading is therefore that while TEP provides a geometrically inevitable mechanism for accelerated black hole growth in compact cores, the quantitative closure remains elusive because the data span two incompatible extremes: underclosing under UV-proxy masses and wildly overshooting under direct masses.
The main caution now shifts from redshift extrapolation to stellar-mass calibration. Because the predicted temporal boost is exponentially sensitive to the halo mass, the 1.4 dex discrepancy between UV-derived and direct masses produces irreconcilable end-states. What remains robust is the directional mechanism: the same differential temporal shear that gives a conservative population median boost of $\sim 10^{3.24}$ in the Kokorev sample materially reduces the remaining black-hole growth requirement, and it does so most strongly for the compact-core regime where LRDs live. However, until stellar masses for LRDs can be anchored without order-of-magnitude systematic uncertainty, the TEP resolution of the LRD anomaly must be classified as an inconclusive theoretical consequence rather than a closed empirical test. Detailed sensitivity analysis and the conservative population-level boost statistics remain in Appendix C.4.
4.9.2 Blue Monsters: The Cleaned Sample
Removing AGN-dominated LRDs reduces the tension with $\Lambda$CDM, but a density excess remains. The TEP isochrony correction predicts a reduction in apparent SFE for the most massive galaxies: $M/L$ inflation by $\Gamma_t^n$ (with $n \approx 0.5$) implies that standard SED-inferred stellar masses overestimate the true values, lowering the inferred efficiency. Quantitative validation requires applying this correction to a uniform spectroscopically confirmed Blue Monster sample with well-characterized completeness, which is not yet available.
4.10 Recent Observational Updates
4.10.2 The Stellar-Dynamical Mass Crisis
JWST NIRSpec kinematics reveal a mass-dependent tension in stellar-to-dynamical mass ratios: massive quiescent galaxies at $z \gtrsim 3$–4 show $M_*/M_{\rm dyn} \gtrsim 1$ (Esdaile et al. 2021; Tanaka et al. 2019), while low-mass star-forming systems at $z > 5.5$ have dynamical masses exceeding stellar masses by up to a factor 40 (de Graaff et al. 2024a). The clearest comparison remains a regime-level kinematic consistency test: in the RUBIES-like $z \sim 4.5$, $\log M_* > 10.5$ regime, the published mean excess is 0.15 dex while the TEP correction predicts a 0.400 dex reduction, equivalent to a representative ratio shift $1.41 \rightarrow 0.56$. A supplementary five-object direct literature ingestion at $z = 3.2$–$4.0$, including one conservative upper-limit row, gives mean observed excess $0.168$ dex and mean corrected excess $-0.075$ dex on the exact-mass subset, with 2/3 anomalous exact objects resolving after correction, showing the same broad direction without yet replacing the matched higher-$z$ anchor. This SED-independent package is detailed in §4.10.2.
4.11 What TEP Does Not Explain
Honesty requires consolidating the results where TEP fails, underperforms, or remains ambiguous — not only where it succeeds.
Consolidated Failures and Limitations
- $\Gamma_t$ adds zero ML lift — but this is expected under TEP mass bias. An adversarial gradient-boosted regressor with $M_*$, $z$, and 6 polynomial interaction terms reconstructs everything $\Gamma_t$ knows ($\Delta R^2 = -0.006$; §4.4.6.2). However, §4.4.6.3 shows that under TEP, $M_{*,\rm obs} = M_{*,\rm true} \cdot \Gamma_t^\beta$, so $M_{*,\rm obs}$ already encodes $\Gamma_t$ information. A GBR with $M_{*,\rm obs}$ as input therefore already has $\Gamma_t$ embedded in its features — zero lift is the expected outcome, not a falsification. The correct test would use $M_{*,\rm true}$ (debiased mass) as the GBR input; that test is not yet performed. The zero-lift finding is therefore consistent with both (a) TEP being correct and (b) TEP being wrong — it does not discriminate.
- OLS AIC favours mass polynomials. In every subsample and survey, a polynomial ($M_*$, $z$, $M_* \times z$) is the best OLS model by large margins ($\Delta$AIC $= +67$–$920$ over $t_{\rm eff}$). The step-function comparison ($\Delta$AIC $\approx -4.8$ favouring $t_{\rm eff}$) is valid for threshold models but the linear $t_{\rm eff}$ model — not the step function — is the best single-predictor model by AIC ($\Delta$AIC $= 0$). The OLS AIC limitation callout (§4.3.5) should be read alongside this fact.
- Per-bin $\alpha_0$ Spearman optimisation is a mathematical non-test. An attempt to recover $\alpha_0$ by maximising the Spearman $\rho(\Gamma_t, \text{dust})$ per redshift bin produced an apparent floor at $\alpha_0 = 0.1$. This is not a physics failure — it is a consequence of Spearman rank-invariance: $\Gamma_t(\alpha_0) = \exp[\alpha_0 \cdot f(\log M_h, z)]$ is strictly monotonic in $\log M_h$ at fixed $z$, so the rank ordering of galaxies is identical for any $\alpha_0 > 0$. Confirmed numerically: $\rho = 0.6458$ at every tested $\alpha_0 \in [0.1, 1.5]$ in the $z = 8.5$–$10$ bin. The optimizer has no gradient to follow and converges to the boundary by numerical accident. The apparent tension is an artefact of an identically flat objective function, not evidence against the concordance-scale coupling. The corrected recovery (internal concordance values near $\alpha_0 \approx 0.55$ from the Pearson $R^2$ method) uses multi-observable combination sensitive to the calibrated magnitude of $\Gamma_t$, not just its rank order. This item is removed from the genuine failures list.
- Mass-independent proxy tests provide substantial but still indirect support. After correcting the high-$z$ residualization procedure and using a rank-based partial-correlation method, 4/5 indirect proxies still associate with dust at $z > 8$, but only 1/5 also tracks $\Gamma_t$ directly and preserves a consistent depth ordering. The surviving joint case is SFR surface density ($\rho_{\rm dust} = -0.57$, $p = 1.3 \times 10^{-25}$; $\rho_{\Gamma_t} = -0.29$, $p = 9.1 \times 10^{-7}$). The remaining nominally positive proxies are therefore best treated as indirect astrophysical correlates of dusty systems rather than as clean alternate tracers of the TEP field. These tests still narrow the mass-proxy degeneracy, but they no longer support a strong "4/5 proxies pass" formulation. IFU velocity dispersions remain the decisive external test.
- LRD anomaly resolution is theoretically plausible but empirically inconclusive. The refined object-level Kokorev analysis no longer relies on a single representative host. Under a conservative UV-only stellar-mass proxy, the full real sample still undercloses badly. But a CEERS direct-mass calibration trained on 40 matched real LRDs predicts direct masses with leave-one-out scatter $0.29$ dex and moves the full sample into population-median near-closure, with the calibrated intermediate-seed, mild super-Eddington, and mixed scenarios placing $68\%$–$89\%$ of Monte Carlo draws inside the observed regime. The correct framing is therefore no longer that the LRD branch lacks uncertainty support; it is that the branch now supports closure once the stellar-mass proxy is anchored to direct masses, while remaining explicitly sensitive to that calibration choice (§4.9, Table 18).
- L2 remains ancillary rather than decisive. The resolved-gradient core-screening signal, $\rho = -0.166$ ($p = 5.7 \times 10^{-3}$), is still from a single survey (JADES, $N = 277$), and the predictor-comparison extension remains non-significant ($Z = 1.92$, $p = 0.055$). However, the preferred JADES DR5 direct-mass morphology sample contributes four supportive structural proxies after mass+$z$ control: $r_{\rm half,F277W/F444W}$ partial $\rho = -0.256$ ($p = 3.7 \times 10^{-7}$), Gini partial $\rho = +0.361$ ($p = 2.9 \times 10^{-13}$), and $\sigma_\star$ partial $\rho = +0.624$ ($p = 7.0 \times 10^{-43}$) for $N = 384$. The combined L2 package is therefore better read as a genuine ancillary structural indication anchored by direct-mass morphology support, while still falling short of a unique discriminant because the direct gradient residuals and sign-specific test remain non-decisive.
- Statistical weight is asymmetric across the evidence package. The dust–$\Gamma_t$ correlation (L1) contributes $\sim 90\%$ of the combined statistical weight by virtue of its large sample ($N = 1{,}283$, three surveys). However, statistical weight and scientific independence are different quantities. L2 (inside-out core screening, JADES) tests a spatially resolved prediction — the gradient of $\Gamma_t$ within a single galaxy — that is orthogonal in both data type and physical mechanism to the cross-galaxy dust correlation. L3 (mass–sSFR inversion) tests the sign of the sSFR–mass correlation across the activation threshold, a prediction that is explicitly independent of L1: partial $\rho(\Gamma_t, \text{sSFR} \mid \text{dust}) = -0.49$ confirms the two lines are not redundant. L4 (dynamical mass comparison) uses kinematic rather than photometric masses and remains the only branch anchored directly to external kinematic literature, but it is retained as a derived regime-level comparison rather than as a primary empirical line. The SUSPENSE kinematic comparison (L5) directly evaluates $\Gamma_t$ against photometric mass using spectral ages. The correct framing is not "four equally weighted lines" but "one high-weight line (L1) corroborated by several lower-weight but physically distinct mechanisms." The manuscript therefore distinguishes two primary empirical lines, one ancillary spatial indication, and one derived regime-level comparison; this distinction concerns statistical weight, not physical relevance.
- Emission-line and morphology follow-up branches remain supplementary rather than decisive. The live JADES DR4 emission-line branch yields modest raw correlations with sign-sensitive flux observables (H$\alpha$: $\rho = -0.098$, $N = 721$; H$\beta$: $\rho = -0.182$, $N = 805$; [OIII]: $\rho = -0.180$, $N = 822$). The live morphology branches are likewise heterogeneous across catalogs and control choices, but the preferred JADES DR5 direct-mass sample is now materially stronger than the legacy MUV-proxy branch: at $z > 7$, two half-light-radius proxies, Gini, and $\sigma_\star$ all retain the expected sign after mass+$z$ control, whereas the legacy DR4-spec × DR5 MUV-proxy branch is weaker and the supplementary GOODS-S morphology crossmatch still shows smaller raw size-type trends (for example FWHM $\rho = -0.176$, $N = 588$). These branches are informative and TEP-relevant, but they are not yet uniform enough across datasets and diagnostics to count as independent lines. Section §4.4.6.3 still shows that biased $M_{*,\rm obs}$ controls can suppress true TEP-linked residuals, so weak or null mass-controlled secondary diagnostics should not be over-interpreted as falsifications.
- No single public IFU kinematic catalog exceeds $N \sim 40$ at $z > 4$. The combined sigma-based kinematic expansion ($N = 83$, $z = 1.2$–$7.6$, six surveys) provides a significant mass-circularity-breaking test: a sigma-only $\Gamma_t$ with zero SED dependence adds predictive power for $M_{*,\rm obs}$ beyond $\sigma$ and $z$ (partial $\rho = +0.326$, $p = 0.003$; $z \geq 4$: $\rho = +0.326$, $p = 0.014$, $N = 56$). This materially narrows the mass-proxy degeneracy but does not fully resolve it: the individual survey subsamples remain small, and a single homogeneous IFU campaign at $z > 7$ with $N > 100$ would provide the definitive closure.
4.11.1 Evidence Base Independence
The 13-paper TEP series (Papers 1–12 and this work) is entirely single-author and none of the prior papers have undergone external peer review at a refereed journal (all are published on Zenodo with DOIs). The "cross-domain consistency" described in §4.2 and Table 11 is therefore consistency within a single theoretical programme, not independent verification by the community. The external Cepheid prior from Paper 12 is $\alpha_0 = 0.58 \pm 0.16$, derived from $N = 29$ Cepheid hosts with $p = 0.019$ ($2.3\sigma$) — significant but not overwhelming in isolation. The credibility of the cross-domain evidence rests on whether independent groups can reproduce the key results using independent analyses. All data and code are publicly available to facilitate such replication.
Furthermore, the $\Gamma_t$ formula (§2.3.2.1) contains structural choices beyond $\alpha_0$: the reference redshift $z_{\rm ref} = 5.5$, the reference halo mass $\log M_{h,\rm ref} = 12.0$, the exponential functional form, and the $\sqrt{1+z}$ scaling. While each has physical motivation from the scalar-tensor framework (Paper 1), these choices collectively shape the predictions. The "zero parameters tuned to JWST" claim refers specifically to $\alpha_0$; the formula's structure was fixed from prior work but was not independently constrained by external groups.
4.12 Limitations and Caveats
Following the claim hierarchy of Paper 6 (TEP-GTE), the limitations below are organised by tier. Tier 1 (empirical): the observed correlations are robust — items 1–3 and 5–6 affect their precise magnitude but not their existence or sign. Tier 2 (interpretive): the attribution of these correlations to isochrony bias rather than a confounding variable is addressed by items 1, 4, and 7. Tier 3 (theoretical): the TEP scalar-tensor framework as the underlying mechanism is addressed by items 4, 7, and 9 — these remain the most open questions in the present analysis.
- Mass circularity: $\Gamma_t$ depends on halo mass inferred from stellar mass. Several distinct tests mitigate this concern (§4.4.6), spanning four data types. Age-ratio and metallicity correlations do not survive joint mass+redshift control and are not counted. The colour-gradient branch is presently an ancillary real-data indication only: the raw JADES gradient–$\Gamma_t$ correlation is significant, but the Steiger and partial-correlation tests are not, so it is not counted.
- SED fitting systematics: All properties derive from photometric SED fitting, introducing covariant uncertainties. Photo-$z$ scatter degrades $\rho$ by $< 2\%$. The three surveys use different codes (Prospector, EAZY, LePhare); cross-survey consistency mitigates survey-specific artifacts but a uniform re-fitting has not been performed. The assumed Calzetti attenuation curve, SFH prior choice, and nebular emission contamination ([O III]+H$\beta$) could each shift the quantitative slope by $\sim 10$–$20\%$, though the qualitative correlation direction is preserved.
- Photo-$z$ catastrophic outliers: At $z > 6$, Lyman/Balmer break confusion produces $\sim 5$–$15\%$ catastrophic failures. Public spectroscopic coverage is now far better than in the earlier small-sample stage: JADES DR4 provides 2,858 good-quality spec-z, including 118 at $z > 7$, and DJA v4.4 contributes 19,445 grade-$\ge 3$ sources, including 698 at $z > 7$ and 234 at $z > 8$. Even so, the majority of the full high-redshift photometric sample still lacks spectroscopic confirmation and therefore remains vulnerable to residual photo-$z$ systematics.
- Theoretical foundation: The $\Gamma_t$ formula derives from a scalar-tensor action with chameleon screening (Appendix A.1). A full CAMB Boltzmann integration (Appendix A.1.8.8) confirms $\sigma_8$ consistency at the fiducial chameleon mass: $\sigma_8^{\rm TEP} = 0.8116$ ($0.10\sigma$ from Planck), with CMB TT deviations $< 0.02\%$ at all $\ell < 2500$ and $\chi^2/{\rm dof} \ll 1$ against Planck error bars. Planck consistency requires $m_{\phi,0} \gtrsim 0.2\,h$/Mpc ($\lambda_C \lesssim 31\,h^{-1}$ Mpc). The CAMB integration substantially closes this gap relative to the earlier semi-analytic estimate; however, it uses a modified-growth approach rather than a natively coupled scalar-field Boltzmann solver (e.g., hi_class with the full chameleon sector). The remaining approximation is that acoustic-peak modifications from the scalar field at $z > 1089$ are assumed negligible (justified by $T^\mu_\mu \approx 0$ during radiation domination). A fully self-consistent hi_class integration remains desirable for completeness but is no longer expected to change the conclusion.
- Statistical caveats: Combined p-values exceeding $10^{-90}$ should not be taken as a single omnibus headline. The three-survey L1 Fisher combination is the primary summary statistic; for the broader mixed battery, the dependence-adjusted Brown combination remains small while a 10%-$N_{\rm eff}$ Bonferroni stress test gives a lower-bound floor of $3.3\sigma$. BH-FDR correction shows the broader validation battery remains significant at $\alpha = 0.05$ (7 of 8 tested signatures, including the two not-counted checks). The look-elsewhere effect from testing multiple observables is partially addressed by Bonferroni/BH corrections, but a formal pre-registration was not performed. All null results are reported publicly.
- Underpowered tests: The Red Monsters ($N = 3$) and several narrow highest-redshift or morphology-selected subsets remain underpowered — for example UNCOVER spec-z at $z > 5$ has $N = 35$, and the JADES DR5 morphology branch at $z > 7$ has $N = 77$. These subsets are excluded from the primary combined significance.
- $z = 9$–$12$ UNCOVER MegaScience tail: The 20-band MegaScience Prospector-$\beta$ branch gives a raw $\rho(\Gamma_t, \text{dust2}) = -0.001$ ($p = 0.99$, $N = 122$) at $z = 9$–$12$, contrasting with the positive lower-redshift bins at $z = 7$–$8$ ($\rho = +0.388$, $N = 129$) and $z = 8$–$9$ ($\rho = +0.492$, $N = 66$), and with the COSMOS2025 blank-field raw dust trend. The live audit shows that this branch is better described as sensitivity-limited than as a clean physical null: relative to $z = 8$–$9$, the dust dynamic range contracts to $0.657\times$, the median dust uncertainty grows by $1.32\times$, and the relative redshift uncertainty by $3.97\times$, while sample size does not collapse. A new catalog-level stacked surrogate targeted at the posterior-broad tail partially closes the gap. Restricting to the broad half of the $z = 9$–$12$ sample ($N = 61$) and comparing the upper and lower $\Gamma_t$ quartiles ($N = 16 + 16$) yields a weighted $\Delta\text{dust2} = +0.249$ with 95% CI $[+0.032, +0.468]$, together with redder rest-frame colours $\Delta(U-V) = +0.341$ and $\Delta(V-J) = +0.335$, both with positive bootstrap intervals. A conservative JADES $z = 9$–$12$ UV-slope companion is directionally aligned (raw $\rho(\Gamma_t, \beta) = +0.259$, $p = 0.18$; weighted $\Delta\beta = +0.941$, $N = 28$) but remains underpowered. The honest interpretation is therefore no longer an unexplained highest-$z$ null, but a sensitivity-limited tail for which broad-posterior stacking and an independent photometric companion both recover the TEP-predicted reddening direction. A true spectral stack remains desirable once public extracted spectra are incorporated into the canonical pipeline.
- Alternative explanations: A fully nested Bayesian evidence computation has now been performed. Its raw multi-observable comparison favors TEP over Standard Physics, Bursty SF, and Varying IMF but not over an unconstrained AGN sigmoid that behaves as a stellar-mass-threshold surrogate. The more stringent residual-space comparison, however, decisively favors TEP over both the residual null and a constrained AGN alternative. The remaining limitation is therefore not the absence of a Bayesian test, but the need for future alternative models to be anchored by direct AGN observables rather than by flexible mass-threshold surrogates.
- Coupling constant uncertainty: The primary external calibration is $\alpha_0 = 0.58 \pm 0.16$ ($\sim 28\%$ relative). Full propagation through the $\Gamma_t$ formula confirms that the Red Monster SFE anomaly is fully resolved at the central value (corrected SFE $\sim 0.11$, well below the $\Lambda$CDM limit of 0.20; Table 3b). The correction is robust to the lower bound: even at $\alpha_0 = 0.42$, the corrected SFE remains well below 0.20 with zero tuned parameters. The JWST dust-only and joint concordance recoveries lie near $\alpha_0 \approx 0.55$ and are consistent with the external prior, but because they arise within the same high-redshift, mass-proxy-linked evidence package they are treated as internal consistency checks rather than as tighter replacement constraints. An earlier result of $0.60 \pm 0.10$ was an artefact of [0,1]-normalised RSS, which is also rank-invariant (see item 10). Table 3b uses representative parameters, not exact catalog values.
- Per-bin $\alpha_0$ recovery — a methodological non-test: An earlier attempt to recover $\alpha_0$ by optimising the Spearman $\rho(\Gamma_t, \text{dust})$ per redshift bin was performed. The optimizer hits the grid floor (0.1) in every bin, yielding an apparent tension with the Cepheid value. This is a mathematical artefact, not a physical failure. $\Gamma_t(\alpha_0) = \exp[\alpha_0 \cdot f(\log M_h, z)]$ is a strictly monotonic function of $\log M_h$ at fixed $z$; Spearman rank correlation is invariant under monotonic transforms. Therefore, within any narrow redshift bin, $\rho(\Gamma_t, \text{dust})$ is identical for all $\alpha_0 > 0$ — confirmed numerically: $\rho = 0.6458$ for every value of $\alpha_0 \in \{0.1, 0.2, 0.4, 0.55, 0.8, 1.0, 1.2, 1.5\}$ in the $z = 8.5$–$10$ bin. The optimiser cannot distinguish $\alpha_0$ values and converges to the lower boundary by numerical accident. The apparent "$2.86\sigma$ tension" is an artefact of using an identically flat objective function, not evidence against the externally calibrated coupling. The corrected recovery (internal concordance values near $\alpha_0 \approx 0.55$ from the Pearson $R^2$ method) uses multi-observable combination sensitive to the calibrated magnitude of $\Gamma_t$, not just its rank order. The earlier result ($0.60 \pm 0.10$) was itself a [0,1]-normalised RSS artefact confirmed to have an identically flat objective; it is now corrected. Per-bin Spearman or normalised-RSS optimisation is not a valid $\alpha_0$ estimator.
4.13 Falsification Regimes
4.13.1 Critical Test: The Mass-Dust Inversion
Falsification: If sufficiently large JWST/MIRI samples establish a persistent lack of correlation ($\rho(M_*, A_V) \approx 0$) at $z > 8$ under rigorous selection control, the TEP prediction of emergence is ruled out.
4.13.2 Critical Test: The Coupling Constant
Falsification: If fitting the $z > 8$ dust anomaly with higher-resolution spectroscopic data consistently requires $\alpha_0 > 1.2$ or $\alpha_0 < 0.2$, the cross-domain consistency with Cepheids is severely challenged.
4.13.3 Critical Test: The Black Hole Boost
Falsification: If deep X-ray stacking of LRDs reveals luminosities consistent with $\dot{M} > 3 \dot{M}_{\rm Edd}$, the TEP mechanism is insufficient.
4.13.4 Gravitational Wave Timing: LISA and Binary Pulsars
TEP makes three testable predictions for gravitational wave observations:
- LISA EMRIs: Extreme mass-ratio inspirals probe the $\Gamma_t$ field near massive black holes. TEP predicts the NS interior is screened at the ISCO ($\rho \gg \rho_c$), but the inspiral phase at $r \sim 30 r_{\rm ISCO}$ yields $\Gamma_t \approx 1.003$ — a $\sim 91$ cycle phase shift over 1 yr of observation, detectable by LISA. Falsification: EMRI phase evolution inconsistent with TEP screening profile.
- Binary pulsars: The Hulse-Taylor system agrees with GR to $0.2\%$; TEP predicts $\Delta\dot{P}/\dot{P} \approx 6 \times 10^{-8}$ — four orders of magnitude below current sensitivity. TEP is compatible with all existing binary pulsar constraints.
- Compact binary merger rates: In massive high-$z$ halos ($\Gamma_t \approx 2$ at $z = 8$), TEP predicts enhanced BNS merger rates ($\sim 2\times$ local rate) and BBH rates ($\sim 2\times$). Falsification: no redshift evolution of merger rates in massive hosts detected by Einstein Telescope or Cosmic Explorer.
4.13.5 Predictions in Wider Survey Regimes
Several theoretical predictions extend beyond the present JWST sample and define additional falsification opportunities in wider survey regimes:
- Euclid Wide ($N \sim 300{,}000$ massive galaxies, $z = 0.9$–$1.8$): Typical $\Gamma_t \approx 1.25$ predicts a 25% age offset at fixed $z$. Combined sensitivity reaches $\rho_{\rm min} = 0.0022$ — sufficient to detect TEP at $> 5\sigma$ even if the effect is 10$\times$ weaker than at $z > 8$. Key falsification: no mass-dependent age offset at $z \sim 1.5$.
- Roman Supernova Survey ($N \sim 2{,}700$ SNe Ia, $z < 1.7$): TEP predicts a $1.28\times$ SN Ia rate enhancement in massive hosts ($\Gamma_t \approx 1.28$) and a strengthening Ia/CC ratio with host $\Gamma_t$ (§4.7.1). Key falsification: no host mass dependence in SN rates at $z > 1$.
- Roman High-Latitude ($N \sim 500{,}000$ at $z > 2.5$): Tests the gas vs. stellar metallicity discriminant (§4.7.2) and morphology–$\Gamma_t$ correlation. Key falsification: strong [O III]/H$\beta$–$\Gamma_t$ correlation.
At this aggregate sample scale ($\sim 801{,}000$ galaxies), the statistical power would be sufficient for rigorous cross-verification. Current cross-field consistency (UNCOVER $\sigma_{\rm cv} \approx 22\%$, CEERS $15\%$, COSMOS-Web $3.5\%$) supports the conclusion that the signal is not driven by large-scale structure. Full theoretical predictions are detailed in Appendix C.5.
4.13.6 A Critical Experiment: Synchronization Holonomy
All studies testing the TEP framework are ultimately falsifiable by a single class of experiment that no current precision test has performed: a closed-loop, direction-reversing, one-way time-transfer test targeting the synchronization holonomy $H \equiv \oint_C d\tau_{\rm prop}$. Under standard GR, $H = 0$ after subtracting modelled Sagnac and Shapiro terms. Under TEP, $H \neq 0$ if the disformal coupling $B(\phi) \neq 0$, with a predicted amplitude:
where $\mathcal{A}$ is the loop area. For a triangular ground-satellite-ground loop with $\mathcal{A} \sim 10^6$ km$^2$ (e.g., two ground stations and one MEO satellite), the predicted holonomy is $H_{\rm resid} \sim 10^{-19}$ s — at the frontier of current optical clock technology but achievable with next-generation transportable optical lattice clocks (Lisdat et al. 2016; Grotti et al. 2018). Three experimental configurations are ranked by discriminating power:
- Tier 1 (Decisive): Closed triangular time-transfer loop with three optical clocks at $\sim 1{,}000$ km separation, targeting $H_{\rm resid}$ at $10^{-19}$ s after GR subtraction. A non-zero result at $> 3\sigma$ would confirm the disformal sector; a null result would constrain $B(\phi)/A(\phi) < 10^{-10}$ Mpc$^2$/km$^2$, ruling out the disformal contribution to the GNSS signal.
- Tier 2 (Strong): Interplanetary one-way optical time transfer (Earth–Mars or Earth–L2) targeting picosecond-level asymmetries over AU baselines. Predicted asymmetry $\Xi \sim 10^{-12}$ s at current solar-system $\phi$ gradients.
- Tier 3 (Confirmatory): Roman/Euclid population statistics ($N > 800{,}000$; §4.13.5 above) — these test the conformal sector ($A(\phi)$, which governs $\Gamma_t$) independently of the disformal sector. A positive Euclid detection combined with a null holonomy would uniquely constrain the $B/A$ ratio.
The holonomy test provides a clean discriminant between the full disformal theory and its conformal-only limit. Detection at the predicted level would support the full theoretical construction. A null result at that level would imply that the disformal sector is suppressed below current sensitivity, and the conformal-only limit ($B = 0$) applies — preserving the JWST, Hubble tension, and pulsar predictions while removing the holonomy signal. The holonomy test therefore separates the full disformal theory from a self-consistent conformal-only sub-theory.
4.14 Discriminant Tests
While the current evidence base is robust ($> 6\sigma$), the mass-proxy degeneracy remains a formal logical possibility until broken by kinematic or environmental discriminants that reverse the sign of the prediction. Two critical discriminants are especially valuable because TEP and standard physics predict opposite signs, making the distinction comparatively insensitive to calibration systematics. These tests are now partially executable with the current local data, although neither yet reaches the decisive regime that a dedicated spectroscopic campaign could provide.
4.14.1 The Protocluster Switch
Standard hierarchical assembly predicts that galaxies in dense environments (protoclusters) form earlier and evolve faster than field galaxies of the same mass ("downsizing"). TEP predicts the opposite: dense environments exceed the screening threshold density $\rho_c$, suppressing the scalar field ($\Gamma_t \to 1$), while field galaxies remain unscreened ($\Gamma_t > 1$).
- Standard Prediction: Cluster galaxies form earlier and are older.
- TEP Prediction: Cluster Age $<$ Field Age (Negative Environmental Gradient).
This sign flip is immune to mass-proxy arguments because halo mass is controlled. A live implementation using the reproducible DJA NIRSpec merged table and the local JADES spectroscopic-age companion does not yet recover a clean reversal. The primary $5 \le z < 8$ DJA test uses within-footprint overdensity estimates and UV-slope $\beta$ as the available age proxy. After matching on stellar mass, redshift, and footprint, the dense-minus-field contrast is $\Delta\beta_{\rm resid} = +0.049$ with 95% CI $[-0.040, +0.141]$; the interval crosses zero and the sign is not in the TEP direction. The raw $\beta$ contrast is essentially null ($\Delta\beta = -0.004$), while the low-$N$ JADES spectroscopic-age companion is too small to decide the issue. The correct conclusion is therefore mixed rather than negative in a strong sense: with existing local data, the protocluster-switch test has now been executed, but it does not yet produce the decisive dense-younger-than-field reversal that would strongly favor TEP.
4.14.2 The IFU Gradient Inversion
Standard inside-out growth predicts galaxies form their cores first, leading to older, redder centers and younger, bluer outskirts. TEP predicts Core Screening: the deep potential of the central bulge screens the scalar field locally ($\Gamma_t \to 1$), restoring standard time, while the lower-density disk/outskirts remain enhanced ($\Gamma_t > 1$), accelerating their apparent aging.
- Standard Prediction: Red Core, Blue Outskirts (Positive Color Gradient in the current pipeline convention).
- TEP Prediction: Blue Core, Red Outskirts (Negative Color Gradient in the current pipeline convention).
The JADES resolved photometry (L2) therefore remains informative, but the corrected live discriminant is now best read as a two-part result. Under the current step-139 convention, negative colour gradients correspond to the TEP-sign bluer-core direction. After correcting $\Gamma_t$ to use direct halo masses from the local physical catalog when available, the sample contains a small literal $\Gamma_t > 1$ tail ($18/277$ galaxies). That literal sign-reversal test is still underpowered and non-decisive: after mass+$z$ residualization, the negative-gradient fraction is $0.500$ in the $\Gamma_t > 1$ tail versus $0.533$ in the $\Gamma_t < 1$ branch (Fisher $p = 0.70$). However, once the known mass-bias over-control is handled using the live L4-motivated debiased mass correction, the broader q33/q67 screening split becomes directionally supportive: the negative-gradient fraction rises from $0.457$ to $0.581$ (Fisher $p = 0.061$) with mean residual contrast $\Delta = -0.055$. Combined with the strong direct-mass DR5 morphology support, the resolved branch now reads as a coherent ancillary spatial indication rather than as a gradient-only null. A true IFU measurement of Balmer absorption or D4000 remains the cleanest next step.
4.14.3 Pre-registration Commitment
To ensure the statistical validity of these tests, robust discrimination between the Protocluster Switch and IFU Gradient Inversion depends on pre-registered analysis protocols before any new high-resolution dataset is inspected. Such pre-registration defines the exact sample selection criteria, density estimators, and spectral fitting windows, preventing post-hoc parameter tuning. This adherence to open-science best practices keeps the resulting discrimination between TEP and $\Lambda$CDM robust against "researcher degrees of freedom."
4.15 Conclusion of the Discussion
The TEP framework offers a unified, zero-parameter account of the coherent set of anomalies observed by JWST at $z > 5$. By challenging the isochrony axiom—an assumption, not a measurement—it addresses the "anomalous" galaxy problem, explains the "Little Red Dot" black hole population, and predicts the dust and sSFR trends that standard models struggle to reproduce jointly. While mass-proxy degeneracies prevent a strictly mathematical certainty using current photometry alone, the convergence of the two primary empirical lines together with the ancillary spatial indication, the derived regime-level dynamical-mass comparison, the cross-domain consistency with $H_0$ and pulsars, and the existence of clear, sign-reversing falsification tests make TEP a parsimonious and predictive candidate for physics beyond $\Lambda$CDM in the high-redshift universe.
5. Conclusion
JWST has revealed a coherent pattern of anomalies at $z > 5$: ultra-massive galaxies with star formation efficiencies exceeding $\Lambda$CDM limits, overmassive black holes in Little Red Dots, and stellar masses that exceed dynamical masses. What links these anomalies is not merely that they are surprising, but that they all cluster in the deepest gravitational potentials and all point in the same direction — photometrically inferred stellar properties appear too large and too early. This work tested whether a single violation of the isochrony axiom, encoded by the chameleon-screened Temporal Equivalence Principle (TEP), can account for that shared structure. Using the external Cepheid prior $\alpha_0 = 0.58 \pm 0.16$ with no JWST retuning, the framework reproduces the scale of the Red Monster efficiency excess, yields a regime-level reconciliation of the $M_*/M_{\rm dyn}$ anomaly, and provides a kinematic route to rapid black-hole growth in Little Red Dots.
5.1 Synthesis of Results
The core empirical case consists of two primary empirical lines, one ancillary spatial indication, one derived regime-level comparison, and one direct kinematic test. The primary large-sample lines are the dust–$\Gamma_t$ / AGB-threshold result (L1) and the mass–sSFR inversion (L3). The SUSPENSE kinematic comparison (L5) supplies the strongest direct test of the photometric mass-proxy circularity objection. Evaluating $\Gamma_t$ derived from dynamically measured masses ($M_{\rm dyn}$) against spectral ages in $N=15$ quiescent galaxies at $z=1.2$–$2.3$ demonstrates that temporal evolution tracks the depth of the gravitational potential more closely than photometric stellar mass at fixed redshift ($\rho = +0.733$, $p=1.9 \times 10^{-3}$, versus $\rho = +0.493$, $p=0.062$). The strongest feature is the one-sided conditional asymmetry: $\Gamma_t$ retains significant residual age information after $M_*$+$z$ control ($\rho = +0.624$, $p=0.0129$), whereas $M_*$ contributes no residual signal once $\Gamma_t$+$z$ are controlled ($\rho = -0.036$, $p=0.898$). A sigma-based kinematic sample ($N = 83$, six surveys, $z = 1.2$–$7.6$) breaks the mass-proxy circularity directly by computing $\Gamma_t$ from velocity dispersion alone with zero SED dependence; the TEP-specific functional form adds significant predictive power for $M_{*,\rm obs}$ beyond $\sigma$ and $z$ individually (partial $\rho = +0.326$, $p = 0.003$). The ancillary spatial indication (L2) supplies strong controlled structural support via the preferred JADES DR5 direct-mass morphology sample, while L4 provides a derived regime-level reconciliation of the dynamical-mass tension.
Follow-up branches strengthen this hierarchy. COSMOS2025 adds blank-field support in dust, with sSFR support in the matched $z = 8$–9 bin and selection-sensitivity in the ultrahigh-$z$ tail. UNCOVER DR4 MegaScience recovers the expected emergence above $z \sim 7$; the apparent $z = 9$–12 weakness reflects a sensitivity-limited tail with compressed dust posteriors and inflated redshift uncertainties. JADES DR5 supplies direct-mass morphology support in which two size proxies, Gini, and $\sigma_\star$ all remain supportive after mass+$z$ control. The DJA-based GOODS-S and Balmer analyses remain supplementary spectroscopic checks.
5.2 Interpretative Framework
The interpretive claim is direct. Physical processes require proper time. Standard inference assumes that proper time and cosmic time can be identified. The JWST anomalies are what appear when that identification fails in deep potentials. Under TEP, time behaves as a local field, and a single parameter ($\Gamma_t$) propagates coherently through stellar ages, mass-to-light ratios, dust buildup, star-formation diagnostics, and dynamical-mass comparisons.
This empirical case moves beyond generalized mass correlations through two especially important lines of proof:
- Breaking the photometric mass proxy: The L5 kinematic test breaks the photometric mass-circularity objection. The SUSPENSE age-based comparison ($N = 15$) shows that the dynamical predictor retains residual age information after $M_*$+$z$ control, whereas $M_*$ does not after $\Gamma_t$+$z$ control. The sigma-based kinematic sample ($N = 83$, six surveys) breaks the circularity directly: a $\Gamma_t$ computed from velocity dispersion alone, with zero SED dependence, adds significant predictive power for $M_{*,\rm obs}$ beyond $\sigma$ and $z$ (partial $\rho = +0.326$, $p = 0.003$). The federated direct-kinematic package comprises three counted supportive branches.
- Physical thresholds: The secondary dust branch recovers a sharp AGB-onset threshold at $t_{\rm eff} \approx 0.3$ Gyr, beating mass-matched models. It tracks a specific stellar-evolution physical clock rather than a generalized proxy.
The explicit alternative-model comparison points in the same direction. A nested Bayesian test favors TEP over Standard Physics, Bursty SF, and Varying IMF in raw joint space, and after linear mass+$z$ structure is removed it decisively favors TEP over both the residual null and a constrained AGN alternative. The strongest Bayesian support therefore sits on the part of the signal that survives the anti-circularity control, not on the raw mass-correlated component alone.
The cross-domain consistency of $\alpha_0$ remains a major feature of the evidence base. The local Cepheid analysis provides the external prior $\alpha_0 = 0.58 \pm 0.16$ from 29 hosts at $z \approx 0$, while the JWST high-redshift observables recover mutually concordant values near $\alpha_0 \approx 0.55$. Because those high-redshift recoveries arise within the same mass-proxy-linked programme, they are best interpreted as self-consistency checks across observables rather than as replacement calibrations or formal exclusion probabilities for every alternative systematic.
Key signatures survive a 0.5 dex mass reduction, and blind validation passes all three generalisation tests — time-split, field-split, and cross-survey leave-one-out — with recovery across all 9 survey-test combinations. Each of the three independent JWST surveys confirms the dust relation individually above $5\sigma$, and all three independently confirm that $t_{\rm eff}$ outperforms $t_{\rm cosmic}$ at $>5\sigma$ (Steiger $Z = 5.3$–$16.8$), ruling out pure redshift ordering. A Fisher combination across the three photometric datasets gives $z = 24.4\sigma$ ($p = 2.8 \times 10^{-132}$). Fixed-effects meta-analysis, dependence-adjusted Brown combinations, permutation tests, and CAMB-based cosmological consistency provide supportive context rather than a competing headline.
The main remaining limitations (§4.11–§4.12) are the self-referential evidence base and the fact that some secondary morphology and emission-line diagnostics still weaken under biased mass control. Those caveats matter, but they now sit within a manuscript whose evidential spine is coherent: the two large-sample primary lines (L1 and L3), together with a direct kinematic comparison whose strongest feature is a one-sided conditional asymmetry that materially narrows the photometric mass-circularity objection. The ancillary screening branch remains supportive rather than decisive, and the dynamical-mass branch remains a derived regime-level comparison rather than a primary empirical line.
5.3 Falsification Criteria
TEP makes specific, quantitative predictions that can be tested against additional data in the same observables and in wider survey regimes. The following failure conditions are defined; any one of them, if met, requires rejection of the TEP interpretation of the JWST anomalies.
| Observable | Standard Physics Prediction | TEP Prediction | Falsification Criteria |
|---|---|---|---|
| Mass-Dust ($z > 8$) | No correlation or Negative | Strong Positive ($\rho > 0.4$) | $\rho \approx 0$ or Negative |
| Balmer Absorption | Correlates with $z$ | Correlates with $M_*$ at fixed $z$ | No mass trend |
| LRD Host Size | No dependence | Only in Compact ($r_e < 1$ kpc) | LRDs in large disks |
| Cluster vs Field | Cluster galaxies older | Cluster galaxies younger (screened) | Field $\approx$ Cluster or Field < Cluster (Standard) |
| [OIII]/H$\beta$ vs $\Gamma_t$ | Correlates with mass | Weak correlation ($\rho < 0.1$) | Strong correlation ($\rho > 0.3$) |
| Radial Age Gradient | Inside-out (Red Core, Blue Out) | Core Screening (Blue Core, Red Out) | Standard inside-out gradients in massive high-z galaxies |
| Time-Lens Ordering | Properties track $z_{\rm obs}$ | Properties track $z_{\rm eff}$ | Correlation with $z_{\rm obs}$ is stronger than with $z_{\rm eff}$ |
| Type Ia / CC SN Ratio | No $\Gamma_t$ dependence | Ia/CC $\propto \Gamma_t^{0.5}$; Ia rate $4.4\times$ enhanced in high-$\Gamma_t$ hosts; CC rate unchanged ($\rho \approx 0$) | Significant deviation from predicted scaling across $\Gamma_t$ at fixed $M_*$ (e.g. Roman Space Telescope High-Latitude Time Domain Survey) |
| $\alpha_0$ Recovery | N/A | $\alpha_0 \approx 0.58$ (external prior); JWST internal recovery $\approx 0.548$ | Independent recovery yielding $\alpha_0 > 1.2$ or $\alpha_0 < 0.2$ |
5.4 Reproducibility
This analysis is designed to be fully reproducible. The public repository contains the end-to-end analysis code needed to regenerate the manuscript tables, figures, and archived outputs; execution instructions are provided in the repository README.
5.5 Data Availability
The manuscript source, complete analysis code, generated figures, intermediate outputs, and the raw and processed catalogs are available on GitHub and archived on Zenodo for long-term reproducibility.
- Analysis repository: github.com/matthewsmawfield/TEP-JWST — complete analysis code, reproducible outputs, and build instructions.
- Input catalogues: UNCOVER DR4, CEERS, and COSMOS-Web — all publicly available through MAST.
- Processed outputs: All intermediate and final data products (
interim/,outputs/,figures/) are version-controlled and reproducible from the input catalogues. - Documentation:
README.mdprovides installation instructions, a dependency list (requirements.txt), and a quick-start guide.
The full TEP theoretical framework series (Papers 1–12) is available on Zenodo; DOIs are listed in the References. Key identifiers: Paper 1 — TEP foundation (10.5281/zenodo.18204190); Papers 2–4 — GTE (10.5281/zenodo.17127229); Paper 5 — GL (10.5281/zenodo.17982540); Paper 6 — GTE synthesis (10.5281/zenodo.18004832); Paper 7 — UCD (10.5281/zenodo.18064366); Paper 8 — RBH-1 (10.5281/zenodo.18059251); Paper 9 — SLR (10.5281/zenodo.18064582); Paper 10 — EXP (10.5281/zenodo.18109761); Paper 11 — COS (10.5281/zenodo.18165798); Paper 12 — H₀ (10.5281/zenodo.18209703).
References
Abbott, B. P., et al. 2017, ApJL, 848, L13. Gravitational Waves and Gamma-Rays from a Binary Neutron Star Merger: GW170817 and GRB 170817A.
Arrabal Haro, A., et al. 2023, Nature, 622, 707. Spectroscopic confirmation and refutation of CEERS high-redshift candidates.
Behroozi, P., Wechsler, R. H., Hearin, A. P., & Conroy, C. 2019, MNRAS, 488, 3143. UNIVERSEMACHINE: The correlation between galaxy growth and dark matter halo assembly from z = 0−10.
Berg, D. A., et al. 2013, ApJ, 775, 93. New Detections of C/O Abundance Ratios in Metal-Poor Dwarf Galaxies.
Boyer, M. L., et al. 2025, ApJ, 991, 24. Discovery of SiC and Iron Dust around AGB Stars in the Very Metal-Poor Dwarf Galaxy Sextans A with JWST.
Bertotti, B., Iess, L., & Tortora, P. 2003, Nature, 425, 374. A test of general relativity using radio links with the Cassini spacecraft.
Boylan-Kolchin, M. 2023, Nature Astronomy, 7, 731. Stress testing ΛCDM with high-redshift galaxy candidates.
Brammer, G. B., van Dokkum, P. G., & Coppi, P. 2008, ApJ, 686, 1503. EAZY: A Fast, Public Photometric Redshift Code.
Brout, D., et al. 2022, ApJ, 938, 110. Type Ia supernova host-mass step measurements in Pantheon+.
Brax, P., van de Bruck, C., Davis, A.-C., Khoury, J., & Weltman, A. 2004, PhRvD, 70, 123518. Small scale structure formation in chameleon cosmology.
Bruzual, G. & Charlot, S. 2003, MNRAS, 344, 1000. Stellar population synthesis at the resolution of 2003.
Burrage, C. & Sakstein, J. 2018, Living Reviews in Relativity, 21, 1. Tests of Chameleon Gravity.
Carniani, S., et al. 2024, Nature, 633, 318. A shining cosmic dawn: spectroscopic confirmation of two luminous galaxies at z > 14.
Carnall, A. C., McLure, R. J., Dunlop, J. S., & Davé, R. 2018, MNRAS, 480, 4379. Inferring the star formation histories of massive quiescent galaxies with BAGPIPES.
Carnall, A. C., et al. 2023, Nature, 619, 716. A massive quiescent galaxy at redshift 4.658.
Chemerynska, I., Atek, H., et al. 2024, MNRAS, 531, 2615. JWST UNCOVER: The Overabundance of Ultraviolet-luminous Galaxies at z > 9.
Chworowsky, K., et al. 2025, arXiv:2509.07695. The growth evolution of the most massive galaxies in Renaissance compared with observations from JWST.
Claeyssens, A., et al. 2023, MNRAS, 520, 2162. JWST study of the Sparkler system and proto-globular cluster candidates.
Conroy, C., Gunn, J. E., & White, M. 2009, ApJ, 699, 486. The Propagation of Uncertainties in Stellar Population Synthesis Modeling.
Cox, T. J., et al. 2025, ApJS (in press). CEERS DR1 photometric and physical parameter catalog.
Curti, M., et al. 2023, MNRAS, 518, 425. Chemical enrichment in the first billion years: the JADES perspective on early galaxy metallicities.
Curtis-Lake, E., et al. 2023, Nature Astronomy, 7, 622. Spectroscopic confirmation of four metal-poor galaxies at z = 10.3–13.2.
D'Eugenio, F., et al. 2025, ApJS (in press). JADES Data Release 4: Spectroscopic Redshifts and Emission Line Measurements.
Danhaive, J., et al. 2025, arXiv:2503.21863. The dawn of disks: JWST/NIRCam grism kinematics of galaxies at z ~ 4–6.
de Graaff, A., et al. 2024a, A&A, 684, A87. Ionised gas kinematics and dynamical masses of z ≳ 6 galaxies from JADES/NIRSpec high-resolution spectroscopy.
de Graaff, A., et al. 2024b, Nature, 630, 846. A dormant overmassive black hole in the early Universe.
Endsley, R., et al. 2023, MNRAS, 524, 2312. A JWST/NIRCam Study of Key Contributors to Reionization: The Star-forming and Ionizing Properties of UV-faint z ∼ 7–8 Galaxies.
Esdaile, J., et al. 2021, ApJL, 908, L35. Massive Quiescent Galaxies at z ~ 3: A Comparison of Selection, Stellar Population, and Structural Properties with Simulation Predictions.
Eisenstein, D. J., et al. 2023, arXiv:2306.02465. Overview of the JWST Advanced Deep Extragalactic Survey (JADES).
Finkelstein, S. L., et al. 2023, ApJL, 946, L13. CEERS early release science survey overview.
Freedman, W. L., Madore, B. F., Hoyt, T. J., et al. 2024, arXiv:2408.06153. Status Report on the Chicago-Carnegie Hubble Program (CCHP).
Fujimoto, S., et al. 2023, ApJL, 949, L25. JWST/NIRSpec spectroscopic confirmation of z > 8 CEERS candidates.
Furtak, L. J., et al. 2023, MNRAS, 523, 4568. JWST UNCOVER: The Strong Lensing Model of Abell 2744.
Grotti, J., et al. 2018, Nature Physics, 14, 437. Geodesy and metrology with a transportable optical clock.
Greene, J. E., et al. 2024, ApJ, 964, 39. UNCOVER: The Growth of the First Massive Black Holes.
Hainline, K. N., et al. 2023, arXiv:2306.02468. The Cosmos in its Infancy: JADES Galaxy Candidates at z > 8 in GOODS-S and GOODS-N.
Heintz, K. E., et al. 2023, ApJL, 953, L10. Extreme Damped Lyman-α Absorption in Young Star-Forming Galaxies at z = 9–11.
Ilie, C., et al. 2025, PNAS. Supermassive Dark Star candidates seen by JWST.
Jiang, Y.-F., Stone, J. M., & Davis, S. W. 2019, ApJ, 880, 67. Super-Eddington Accretion Disks around Supermassive Black Holes.
Jin, B., et al. 2025, A&A, 698, A30. Spatially resolved colours and sizes of galaxies at z ~ 3–4.
Ju, M., et al. 2025, arXiv:2506.12129. A 13-Billion-Year View of Galaxy Growth: Metallicity Gradients.
Kelly, P. L., et al. 2010, ApJ, 715, 743. Host-galaxy mass step in Type Ia supernova distances.
Khoury, J. & Weltman, A. 2004, PhRvL, 93, 171104. Chameleon Fields: Awaiting Surprises for Tests of Gravity in Space.
Kawinwanichakij, L., et al. 2025, ApJ (in press). Environmental dependence of galaxy morphology at z = 3–4.
Kocevski, D. D., et al. 2023, ApJL, 954, L4. Hidden Little Monsters: Spectroscopic Identification of Low-Mass, Broad-Line AGN at z > 5 with CEERS.
King, A. R., Lasota, J.-P., & Kluzniak, W. 2023, MNRAS, 519, 5765. Super-Eddington accretion: models and applications.
Kodric, M., Riffeser, A., Seitz, S., et al. 2018, ApJ, 864, 59. Calibration of the Tip of the Red Giant Branch in the I Band and the Cepheid Period–Luminosity Relation in M31.
Kokorev, V., et al. 2024, arXiv:2401.09981. A Census of Photometrically Selected Little Red Dots at 4 < z < 9 in JWST Blank Fields. github.com/VasilyKokorev/lrd_phot
Larson, R. L., et al. 2023, ApJ, 953, 34. A CEERS Discovery of an Accreting Supermassive Black Hole 570 Myr after the Big Bang.
Labbé, I., et al. 2023, Nature, 616, 266. A population of red candidate massive galaxies ~600 Myr after the Big Bang. Data: github.com/ivolabbe/red-massive-candidates
Leja, J., et al. 2019, ApJ, 876, 3. How to Measure Galaxy Star Formation Histories. II. Nonparametric Models.
Lisdat, C., et al. 2016, Nature Communications, 7, 12443. A clock network for geodesy and fundamental science.
Li, Q., et al. 2025, MNRAS, 539, 1796. EPOCHS Paper X: Environmental effects on Galaxy Formation and Protocluster Galaxy candidates at 4.5 < z < 10.
Maiolino, R., et al. 2024, Nature, 627, 59. A small and vigorous black hole in the early Universe.
Matthee, J., et al. 2024, ApJ, 963, 129. Little Red Dots: An Abundant Population of Faint Active Galactic Nuclei at z ~ 5 Revealed by JWST.
Meng, X.-L., Rosenthal, R., & Rubin, D. B. 1992, Biometrika, 79, 425. Comparing correlated correlation coefficients.
Middleton, M. J., et al. 2015, MNRAS, 447, 3243. NuSTAR reveals extreme absorption in z = 2–3 type 2 quasars.
Mota, D. F. & Shaw, D. J. 2007, PhRvD, 75, 063501. Evading equivalence principle violations, cosmological, and other experimental constraints in scalar field theories with a strong coupling to matter.
Mowla, L., et al. 2022, ApJL, 937, L35. The Sparkler: Evolved High-Redshift Globular Cluster Candidates Captured by JWST.
Naidu, R. P., et al. 2022, ApJL, 940, L14. Two Remarkably Luminous Galaxy Candidates at z ≈ 10–12 Revealed by JWST.
Nanayakkara, T., et al. 2024, Science, 384, 890. A massive galaxy that was quenched by z ∼ 3.
Nedkova, K. V., et al. 2025, A&A. Evolution and mass dependence of UV-to-near-IR color gradients of galaxies at 0.5 < z < 2.5.
Nakajima, K., et al. 2023, ApJS, 269, 33. JWST Census for the Mass-Metallicity Star Formation Relation at z = 4–10.
Pérez-González, P. G., et al. 2024, ApJ, 968, 4. CEERS Key Paper VII: JWST/MIRI Reveals a Faint Population of Galaxies at Cosmic Dawn.
Planck Collaboration, Aghanim, N., et al. 2020, A&A, 641, A6. Planck 2018 results. VI. Cosmological parameters.
Price, S. H., et al. 2024, ApJ, 964, 73. UNCOVER: JWST spectroscopy of three cold brown dwarfs at kiloparsec-scale distances.
Rieke, M. J., et al. 2023, PASP, 135, 028001. JWST NIRCam Performance: Commissioning and Calibration.
Riess, A. G., et al. 2022, ApJL, 934, L7. A Comprehensive Measurement of the Local Value of the Hubble Constant with 1 km/s/Mpc Uncertainty from the Hubble Space Telescope and the SH0ES Team.
Saldana-Lopez, A., et al. 2025, arXiv:2501.17145. Feedback and dynamical masses in high-z galaxies: the advent of high-resolution NIRSpec spectroscopy.
Scholtz, J., et al. 2025, A&A (in press). JADES: Spectroscopic properties of faint AGN at z > 4.
Shamir, L. 2025, MNRAS, 538, 76. The distribution of galaxy rotation in JWST Advanced Deep Extragalactic Survey.
Slob, M., et al. 2025, A&A (in press). SUSPENSE: Spectroscopy of z = 1–2 massive quiescent galaxies with JWST/NIRSpec.
Shuntov, M., et al. 2025, ApJS (in press). COSMOS-Web DR1 / COSMOS2025 catalog.
Smawfield, M. L. 2025, Zenodo. Global Time Echoes: Empirical Validation of the Temporal Equivalence Principle via GNSS Timing Networks. DOIs: 10.5281/zenodo.17127229, 10.5281/zenodo.17517141, 10.5281/zenodo.17860166. Cited for GNSS correlation length $\lambda = 4{,}201 \pm 1{,}967$ km and 7 independent signatures (§4.2 Table 11).
Smawfield, M. L. 2025, Zenodo. Temporal-Spatial Coupling in Lensing: Phantom Mass and the Isochrony Axiom. DOI: 10.5281/zenodo.17982540. Cited for the lensing phantom mass mechanism and Earth–galaxy $r_V \propto M^{1/3}$ scaling (§4.2 Table 11).
Smawfield, M. L. 2025, Zenodo. Universal Critical Density: Unifying Atomic, Galactic, and Compact Object Scales. DOI: 10.5281/zenodo.18029598. Cited for derivation of the screening threshold $\rho_c \approx 20$ g/cm³ (§2.3.2.2, §4.2 Table 11).
Smawfield, M. L. 2025, Zenodo. The Soliton Wake: Identifying RBH-1 as a Gravitational Soliton. DOI: 10.5281/zenodo.18042456. Cited for geometric consistency of the RBH-1 velocity discontinuity (§4.10).
Smawfield, M. L. 2025, Zenodo. Global Time Echoes: Optical Validation via Satellite Laser Ranging. DOI: 10.5281/zenodo.18064582. Cited for independent optical-domain confirmation of GNSS correlation structure using LAGEOS-1/2 (§4.2 Table 11).
Smawfield, M. L. 2025, Zenodo. What Do Precision Tests of General Relativity Measure? DOI: 10.5281/zenodo.18109761. Cited for the formal taxonomy of GR test limitations and the conformal loophole argument (§4.2 Table 11, §4.4.7).
Smawfield, M. L. 2026, Zenodo preprint. Suppressed Density Scaling in Globular Cluster Pulsars. Not yet peer-reviewed; cited for the screening threshold $\sigma > 165$ km/s, binary inversion ($-0.31$ dex, $p = 0.01$), and COSMOGRAIL lensing constraint $|\Gamma| \leq 60$ days/decade (§2.3.2.2, §4.2 Table 11, §4.4.3).
Smawfield, M. L. 2026, Zenodo preprint. The Cepheid Bias: Resolving the Hubble Tension via Environment-Dependent Period-Luminosity Relations. Not yet peer-reviewed; cited for the external Cepheid prior on the coupling constant, $\alpha_0 = 0.58 \pm 0.16$, derived from SH0ES local distance-ladder host stratification (§2.3.2, §3.1, §4.3.6). In the present JWST analysis, the tighter recovery $\alpha_0 = 0.548 \pm 0.010$ is treated separately as an internal high-redshift concordance check rather than as the source calibration (§4.3.6, §4.12 item 9).
Song, M., et al. 2016, ApJ, 825, 5. The Evolution of the Galaxy Stellar Mass Function at z = 4–8.
Suess, K. A., et al. 2024, ApJL, 976, L21. UNCOVER: MegaScience Photometric Catalogs.
Sullivan, M., et al. 2010, MNRAS, 406, 782. Type Ia supernova host-galaxy correlations and the mass step.
Tanaka, M., et al. 2019, ApJL, 885, L34. Stellar Velocity Dispersion of a Massive Quenching Galaxy at z ∼ 4.
Taylor, A., et al. 2025, arXiv:2505.04609. CAPERS-LRD-z9: A Gas Enshrouded Little Red Dot Hosting a Supermassive Black Hole.
Tripodi, R., et al. 2025, Nature Communications. CANUCS-LRD-z8.6: A rapidly accreting overmassive black hole at z = 8.6.
van Dokkum, P., et al. 2025, ApJ (in press). A Candidate Runaway Supermassive Black Hole.
VandenBerg, D. A., et al. 2013, ApJ, 775, 134. Milky Way globular cluster ages.
Wang, B., et al. 2024, ApJS, 270, 12. UNCOVER DR4 stellar population synthesis catalog.
Weibel, A., et al. 2024, MNRAS, 533, 1808. Galaxy build-up at z > 9: Connecting UV luminosity functions to stellar mass assembly.
Xiao, M., et al. 2024, Nature, 635, 303. Three ultra-massive galaxies in the early Universe.
Yang, G., et al. 2025, ApJ (in press). DJA GOODS-S: Spectrophotometric Catalog of 7,325 Galaxies.
Appendix A: Theoretical Foundation
A.1 The TEP Action and Field Equations
The Temporal Equivalence Principle is formulated as a scalar-tensor theory with a two-metric structure. The complete Lagrangian density in the Einstein frame is:
where $\tilde{g}_{\mu\nu} = e^{2\beta\phi/M_{\rm Pl}} g_{\mu\nu}$ is the Jordan-frame metric to which matter couples, $\beta = \alpha_0$ is the universal coupling, the zero-parameter JWST predictions adopt the external Cepheid prior $\alpha_0 = 0.58 \pm 0.16$, $K(\phi) = 1$ (canonical kinetic term), and $V(\phi)$ is the scalar potential. The high-redshift concordance analyses later recover values near $\alpha_0 \approx 0.55$ as internal consistency checks.
The complete action in the Einstein frame is:
where the gravitational sector is:
the scalar field sector is:
and matter couples to the Jordan-frame metric:
The conformal factor is $A(\phi) = \exp(\beta\phi/M_{\rm Pl})$ with $\beta = \alpha_0$. The disformal term $B(\phi)$ is constrained by GW170817 to be negligible at late times.
A.1.1 Field Equations
Variation with respect to $g_{\mu\nu}$ yields the modified Einstein equations:
where the scalar field stress-energy is:
Variation with respect to $\phi$ yields the scalar field equation:
where $T^{(\rm matter)} = \tilde{g}^{\mu\nu} \tilde{T}_{\mu\nu}$ is the trace of the matter stress-energy tensor in the Jordan frame.
A.1.2 Screening Mechanism
TEP requires a screening mechanism to reconcile the halo-scale coupling ($\alpha_0 \approx 0.55$) with Solar System precision tests. Phenomenologically, the screening is characterized by a density-dependent soliton radius $R_{\text{sol}}$, which describes the spatial extent of the saturated scalar field configuration around a mass $M$:
where $\rho_c \approx 20$ g/cm³ is the critical saturation density. The screening factor quantifies the degree of suppression:
In dense environments ($\rho \gg \rho_c$), the screening factor $S \gg 1$, meaning the physical object is deeply embedded within the soliton core and GR is recovered. In diffuse environments ($\rho \ll \rho_c$), $S \ll 1$, and the scalar field modifies local proper time. This phenomenological hierarchy has been validated across 15 orders of magnitude in density, from gas giants to binary pulsars. The specific dynamical mechanism responsible for this screening (see §A.1.11) does not affect the empirical hierarchy or the predictions used in the main text.
A.1.3 PPN Parameters
In the unscreened limit, the Eddington PPN parameter is:
For the external Cepheid prior $\alpha_0 = 0.58$, this gives $|\gamma - 1| \approx 0.5$, which would violate Cassini bounds by four orders of magnitude. Screening reduces the effective coupling by the thin-shell factor $\Delta R/R \lesssim 10^{-6}$ for solar system bodies, bringing $|\gamma - 1|_{\rm eff} \lesssim 10^{-6}$ into compliance with observations.
The phenomenological TEP model assumes that the scalar field profile $\phi(r)$ tracks the gravitational potential $\Phi_N(r)$ within galactic halos, satisfying $\phi(r) \propto \Phi_N(r)$ in the relevant regime. To validate this assumption, a full numerical relativity simulation was performed solving the static spherical scalar field equation of motion:
for a standard NFW density profile. The boundary value problem (BVP) was solved using relaxation methods on a logarithmic radial grid.
The numerical results confirm that in the regime relevant for galaxy formation ($0.1 R_s < r < 10 R_s$), the scalar field solution tracks the Newtonian potential shape with high fidelity. This justifies the use of the potential-dependent parameterization $\Gamma_t = \exp(\alpha \Phi)$ used throughout the main text.
A.1.5 Screening Hierarchy Validation
The density-dependent screening hierarchy explains why precision tests of General Relativity show no deviation despite the large halo-scale coupling. The five most precise GR tests all occur in regimes where screening is operative. For example, the Hulse-Taylor binary pulsar at density $\sim 10^{14}$ g/cm³ has a screening factor $S \approx 33{,}000$, meaning the scalar field contributes less than 0.003% to orbital dynamics—consistent with GR verification to 0.2% precision. Conversely, Earth at $\rho \approx 5.5$ g/cm³ sits just below $\rho_c$ with $S \approx 0.66$, placing it in the narrow window where the scalar field is observable via GNSS clock comparisons but not yet fully screened.
A.1.6 Parameter Sensitivity: Red Monster Resolution
The SFE anomaly resolution remains significant ($> 30\%$) over a broad range of physically plausible couplings ($\alpha_0 \in [0.4, 0.8]$). At $\alpha_0 = 0.42$ (lower $1\sigma$ bound), the correction is $\sim 22\%$; at $\alpha_0 = 0.74$ (upper $1\sigma$ bound), the correction is $\sim 48\%$. The result is not a product of fine-tuning (see also §3.2).
A.1.6.1 Structural Assumptions and Priors in the TEP Formula
While the coupling constant $\alpha_0 = 0.58 \pm 0.16$ is drawn from the external Cepheid calibration (not tuned to JWST data), the $\Gamma_t$ formula itself contains fixed structural choices that act as implicit priors. These choices constrain the functional form of the temporal enhancement and deserve explicit acknowledgment:
| Component | Adopted Form | Physical Justification | Alternative Possibilities |
|---|---|---|---|
| Functional form | Exponential: $\Gamma_t = \exp(\alpha \Phi)$ | Conformal coupling $\tilde{g}_{\mu\nu} = A(\phi) g_{\mu\nu}$ with $A(\phi) = \exp(\beta\phi/M_{\rm Pl})$ implies proper time $d\tau \propto A(\phi)^{1/2} dt$ | Power-law, logarithmic, or other monotonic functions of potential depth |
| Redshift scaling | $\alpha(z) = \alpha_0 \sqrt{1+z}$ | Screening weakens at high redshift due to lower background density; $\sqrt{1+z}$ scaling emerges from soliton radius evolution with cosmic mean density | $(1+z)^n$ with $n \neq 0.5$, or constant $\alpha(z) = \alpha_0$ |
| Reference mass evolution | $\log M_{\rm h,ref}(z) = 12.0 - 1.5\log_{10}(1+z)$ | Maintains constant virial velocity (constant potential depth) as characteristic halo mass evolves; exponent 1.5 from $M \propto (1+z)^{-3/2}$ for fixed circular velocity | Fixed reference mass, or different evolutionary index |
| Reference redshift | $z_{\rm ref} = 5.5$ | Normalization epoch chosen near the midpoint of the JWST high-redshift sample; affects absolute scaling but not relative trends | Any $z_{\rm ref} \in [4, 8]$ would shift normalization but preserve rank ordering |
| Potential scaling | $\Phi \propto M^{2/3}$ | Virial theorem: $\Phi \sim M/R \sim M/M^{1/3} = M^{2/3}$ at fixed density | NFW-specific profile with concentration dependence |
Which choices are constrained by theory vs phenomenology?
- Exponential form: Directly derived from the conformal coupling in the TEP action (§A.1). This is not a free choice but a consequence of the two-metric structure.
- $\sqrt{1+z}$ scaling: Motivated by the density-dependent screening hierarchy (§A.1.2), but the exact exponent 0.5 is a simplification. The full screening evolution depends on the time-dependent background density and the scalar field equation of motion. Alternative scalings $(1+z)^n$ with $n \in [0.3, 0.7]$ would produce qualitatively similar results.
- Reference mass evolution: The $M \propto (1+z)^{-3/2}$ scaling maintains constant virial velocity, which is physically motivated but not uniquely determined. The exponent 1.5 could in principle vary between 1.0 (constant mass) and 2.0 (stronger evolution).
- Reference redshift $z_{\rm ref} = 5.5$: This is a normalization choice, not a physical parameter. Shifting $z_{\rm ref}$ rescales all $\Gamma_t$ values uniformly but does not affect rank correlations or relative comparisons.
- $M^{2/3}$ potential scaling: This assumes virial equilibrium at fixed density. Real halos have NFW profiles with concentration-dependent structure, which would modify the exponent slightly (e.g., $M^{0.6}$ to $M^{0.7}$).
How sensitive are the results to these choices? Sensitivity analyses (§3.2, §A.1.6) show that varying $\alpha_0$ over its $1\sigma$ range changes the Red Monster SFE correction from 22% to 48%, demonstrating robustness to the primary free parameter. The structural choices above are harder to vary independently because they are interconnected through the underlying scalar-tensor theory. However, order-of-magnitude estimates suggest:
- Replacing $\sqrt{1+z}$ with $(1+z)^{0.3}$ or $(1+z)^{0.7}$ changes high-redshift $\Gamma_t$ values by factors of $\sim 1.5$–$2$, preserving the qualitative hierarchy.
- Changing the reference mass evolution exponent from 1.5 to 1.0 or 2.0 shifts the mass-dependent gradient but does not eliminate the $\Gamma_t$–dust correlation.
- Adopting an NFW-specific potential with concentration dependence would modify individual $\Gamma_t$ values by $\sim 20$–$30\%$ but would not reverse the sign of any correlation.
Why this matters for intellectual honesty. The claim that TEP makes "zero-parameter predictions" refers to the fact that $\alpha_0$ is drawn from an external calibration (Cepheid distances), not tuned to JWST data. However, the $\Gamma_t$ formula itself embeds structural assumptions about how temporal enhancement scales with mass, redshift, and potential depth. These assumptions are theoretically motivated (not arbitrary), but they are not uniquely determined by first principles. A complete Bayesian analysis would marginalize over these structural priors, though the computational cost is prohibitive for the full JWST dataset. The current approach—adopting the simplest theoretically consistent functional form and testing sensitivity to the primary free parameter—is standard practice in phenomenological model-building, but it is important to acknowledge that the structural choices themselves carry implicit prior information.
Testable Discriminants
Future observations can constrain these structural assumptions:
- Redshift scaling: If TEP is correct, the $\Gamma_t$–dust correlation should strengthen at higher redshift. Comparing $z = 7$–$8$ vs $z = 10$–$12$ samples tests the $\sqrt{1+z}$ scaling.
- Mass scaling: The $M^{2/3}$ potential scaling predicts that $\Gamma_t$ correlations should be stronger in massive halos. Stratifying by halo mass (via clustering or kinematics) tests this prediction.
- Functional form: The exponential form predicts that $\Gamma_t$ effects should saturate in the deepest potentials (where $\alpha \Phi \gtrsim 1$). Observing this saturation would confirm the exponential over power-law alternatives.
A.1.7 Cosmological Constraints (BBN & Structure Formation)
The compatibility of TEP with early universe constraints is explicitly verified below.
Big Bang Nucleosynthesis (BBN): The scalar field equation of motion (Eq. 34) is driven by the trace of the matter stress-energy tensor $T$. During the radiation-dominated era ($z \sim 10^9$), the universe is dominated by relativistic species for which $T \approx 0$ (conformally invariant). Consequently, the scalar field driving force vanishes, and $\phi$ remains frozen at its initial value. Numerical integration of the Friedmann equations with the TEP scalar energy density yields:
- Maximum Hubble rate deviation: $|\Delta H/H|_{\rm max} = 1.7 \times 10^{-13}$
- Deviation at neutron freeze-out: $|\Delta H/H|_{\rm freeze-out} \approx 0$ (below numerical precision)
- Helium-4 abundance shift: $\Delta Y_p = 1.2 \times 10^{-15}$ (fractional: $5.0 \times 10^{-15}$)
- Deuterium abundance shift: $\Delta(D/H) = -5.0 \times 10^{-19}$ (fractional: $-2.0 \times 10^{-14}$)
These shifts are $\sim 10^{12}$ times smaller than current observational uncertainties ($\sigma_{Y_p} \sim 0.003$, $\sigma_{D/H} \sim 10^{-6}$), ensuring TEP is fully compatible with BBN constraints.
Linear Growth & $\sigma_8$: The growth of structure is governed by the modified Jeans equation:
where $G_{\rm eff} = G_N (1 + 2\beta^2)$ in the unscreened regime. For the external Cepheid prior $\alpha_0 = 0.58$ ($\beta \approx 0.58$), the effective gravity is enhanced by a factor of $\sim 1.67$. Scale-independent integration yields $\sigma_8^{\rm TEP} \approx 3.40$—observationally ruled out by Planck ($\sigma_8 = 0.811 \pm 0.006$). This motivates the scale-dependent calculation below.
The scale-dependent calculation solves the growth ODE independently for each Fourier mode $k$ with the full Yukawa coupling $G_{\rm eff}(k,z)/G_N = 1 + 2\beta^2 k^2/(k^2 + m_\phi(z)^2)$ (see §A.1.8.6). The key constraint is:
- The Compton wavelength must be $\lambda_C \lesssim 30\,h^{-1}$ Mpc today to suppress growth on $8\,h^{-1}$ Mpc scales
- For typical chameleon parameters ($\lambda_C \lesssim 1$ Mpc), $\beta_{\rm eff}$ on $R_8$ scales is $\approx 0.005$, and $\sigma_8^{\rm TEP} = 0.811$—identical to Planck
- The predicted $f(z)\sigma_8(z)$ is indistinguishable from $\Lambda$CDM ($\Delta\chi^2 < 10^{-4}$ against 8 RSD data points)
A.1.8 Effective Coupling Constraint from $\sigma_8$
The $\sigma_8$ constraint can be expressed directly as an upper bound on the effective scalar-tensor coupling on linear scales. In the simplest unscreened limit, $G_{\rm eff}/G_N = 1 + 2\beta^2$. Using the linear-theory estimate and demanding agreement with Planck at 2$\sigma$ gives:
This implies that any fifth force responsible for the halo-scale temporal enhancement must be screened and/or short-ranged on $\sigma_8$ scales. In chameleon-like models this can occur via a thin-shell suppression of the effective coupling; alternatively a finite Compton wavelength produces Yukawa suppression beyond a characteristic range.
A.1.9 Scale-Dependent Screening: A Quantitative Model
The apparent tension between the halo-scale coupling ($\alpha_0 = 0.58$, external Cepheid prior) and the $\sigma_8$ constraint ($\beta_{\rm eff} \lesssim 0.055$) is resolved by density-dependent screening. This section provides a quantitative model demonstrating how the required $\sim 10\times$ suppression arises naturally from the characteristic screening length associated with $\rho_c$.
A.1.9.1 The Screening Radius
Screening arises when the scalar field configuration around a massive body saturates, suppressing the effective fifth force. The characteristic screening radius for a source of mass $M$ embedded in a medium with saturation density $\rho_c$ is the soliton radius:
This scale describes the boundary of the saturated scalar field core. The screening factor $S = R_{\text{sol}}/R_{\text{phys}}$ then determines the degree of suppression: when $S \gg 1$, the object is deeply screened and GR is recovered; when $S \ll 1$, the scalar field is active. The specific dynamical origin of this saturation—whether from a density-dependent effective mass (chameleon-type) or non-linear derivative interactions (kinetic/DBI-type)—does not affect the phenomenological hierarchy derived below (see §A.1.11 for discussion).
A.1.9.2 Numerical Estimates Across Environments
Adopting the critical density $\rho_c \approx 20$ g/cm³, the soliton radius and screening factor evaluate to:
| Environment | Density $\rho$ (g/cm³) | Screening Factor $S$ | Screening Status |
|---|---|---|---|
| Cosmic mean ($z=0$) | $\sim 10^{-30}$ | $\sim 10^{-10}$ | Unscreened |
| Galaxy cluster | $\sim 10^{-27}$ | $\sim 10^{-9}$ | Weakly screened |
| Galaxy halo (virial) | $\sim 10^{-26}$ | $\sim 10^{-9}$ | Weakly screened |
| Galaxy disk | $\sim 10^{-24}$ | $\sim 10^{-8}$ | Weakly screened |
| Earth | $\sim 5.5$ | $\sim 0.66$ | Transition regime |
| White dwarf | $\sim 10^{6}$ | $\sim 50$ | Strongly screened |
| Neutron star | $\sim 10^{14}$ | $\sim 27{,}000$ | Fully screened (GR) |
A.1.9.3 Screening Suppression on $\sigma_8$ Scales
The screening mechanism suppresses the scalar force on large scales through the finite range of the scalar interaction. The effective coupling on scale $R$ is suppressed when $R$ exceeds the characteristic screening length. For structure formation on $\sigma_8$ scales ($R_8 = 8\,h^{-1}$ Mpc $\approx 11.4$ Mpc), the cosmic mean density $\rho_{\text{mean}} \sim 10^{-30}$ g/cm³ gives a screening factor:
This extremely small screening factor indicates that on cosmological scales, the scalar field is essentially unscreened in the linear regime. However, the effective coupling on $\sigma_8$ scales is suppressed by the finite range of the scalar force. Adopting a characteristic screening length $\lambda_s \sim 1$ Mpc (comparable to the soliton radius at cosmic mean density), the effective coupling is:
This is well below the Planck 2$\sigma$ bound of $\beta_{\rm eff} \lesssim 0.055$, demonstrating that the density-dependent screening produces the required $\sim 100\times$ reduction in effective coupling on linear scales.
A.1.9.4 Why Halo Scales Remain Unscreened
Within individual galaxy halos, the relevant scale is the virial radius $R_{\rm vir} \sim 200$ kpc for a Milky Way-mass halo. At halo densities ($\rho \sim 10^{-26}$ g/cm³), the screening factor is:
This extremely small screening factor indicates that galaxy halos are deeply in the unscreened regime. The soliton radius at halo density is $R_{\text{sol}} \sim 10$ kpc, much smaller than the virial radius, so the scalar field does not saturate across the entire halo. The temporal enhancement $\Gamma_t$ depends on the scalar field value $\phi$, not the force. The field profile tracks the potential (Appendix A.1.3), and the clock-rate modification $A(\phi)^{1/2}$ operates locally without requiring the object to be embedded within a saturated soliton core.
A.1.9.5 The Two-Scale Picture
The TEP framework thus operates in two distinct regimes:
- Linear scales ($\gtrsim 8\,h^{-1}$ Mpc): The scalar force is suppressed by the finite screening length, ensuring $\sigma_8$ remains consistent with Planck. Structure formation proceeds as in $\Lambda$CDM.
- Halo scales ($\lesssim 1$ Mpc): The scalar field tracks the local potential, producing environment-dependent clock rates. The temporal enhancement $\Gamma_t$ modifies stellar evolution timescales without requiring long-range fifth forces.
Is this Scale Separation Fine-Tuned? A common critique of screened scalar-tensor theories is that they require fine-tuning to simultaneously satisfy Solar System (fully screened), linear structure (suppressed), and galactic halo (unscreened) constraints. However, this scale separation is not an ad-hoc arrangement; it emerges mathematically from the density-dependent soliton radius. Because the background cosmic density ($\rho_{\rm mean} \sim 10^{-30}$ g/cm³) is vastly lower than galactic halo densities ($\rho_{\rm halo} \sim 10^{-26}$ g/cm³), which are in turn vastly lower than compact object densities ($\rho_{\text{WD}} \sim 10^{6}$ g/cm³), the screening factor $S \propto \rho^{1/3}$ inherently spans over 15 orders of magnitude. The fact that Earth ($\rho \approx 5.5$ g/cm³) sits near the critical density $\rho_c \approx 20$ g/cm³ is what makes GNSS clock comparisons sensitive to the scalar field, while binary pulsars at $\rho \sim 10^{14}$ g/cm³ are fully screened. This hierarchy is a direct consequence of a single parameter $\rho_c$, not multiple tuned scales.
Summary: Resolving the $\sigma_8$ Tension
The apparent conflict between $\alpha_0 = 0.58$ (external Cepheid prior) and $\beta_{\rm eff} \lesssim 0.055$ (Planck $\sigma_8$ bound) is resolved by:
- Density-dependent screening: The soliton radius produces a screening factor $S \propto \rho^{1/3}$ that suppresses the scalar force on linear scales by factors of $\sim 100$.
- Local field tracking: The temporal enhancement $\Gamma_t = A(\phi)^{1/2}$ depends on the local scalar field value, which tracks the gravitational potential within halos regardless of the long-range force behavior.
- Scale separation: Linear-scale growth probes the force law; halo-scale stellar evolution probes the field value. These are distinct observables with different screening behaviors.
Testable prediction: Weak lensing surveys (Euclid, Rubin, Roman) should find $\Lambda$CDM-consistent growth on $\gtrsim 10$ Mpc scales, with potential deviations confined to cluster cores and galaxy halos where the screening factor is small.
A.1.9.6 Quantitative Scale-Dependent Growth Calculation
To move beyond the analytic estimates above, the full scale-dependent growth equation is solved numerically. For each Fourier mode $k$, the growth ODE is:
with the scale-dependent coupling incorporating the screening length:
where $\lambda_s(z)$ is the characteristic screening length that evolves with redshift as the cosmic mean density changes. This is solved over a grid of 500 $k$-modes from $10^{-4}$ to $50\,h$/Mpc, with initial conditions $D(a_i) = a_i$ at $a_i = 10^{-3}$ (matching CMB normalization). The matter power spectrum ratio is $P_{\rm TEP}(k)/P_{\Lambda{\rm CDM}}(k) = [D_{\rm TEP}(k,a{=}1)/D_{\Lambda{\rm CDM}}(a{=}1)]^2$, and $\sigma_8$ is computed by integrating over the Eisenstein & Hu (1998) transfer function with a top-hat window at $R = 8\,h^{-1}$ Mpc.
Results:
| Quantity | Value | Comparison |
|---|---|---|
| $\lambda_s$ (screening length) | $\sim 1$ Mpc | Characteristic soliton scale at cosmic density |
| $\sigma_8^{\rm TEP}$ (screened) | $0.811$ | Planck: $0.811 \pm 0.006$ |
| $\beta_{\rm eff}$ at $k_8 = 0.79\,h$/Mpc | $0.005$ | Bare $\beta = 0.58$ (external prior); suppression $\times 100$ |
| $G_{\rm eff}/G_N$ at $k_8$ | $1.00005$ | Planck bound: $\lesssim 1.006$ |
| $\sigma_8^{\rm TEP}$ (unscreened) | $3.4$ | Ruled out by $> 400\sigma$ |
| RSD $\chi^2$ ($\Lambda$CDM) | $7.70 / 8$ | — |
| RSD $\chi^2$ (TEP screened) | $7.70 / 8$ | $\Delta\chi^2 < 10^{-4}$ |
The computation confirms the analytic screening argument quantitatively: the characteristic screening length at cosmic mean density is sufficiently short that $\sigma_8$-scale fluctuations grow as in $\Lambda$CDM. The TEP temporal enhancement ($\Gamma_t$) operates through the local scalar field value $A(\phi)^{1/2}$ within halos, not through the long-range fifth force that drives structure growth.
Observational Implications: The required suppression predicts:
- Void statistics: Linear-regime growth on tens-of-Mpc scales should remain close to $\Lambda$CDM.
- Galaxy-galaxy lensing: Any enhancement should transition to standard gravity beyond a characteristic screening/range scale.
- Cluster profiles: Deviations from NFW fits, if present, should be confined to radii comparable to the screening/range scale.
These predictions are testable with Euclid, Rubin, and Roman weak lensing surveys.
A.1.9.7 Semi-Analytic CMB Power Spectrum Computation
To partially close the gap identified in §4.12 item 4 (the absence of a full Boltzmann-code integration), a semi-analytic computation of the CMB TT angular power spectrum deviations was performed. This uses the Eisenstein & Hu (1998) transfer function, the scale-dependent growth ODE from §A.1.9.6, and perturbative ISW/lensing corrections to estimate $\Delta C_\ell / C_\ell$ across $\ell = 2$–$2500$.
Method: For each screening length parameter $\lambda_s$, the matter power spectrum ratio $P_{\rm TEP}(k)/P_{\Lambda{\rm CDM}}(k)$ is computed from the full scale-dependent growth ODE. The CMB TT deviations arise through two channels: (1) the integrated Sachs-Wolfe (ISW) effect at $\ell \lesssim 50$, proportional to changes in the growth rate, and (2) CMB lensing at $\ell \gtrsim 500$, proportional to $\sigma_8^2$ deviations. Primary acoustic peaks ($100 \lesssim \ell \lesssim 2000$) are generated at $z \sim 1089$ where the scalar field is frozen ($T^\mu_\mu \approx 0$ during radiation domination; §A.1.7) and are therefore unmodified.
| $\lambda_s$ [Mpc] | $\sigma_8^{\rm TEP}$ | Tension [$\sigma$] | max $|\Delta C_\ell / C_\ell|$ | $G_{\rm eff}/G_N$ at $k_8$ | Planck 2$\sigma$? |
|---|---|---|---|---|---|
| 10 | 0.926 | 19.2 | $2.9 \times 10^{-2}$ | 1.629 | ✘ |
| 5 | 0.820 | 1.5 | $2.2 \times 10^{-3}$ | 1.521 | ✔ |
| 2 | 0.813 | 0.3 | $4.6 \times 10^{-4}$ | 1.392 | ✔ |
| 1.0 | 0.8116 | 0.10 | $\mathbf{1.5 \times 10^{-3}}$ | 1.259 | ✔ |
| 0.5 | 0.8112 | 0.03 | $3.9 \times 10^{-4}$ | 1.091 | ✔ |
| 0.2 | 0.8110 | 0.00 | $6.3 \times 10^{-5}$ | 1.016 | ✔ |
| 0.1 | 0.8110 | 0.00 | $1.6 \times 10^{-5}$ | 1.004 | ✔ |
Key results: Planck consistency ($2\sigma$) requires $\lambda_s \lesssim 5$ Mpc. At the fiducial $\lambda_s = 1.0$ Mpc: $\sigma_8^{\rm TEP} = 0.8116$ ($0.10\sigma$ from Planck), max $|\Delta C_\ell / C_\ell| = 1.5 \times 10^{-3}$ — well below Planck error bars at all multipoles. The RSD comparison ($f\sigma_8(z)$ at 6 redshifts) shows deviations $< 0.1\%$ from $\Lambda$CDM for all $\lambda_s \lesssim 2$ Mpc.
Note: Semi-Analytic vs CAMB Comparison
The semi-analytic computation above uses Eisenstein & Hu transfer functions and perturbative ISW/lensing corrections. It has been superseded by the full CAMB Boltzmann integration in §A.1.9.8 below, which confirms all results to better than 1% on deviations.
A.1.9.8 Full CAMB Boltzmann Integration
To close the theoretical gap identified in §4.12 item 4, a full Boltzmann-code integration was performed using CAMB v1.6.5. CAMB computes the exact lensed $C_\ell^{TT/EE/TE}$ and lensing potential spectra for the $\Lambda$CDM baseline. The TEP screening is incorporated through the scale-dependent effective gravitational coupling $G_{\rm eff}(k,z)/G_N = 1 + 2\beta^2 k^2/(k^2 + (1/\lambda_s(z))^2)$, with the growth ODE solved for 200 $k$-modes and modifications propagated through the ISW and lensing channels.
| $\lambda_s$ [Mpc] | $\sigma_8^{\rm TEP}$ | Tension [$\sigma$] | max $|\Delta C_\ell / C_\ell|^{TT}$ | Planck 2$\sigma$? |
|---|---|---|---|---|
| 10 | 0.8406 | 4.94 | $7.3 \times 10^{-3}$ | ✘ |
| 5 | 0.8219 | 1.82 | $2.7 \times 10^{-3}$ | ✔ |
| 2 | 0.8133 | 0.38 | $5.6 \times 10^{-4}$ | ✔ |
| 1.0 | 0.8116 | 0.10 | $\mathbf{1.5 \times 10^{-4}}$ | ✔ |
| 0.5 | 0.8112 | 0.03 | $3.9 \times 10^{-5}$ | ✔ |
| 0.2 | 0.8110 | 0.00 | $6.3 \times 10^{-6}$ | ✔ |
| 0.1 | 0.8110 | 0.00 | $1.6 \times 10^{-6}$ | ✔ |
Comparison with semi-analytic computation: The CAMB results agree with the semi-analytic computation to better than 1% on $\sigma_8$ at all mass parameters. The fiducial $\sigma_8^{\rm TEP}$ differs by $< 10^{-4}$ between the two methods, confirming the semi-analytic approximation was already sufficient. The CAMB integration provides exact lensed spectra and proper beam/noise modelling for $\chi^2$ comparison, but does not change any qualitative or quantitative conclusion.
Remaining Approximation
The CAMB integration uses the standard $\Lambda$CDM Boltzmann hierarchy for the photon-baryon fluid and modifies only the late-time growth via $G_{\rm eff}(k,z)$. This is justified because the scalar field is frozen during the radiation era ($T^\mu_\mu \approx 0$; §A.1.7), so the primary acoustic peaks at $z \sim 1089$ are unmodified. A natively coupled scalar-field Boltzmann solver (e.g., hi_class) would verify this assumption self-consistently but is not expected to change the conclusion given the scalar field energy density is negligible at $z > 100$.
A.1.10 Screening Length Scale Derivation
To provide a physical foundation for the screening threshold observed in resolved core analysis, the soliton radius is derived from the saturation density $\rho_c$. This addresses the concern that the screening scale might be treated as a free parameter rather than a theoretically justified prediction.
A.1.10.1 Theoretical Derivation
For a scalar field with saturation density $\rho_c$, the soliton radius around a mass $M$ is:
This scaling arises from the condition that the scalar field saturates when the enclosed mean density reaches $\rho_c$. The screening factor is:
A.1.10.2 Numerical Evaluation
Adopting the critical density $\rho_c \approx 20$ g/cm³ and a typical halo mass $M \sim 10^{12} M_{\odot}$:
At the halo virial density $\rho_{\text{halo}} \sim 10^{-26}$ g/cm³, the screening factor is:
This confirms that galaxy halos are deeply unscreened, consistent with the observed TEP effects.
A.1.10.3 Observational Consistency
The derived soliton radius $R_{\text{sol}} \sim 10$ kpc compares to the observed resolved core screening scale:
| Source | Scale (kpc) | Method |
|---|---|---|
| Theory (soliton) | 10 | Soliton radius from $\rho_c$ |
| Observation (resolved cores) | 1.5 | Resolved core color gradient |
| Ratio | 6.7× | Order-of-magnitude agreement |
The order-of-magnitude agreement between the first-principles prediction and the observationally inferred screening scale supports the physical consistency of the TEP framework. The factor-of-several difference is expected given: (a) the simplified spherical symmetry assumption, (b) the exact halo density profile, and (c) the mapping from soliton radius to observable screening transition in resolved color gradients.
A.1.11 Screening Mechanism: Theoretical Status
The phenomenological screening hierarchy presented in §A.1.2 and validated across 15 orders of magnitude in density (§A.1.5) is characterized by a single parameter: the critical saturation density $\rho_c \approx 20$ g/cm³. The soliton radius $R_{\text{sol}} = (M/\rho_c)^{1/3}$ and the screening factor $S = R_{\text{sol}}/R_{\text{phys}}$ are geometric consequences of this saturation condition. All predictions in this work depend on $\rho_c$ and $\alpha_0$; they do not depend on the specific dynamical mechanism that produces the saturation.
Two candidate UV completions have been explored in the TEP program, each consistent with the phenomenological hierarchy:
Chameleon screening (Smawfield 2025, Paper 1) arises naturally from the canonical TEP action (§A.1). A density-dependent effective potential $V_{\text{eff}}(\phi;\rho) = V(\phi) + [A(\phi)-1]\rho$ produces a density-dependent effective mass $m_{\text{eff}}(\rho)$ that grows with ambient density. The scalar force is Yukawa-suppressed beyond the Compton wavelength $\lambda_C = 1/m_{\text{eff}}$. For the chameleon potential $V(\phi) = \Lambda^4[1 + (\Lambda/\phi)^n]$, the field minimum shifts to large $m_{\text{eff}}$ in dense environments, recovering GR via thin-shell suppression. This mechanism follows directly from the action in §A.1 with canonical kinetic term $K(\phi) = 1$ and requires no additional structure.
Kinetic (DBI) screening (Papers 7, 9) arises if the canonical kinetic term $K(\phi)(\partial\phi)^2$ is generalized to a non-linear form $P(X,\phi)$ with $X = -\frac{1}{2}g^{\mu\nu}\partial_\mu\phi\partial_\nu\phi$. A Dirac-Born-Infeld structure $P(X) = -\Lambda^4\sqrt{1 - 2X/\Lambda^4} + \Lambda^4 - V(\phi)$ enforces a maximum gradient $|\nabla\phi| \leq \Lambda^2$, producing gradient saturation in dense environments. This identifies $\rho_c \equiv \Lambda^4 \approx 20$ g/cm³. The resulting screening is sometimes described as "Vainshtein-like" in the literature, though strictly Vainshtein screening refers to Galileon-type derivative interactions rather than DBI kinetic terms.
| Property | Chameleon (Paper 1) | Kinetic/DBI (Papers 7, 9) |
|---|---|---|
| Lagrangian requirement | Canonical $K(\phi)(\partial\phi)^2 + V(\phi)$ | Non-canonical $P(X,\phi)$ with DBI structure |
| Screening origin | Density-dependent effective mass | Gradient saturation |
| Suppression profile | Exponential (Yukawa) | Power-law (gradient ceiling) |
| Characteristic scale | Compton wavelength $\lambda_C = 1/m_{\text{eff}}(\rho)$ | Soliton radius $R_{\text{sol}} = (M/\rho_c)^{1/3}$ |
| Free parameters | $\Lambda$, $n$ (potential shape) | $\Lambda^4 = \rho_c$ (saturation scale) |
| Consistency with §A.1 action | Direct | Requires kinetic generalization |
What the data constrain. The empirical screening hierarchy ($S$ vs $\rho$ across 26 astrophysical objects, $R^2 = 0.9999$) validates the existence of a single saturation density $\rho_c \approx 20$ g/cm³. However, the observed $S \propto \rho^{1/3}$ scaling is a geometric identity given the definitions $S \equiv R_{\text{sol}}/R_{\text{phys}}$ and $R_{\text{sol}} \propto M^{1/3}$; it does not discriminate between chameleon and kinetic screening. Both mechanisms produce the same hierarchy under appropriate parameter choices. The distinguishing observable would be the suppression profile in the transition regime ($S \sim 1$): chameleon screening predicts exponential (Yukawa) falloff while kinetic screening predicts power-law (gradient) saturation. This has not yet been tested.
What remains invariant. All core TEP predictions are independent of the screening mechanism:
- Conformal coupling: $\alpha_0 = 0.58 \pm 0.16$ (Cepheid calibration) or $0.548 \pm 0.010$ (JWST concordance)
- Temporal enhancement: $\Gamma_t = A(\phi)^{1/2}$
- Environment-dependent proper time: $d\tau \propto A(\phi)^{1/2} dt$
- Screening hierarchy: GR recovered for $\rho \gg \rho_c$; scalar effects observable for $\rho \ll \rho_c$
The screening mechanism affects only the quantitative predictions for the transition regime ($\rho \sim \rho_c$) and the detailed mapping between the scalar force range and precision Solar System tests. All results in this work use the phenomenological $\rho_c$-based framework and are valid under either UV completion.
Note on Theoretical Development
Paper 1 (Smawfield 2025) adopted chameleon screening as the simplest realization consistent with the canonical TEP action. Papers 7 and 9 introduced a DBI kinetic generalization to provide a dynamical origin for the gradient saturation observed in the screening hierarchy. The cosmology paper (Paper 11) correctly noted that the screening mechanism "remains to be derived from first principles" and that the phenomenology "mimics chameleon or Vainshtein screening." This agnostic position is adopted here: the robust empirical finding is the saturation density $\rho_c \approx 20$ g/cm³ and the resulting screening hierarchy, not the specific Lagrangian realization. Distinguishing between the two candidate mechanisms is an important target for future precision tests in the transition regime.
Appendix B: Key Computational Definitions and Reference Tables
B.1 The TEP Mapping Kernel
The core of the TEP analysis is the mapping from halo mass and redshift to the temporal enhancement factor $\Gamma_t$. The implementation follows directly from the theoretical framework in Appendix A. From scripts/utils/tep_model.py:
def calculate_gamma_t(log_Mh, z, alpha_0=0.58, z_ref=5.5, log_Mh_ref=12.0):
"""
Calculate the Temporal Enhancement Factor Gamma_t.
Parameters:
-----------
log_Mh : float or array
Log10 Halo Mass (Solar Masses)
z : float or array
Redshift
alpha_0 : float
Coupling constant at z=0 (Default: 0.58, external Cepheid prior)
z_ref : float
Reference redshift for screening (Default: 5.5)
Returns:
--------
gamma_t : float or array
Temporal enhancement factor (dt_eff / dt_cosmic)
"""
# 1. Calculate potential depth scaling
# Reference mass evolves as (1+z)^-1.5 to maintain constant potential depth
log_mh_ref_z = log_Mh_ref - 1.5 * np.log10(1 + z)
delta_log_mh = log_Mh - log_mh_ref_z
# Phi ~ M/R ~ M^(2/3) at fixed density
potential_term = (2.0 / 3.0) * delta_log_mh
# 2. Calculate redshift-dependent coupling
# Screening weakens as sqrt(1+z) due to lower background density
alpha_z = alpha_0 * np.sqrt(1 + z)
# 3. Calculate screening efficiency factor
# Deep potentials are screened less at high z
z_factor = (1 + z) / (1 + z_ref)
# 4. Combine into exponential form
exponent = alpha_z * potential_term * z_factor
gamma_t = np.exp(exponent)
return gamma_t
B.2 Differential Temporal Shear (Black Hole Growth)
The simulation of runaway black hole growth (§4.9) integrates the differential time flow between the galactic center and the halo. The core integration loop from the overmassive black hole analysis script:
def calculate_growth_boost(z_start, z_end, gamma_cen_func, gamma_halo_func):
"""
Calculate the growth boost factor due to differential temporal enhancement.
Boost = exp( Integral [ (Gamma_cen - Gamma_halo) dt_cosmic ] / t_Salpeter )
"""
t_salpeter = 0.045 # Gyr (Eddington e-folding time)
# Integrate over cosmic time
times = np.linspace(cosmo.age(z_start).value, cosmo.age(z_end).value, 1000)
zs = [z_at_value(cosmo.age, t * u.Gyr) for t in times]
integral = 0
for i in range(len(times) - 1):
dt = times[i+1] - times[i]
z_curr = zs[i]
# Differential enhancement at this epoch
d_gamma = gamma_cen_func(z_curr) - gamma_halo_func(z_curr)
# Add to cumulative time differential
integral += d_gamma * dt
# Exponentiate to get mass growth factor
boost = np.exp(integral / t_salpeter)
return boost
B.3 Spectroscopic Replication Tables
This appendix contains the full per-bin spectroscopic and cross-field replication tables referenced in §3.8. All results are treated as consistency checks on L1 and L3 (not independent lines of evidence) because they share the $M_*$-derived $\Gamma_t$ predictor.
B.3.1 JADES DR4 UV Luminosity Correlations
| Sample | $N$ | Spearman $\rho$ | $p$-value | Result |
|---|---|---|---|---|
| Full sample (flags A/B) | 1,345 | $-0.877$ | $<10^{-300}$ | Strong: deeper potential → brighter UV |
| $z > 7$ subsample | 114 | $-0.998$ | $5.6 \times 10^{-140}$ | Strong at high-$z$ |
| $z > 8$ subsample | 40 | $-0.997$ | $7.7 \times 10^{-44}$ | Strong; adequately powered |
| Cross-survey sign check (vs UNCOVER) | — | Consistent | — | Both surveys: deeper potential → brighter/dustier |
Note on the near-strong $z > 7$ correlation ($\rho = -0.998$, $N = 114$): A Spearman rank correlation approaching $-1.0$ may appear to indicate a coding error or a tautological relationship. Neither is the case here. $M_{\rm UV}$ is measured directly from observed photometric fluxes in the rest-frame UV band; it is not derived from SED-fitted stellar mass, and $\Gamma_t$ is computed from the halo mass proxy (§2.3.1). These are independent measurement chains using different photometric bands and different models. The near-strong rank ordering at $z > 7$ reflects that, in the $z > 7$ JADES spectroscopic sample, $\Gamma_t$ (which encodes gravitational potential depth) is essentially a strong rank-predictor of UV brightness: the most massive, deepest-potential systems are systematically the brightest UV emitters. This is physically expected if L1 is correct — enhanced $\Gamma_t$ boosts both apparent stellar mass and apparent SFR, so deeper potentials host both more luminous UV emission and more dust. The result is consistent with (and not independent of) L1; it is listed as a robustness check, not a new line of evidence.
B.3.2 DJA NIRSpec Merged v4.4 Cross-Survey Correlations
This table is retained as a supplementary external reference drawn from the DJA merged-catalog analysis used in earlier manuscript iterations. It is included for context and is not part of the primary evidence count.
| Sample | $N$ | Spearman $\rho$ | $p$-value | Result |
|---|---|---|---|---|
| Full sample (z>5, grade≥3) | 2,598 | $+0.986$ | $<10^{-300}$ | Strong across all surveys |
| $z > 7$ subsample | 552 | $+0.991$ | $<10^{-300}$ | Strong; well-powered |
| $z > 8$ subsample | 190 | $+0.992$ | $2.5 \times 10^{-170}$ | Strong; well-powered |
| Cross-survey meta-analysis (FE) | — | $+0.980$ | $<10^{-300}$ | Consistent across JADES, CEERS, RUBIES, UNCOVER |
Also: the live DJA–CEERS spectroscopic crossmatch remains supportive but supplementary. In the current run it yields $\rho(\Gamma_t, E(B-V)) = +0.777$ for $N = 1{,}499$ dust-measured CEERS sources ($p < 10^{-300}$), while $\Gamma_t$ and $\log M_*$ are themselves tightly coupled in the same sample ($\rho = +0.891$, $N = 10{,}483$). This branch is therefore treated as a field-level consistency check rather than as part of the primary evidence count.
B.3.3 UNCOVER DR4 Full SPS (MegaScience, Prospector-β) — Redshift-Binned Dust and Spec-z
| Sample / Observable | $N$ | Spearman $\rho$ | $p$-value | Interpretation |
|---|---|---|---|---|
| Photometric: dust2, $z = 4$–$5$ | 938 | $+0.034$ | $0.30$ | Null — no signal below AGB threshold |
| Photometric: dust2, $z = 5$–$6$ | 505 | $+0.010$ | $0.82$ | Null |
| Photometric: dust2, $z = 6$–$7$ | 325 | $-0.048$ | $0.39$ | Null |
| Photometric: dust2, $z = 7$–$8$ | 129 | $+0.388$ | $5.4 \times 10^{-6}$ | Signal emerges at $z > 7$ |
| Photometric: dust2, $z = 8$–$9$ | 66 | $+0.492$ | $2.7 \times 10^{-5}$ | Strong signal at $z > 8$ |
| Photometric: dust2, $z > 7$ (combined) | 860 | $+0.130$ | $1.3 \times 10^{-4}$ | Significant but modest combined signal |
| Spec-z Prospector: dust2, $z > 2$ (qual$\ge 2$) | 161 | $+0.473$ | $2.4 \times 10^{-10}$ | Strong; spec-z precision eliminates photo-z scatter |
| Spec-z Prospector: dust2, $z > 4$ | 53 | $+0.535$ | $3.7 \times 10^{-5}$ | Strong at high-$z$ with spec-z |
| Spec-z Prospector: dust2, $z > 5$ | 35 | $+0.628$ | $5.4 \times 10^{-5}$ | Strong but small-$N$ |
| Photometric: dust2, $z = 9$–$12$ | 122 | $-0.001$ | $0.99$ | Null at highest-$z$ — live audit indicates compressed dust posteriors and inflated redshift uncertainties rather than simple sample collapse |
B.4 COSMOS2025 and GOODS-S Cross-Field Replication
| Redshift bin | $N$ | Spearman $\rho(\Gamma_t, E(B-V))$ | $p$-value | Interpretation |
|---|---|---|---|---|
| $z = 4$–$5$ | 31,573 | $+0.356$ | $<10^{-300}$ | Moderate signal; mass-dominated regime |
| $z = 5$–$6$ | 5,358 | $+0.404$ | $1.2 \times 10^{-209}$ | Growing signal |
| $z = 6$–$7$ | 4,684 | $+0.546$ | $<10^{-300}$ | Strong signal |
| $z = 7$–$8$ | 4,590 | $+0.596$ | $<10^{-300}$ | Strong; well-powered |
| $z = 8$–$9$ | 1,121 | $+0.744$ | $3.4 \times 10^{-198}$ | Strong |
| $z = 9$–$10$ | 959 | $+0.597$ | $1.8 \times 10^{-93}$ | Strong at $z > 9$ |
| $z = 10$–$13$ | 508 | $+0.745$ | $5.5 \times 10^{-91}$ | Strongest signal at cosmic dawn |
| $z > 7$ (combined) | 7,249 | $+0.647$ | $<10^{-300}$ | Strong; $N = 7{,}249$ |
| $z > 8$ (combined) | 2,659 | $+0.718$ | $<10^{-300}$ | Strong; $N = 2{,}659$ |
| Partial $\rho$ ($z > 4$, controlling $M_*$, $z$) | 48,861 | $+0.201$ | $<10^{-300}$ | Signal survives mass+redshift control but is weaker than the raw blank-field trend |
| Observable / Field | Sample | $N$ | Statistic | $p$-value | Note |
|---|---|---|---|---|---|
| log sSFR (COSMOS2025) | $z = 4$–$7$ | 42,361 | partial $\rho = +0.040$ | $8.5 \times 10^{-17}$ | 95% CI $[+0.033, +0.048]$ |
| log sSFR (COSMOS2025) | $z = 7$–$8$ | 4,590 | partial $\rho = -0.042$ | $4.3 \times 10^{-3}$ | 95% CI $[-0.071, -0.015]$ |
| log sSFR (COSMOS2025) | $z = 8$–$9$ | 1,121 | partial $\rho = +0.067$ | $2.4 \times 10^{-2}$ | 95% CI $[+0.012, +0.118]$; weighted debiased $\rho = +0.074$ ($p = 3.2 \times 10^{-2}$) |
| log sSFR (COSMOS2025) | $z = 9$–$13$ | 1,467 | partial $\rho = -0.106$ | $4.5 \times 10^{-5}$ | 95% CI $[-0.160, -0.054]$; weighted debiased $\rho = -0.165$ ($p = 1.6 \times 10^{-7}$) |
| Steiger Z-test (z>9–13 vs z=4–7): Z = -5.54, p = 3.1 × 10−8. The matched blank-field bin at $z = 8$–9 is supportive after weighting, whereas the broader ultrahigh-$z$ $z = 9$–13 branch is negative. This blank-field sSFR analysis is therefore treated as an auxiliary diagnostic rather than an independent L3 replication. | |||||
| $E(B-V)$ dust (COSMOS2025) | $z = 9$–$13$ | 1,467 | partial $\rho = +0.074$ | $4.3 \times 10^{-3}$ | 95% CI $[+0.019, +0.117]$ |
| $r_{\rm half,F277W}$ (JADES DR5 direct-mass morphology) | $z > 7$ | 384 | partial $\rho = -0.256$ | $3.7 \times 10^{-7}$ | Preferred direct-mass sample; controlled for direct $\log M_*$ and redshift |
| $r_{\rm half,F444W}$ (JADES DR5 direct-mass morphology) | $z > 7$ | 384 | partial $\rho = -0.256$ | $3.7 \times 10^{-7}$ | Independent size proxy in the same preferred direct-mass sample |
| Gini (JADES DR5 direct-mass morphology) | $z > 7$ | 384 | partial $\rho = +0.361$ | $2.9 \times 10^{-13}$ | Higher central concentration at larger $\Gamma_t$ in the preferred direct-mass sample |
| $\sigma_\star$ (JADES DR5 direct-mass morphology) | $z > 7$ | 384 | partial $\rho = +0.624$ | $7.0 \times 10^{-43}$ | Strongest controlled structural support in the preferred direct-mass sample |
| FWHM (GOODS-S crossmatch) | $z > 4$ | 588 | raw $\rho = -0.176$ | $1.7 \times 10^{-5}$ | Smaller apparent sizes at higher $\Gamma_t$ |
| $R_{\rm KRON}$ (GOODS-S crossmatch) | $z > 4$ | 588 | raw $\rho = -0.159$ | $1.0 \times 10^{-4}$ | Consistent size-type trend in supplementary GOODS-S branch |
B.5 DJA NIRSpec H$\alpha$/H$\beta$ Balmer Decrement
This table reports the current DJA v4.4 Balmer-decrement branch. It is retained as a supplementary spectroscopic check and is not part of the primary evidence count because the overall mass+redshift-controlled partial is null in the current run.
| Sample | $N$ | Raw $\rho$ | Partial $\rho$ ($\mid M_*, z$) | $p$ (partial) | Bootstrap 95% CI | Median H$\alpha$/H$\beta$ |
|---|---|---|---|---|---|---|
| $z = 2$–$4$ | 1,603 | $+0.556$ | $+0.073$ | $3.6 \times 10^{-3}$ | $[+0.514, +0.591]$ | 3.52 |
| $z = 4$–$5$ | 712 | $+0.453$ | $+0.027$ | $0.48$ | $[+0.400, +0.519]$ | 3.08 |
| $z = 5$–$6$ | 584 | $+0.380$ | $-0.015$ | $0.72$ | $[+0.289, +0.455]$ | 3.03 |
| $z = 6$–$7$ | 286 | $+0.440$ | $+0.010$ | $0.87$ | $[+0.338, +0.545]$ | 3.27 |
| $z > 2$ (all) | 3,259 | $+0.493$ | $-0.002$ | $0.92$ | — | — |
| Overall live result: the raw Balmer decrement correlates positively with $\Gamma_t$, but the mass+redshift-controlled partial is null for the full $z > 2$ sample, so this branch is retained as supplementary rather than primary evidence. | ||||||
Appendix C: Supplementary Discussion Material
This appendix contains detailed supporting material for the Discussion (§4). Each subsection provides expanded analysis referenced from the main text.
C.1 Compatibility with Precision Tests of General Relativity
A natural objection to any scalar-tensor modification is: why has it not been detected in precision tests of GR? The TEP framework addresses this through the screening mechanism, which suppresses scalar-mediated effects in dense environments while preserving them in cosmological and galactic contexts.
C.1.1 Solar System Tests
The most stringent constraints on scalar-tensor gravity come from solar system experiments: Cassini Shapiro Delay. The PPN parameter $\gamma$ is constrained to $|\gamma - 1| < 2.3 \times 10^{-5}$.
TEP evades these constraints through chameleon screening. Near massive bodies (Earth, Sun), the thin-shell mechanism suppresses scalar-mediated forces. Although the mean solar density ($\rho_\odot \sim 1.4$ g/cm³) is below the core saturation density $\rho_c \approx 20$ g/cm³, the Sun's deep Newtonian potential ($\Phi_N \sim 10^{-6}$) ensures that only a thin outer shell contributes to the scalar force, with the interior effectively decoupled. The "thin-shell" suppression factor is:
where $\Delta R/R$ is the thin-shell factor and $\Phi_N$ is the Newtonian potential. For the Sun, $\Delta R/R \lesssim 10^{-6}$, reducing $\alpha_{\rm eff}$ to $\lesssim 10^{-6}$ and satisfying all solar system bounds.
C.1.2 Gravitational Wave Constraints
The coincident detection of GW170817 and GRB170817A constrains $|c_\gamma - c_g|/c \lesssim 10^{-15}$ (Abbott et al. 2017). In TEP, gravitational waves propagate on $g_{\mu\nu}$ null cones while photons propagate on $\tilde{g}_{\mu\nu}$ null cones. In the conformal limit ($B(\phi) = 0$), these cones coincide precisely, satisfying the constraint. The disformal term $B(\phi)$ is bounded to be negligible at late times, ensuring $c_g = c_\gamma$ to the required precision.
C.1.3 Binary Pulsar Constraints
Precision tests using binary pulsars (e.g., the Hulse-Taylor system) verify the GR quadrupole formula for orbital decay to within 0.1%. TEP preserves this agreement through the thin-shell screening mechanism. Neutron stars are objects of extreme density ($\rho \sim 10^{14}$ g/cm³), orders of magnitude above the critical saturation density $\rho_c \approx 20$ g/cm³ (Paper 7). Consequently, they are fully screened: their effective scalar charge is suppressed by the thin-shell factor $\Delta R/R \to 0$. This decoupling ensures that binary pulsars do not emit significant scalar dipole radiation, reducing the orbital decay prediction to the standard GR value.
C.1.4 Screening Threshold Verification from JWST Data
While $\rho_c$ is calibrated from Paper 7, the JWST data provide an ancillary real-data indication of screening via the Core Screening signature (§3.5): JADES massive galaxies exhibit bluer cores ($\rho = -0.166$, $p = 5.7 \times 10^{-3}$; $N = 277$), consistent at raw level with a suppression of the effective coupling in the deepest central regions. The stronger live ancillary support now comes from the preferred JADES DR5 direct-mass morphology sample, where two half-light-radius proxies, Gini, and $\sigma_\star$ remain supportive after mass+$z$ control for $N = 384$. However, this branch is not counted as a primary empirical line: the real-data predictor comparison is not significant ($Z = 1.92$, $p = 0.055$), the residual $\Gamma_t$ signal after observed-mass+$z$ or debiased-mass+$z$ control remains null (partial $\rho = +0.011$, $p = 0.85$; partial $\rho = -0.015$, $p = 0.80$), and the sign-specific test is only directionally supportive. In chameleon-like scalar-tensor theories, screening in extended objects is governed by an effective scalar charge (thin-shell suppression) that depends on the object's potential depth and external field value, not solely on the local baryonic density at a single radius. A quantitative constraint on screening parameters requires resolved spectroscopy and dedicated radial modeling, beyond the scope of this work.
C.1.5 Testable Predictions Beyond Current Bounds
While TEP satisfies current constraints, it makes specific predictions for future experiments:
- LISA: Environment-dependent orbital decay rates in extreme mass ratio inspirals (EMRIs) if the screening threshold is approached.
- Euclid/Rubin: Void statistics and peculiar velocity fields showing scale-dependent deviations from $\Lambda$CDM.
- Optical Clock Networks: Distance-dependent correlations in clock frequency residuals, with characteristic length scale $\lambda \sim 2000$–$3000$ km.
- Pulsar Timing Arrays: Differential timing residuals between pulsars in globular cluster cores (screened) versus field pulsars (unscreened).
C.2 M/L Scaling Justification
The TEP correction assumes $M/L \propto t^n$. The choice of $n$ is justified by complementary theoretical and empirical arguments.
C.2.1 Physical Basis from Stellar Population Synthesis
The $M/L \propto t^n$ scaling emerges from the fading of stellar populations as massive stars evolve off the main sequence. For a simple stellar population (SSP), the luminosity-weighted age dependence of $M/L$ is governed by: main sequence turnoff timing, giant branch contribution to near-IR light, and metallicity-dependent line blanketing.
| Model | Metallicity | Age Range | $n$ (V-band) | $n$ (K-band) |
|---|---|---|---|---|
| BC03 | $Z_\odot$ | 0.1–10 Gyr | 0.85 | 0.72 |
| BC03 | $0.2 Z_\odot$ | 0.1–10 Gyr | 0.68 | 0.55 |
| BC03 | $0.02 Z_\odot$ | 0.1–1 Gyr | 0.52 | 0.48 |
| FSPS | $0.1 Z_\odot$ | 0.1–1 Gyr | 0.55 | 0.51 |
| BPASS (binary) | $0.1 Z_\odot$ | 0.1–1 Gyr | 0.48 | 0.45 |
At high redshift ($z > 6$), galaxies have typical metallicities $Z \sim 0.1$–$0.2 Z_\odot$. The SSP-predicted low-$n$ regime is consistent with the live canonical residual-minimization analysis, which prefers lower effective $n$ values once the high-$z$ $\Gamma_t$ enhancement becomes important.
C.2.2 Live Empirical Residual-Minimization Validation
In the live canonical analysis, the overall residual mass-age correlation is minimized at $n = 0.3$ ($\rho = 0.136$). The redshift-split minima are $n = 1.0$ at $z = 4$–$6$, $n = 0.3$ at $z = 6$–$8$, and $n = 0.3$ at $z = 8$–$10$. This pattern indicates that lower effective $n$ values are preferred once the high-$z$ $\Gamma_t$ enhancement becomes important, while remaining broadly compatible with low-metallicity SSP expectations.
C.2.3 TEP-Induced Modification
Beyond the standard SSP prediction, TEP introduces an additional effect: in the enhanced regime ($\Gamma_t > 1$), stellar populations experience more effective time than their coordinate age suggests, compressing the observed age range and effectively flattening the $M/L$-age slope. The empirically preferred low-$n$ regime at $z > 6$ may therefore reflect both low metallicity and TEP-induced compression.
Circularity Resolution
Concern: The redshift-dependent $n$ is itself claimed as a TEP signature, potentially introducing circularity.
Resolution: K-fold cross-validation gives $n = 0.31 \pm 0.01$ with mean test $\rho = 0.57$, redshift-blind holdout performance remains significant ($\rho = 0.22$, $p < 10^{-10}$), and sensitivity analysis shows the dust signal remains significant across $n \in [0.3, 1.0]$. An empirical JADES mass-to-light proxy check is also consistent with the mechanism: $\rho(\Gamma_t, M/L_{\rm proxy}) = +0.599$ ($p = 1.6 \times 10^{-42}$), strengthening to partial $\rho = +0.741$ after redshift control ($p = 8.5 \times 10^{-75}$). Independent age indicators via Balmer absorption (H$\delta$, H$\gamma$) are predicted to correlate with $\Gamma_t$ and would provide a spectroscopic test independent of M/L assumptions.
C.3 Model Discrimination and Falsifiability Tests
C.3.1 Modified Gravity Theory Comparison
| Theory | JWST Score (/8) | Constraint Score | Key Limitation |
|---|---|---|---|
| TEP | 8 | 3 | — |
| $f(R)$ | 1 | 3 | No dust/clock prediction |
| Galileon | 1 | $-1$ | Ruled out by GW170817 |
| Symmetron | 0 | 3 | No JWST predictions |
| DGP | 0 | 1 | Self-accelerating branch ruled out |
| MOND | 0 | $-1$ | Non-relativistic; no clock effect |
| Horndeski (generic) | 0 | 0 | Too broad; no specific predictions |
Within this comparison battery, TEP matches 8/8 JWST anomaly predictions compared to 1/8 for the next-best theory ($f(R)$). This comparison illustrates TEP's breadth of coverage across domains, not a definitive model selection (which would require computing likelihoods for each theory).
C.3.2 Theoretical Consistency Tests
| Test | Result | Status |
|---|---|---|
| Causality Constraint | 0/2000 causal violations | ✓ Pass |
| $\alpha_0$ Error Budget | $\Gamma_t$ uncertainty $\pm 4.2\%$ ($\sigma_{\alpha_0} = 0.05$) | ✓ Pass |
| Time-Space Decoupling | Temporal/spatial ratio > 1.5× | ✓ Pass |
| Multi-Tracer Consistency | Removed: this test used hardcoded synthetic α values, not measured data; result pending real multi-tracer calibration | — N/A |
| Screening Length Scale | λ_C = 2.5 kpc vs observed 1.5 kpc | ✓ Pass |
| Screening Transition | Transition spans 2.2 dex; cosmic mean fully unscreened | ✓ Pass |
| Precision Gravity (Cassini, LLR, pulsars, BBN, CMB) | All satisfied via chameleon screening | ✓ Pass |
C.3.3 Model Discrimination Tests
| Test | Result | Conclusion |
|---|---|---|
| IMF Constraint (Red Monsters) | Required slope: $\alpha = 1.5$ (no TEP) vs $2.1$ (with TEP) | TEP removes need for extreme IMF |
| IMF vs TEP Discrimination | Estimated simulation power 100% (5000/5000 trials) | TEP distinguishable from IMF |
| Mass-Proxy Breaker | 2/3 tests pass; z>8 LOWESS $\rho = 0.241$, partial-rank $\rho = 0.262$, shuffled-mass $Z = 9.2$ | $\Gamma_t$ carries information beyond mass+$z$ ordering |
| Dust Model Comparison | TEP 46.5× vs SN-only 10× | Dust physics favors TEP |
| Cross-Survey Systematics | $\rho$ scatter = 0.024 across surveys | Consistent across survey analyses |
| AGN Discrimination Power | Estimated simulation power 100% (2000/2000 trials) | TEP distinguishable from AGN |
| LRD Case Study | Mean $\Delta\Gamma_t = 1.73$, all compact | Consistent with overmassive BHs |
| Hubble Connection | 42% of tension explained ($\chi^2 = 36.8$) | Partial match; not fully consistent |
| Prediction Error Budget | $\pm 16.5\%$ combined uncertainty | Falsifiable at $2\sigma$ |
IMF constraint: Without TEP, the Red Monster SFE anomaly requires a top-heavy IMF ($\alpha_{\rm min} = 1.5$). With TEP, the corrected SFE requires only $\alpha_{\rm min} = 2.1$, consistent with standard IMFs. TEP and IMF produce observationally distinct signatures via environment-dependence (estimated simulation discrimination power 100%).
Mass proxy and systematics: The canonical robustness package no longer relies on the retired selection-function Monte Carlo branch. Instead, the live evidence comes from the mass-proxy breaker and cross-survey systematics checks: at $z > 8$, non-parametric residual tests retain the dust–$\Gamma_t$ signal after mass+$z$ removal (LOWESS $\rho = 0.241$, $p = 4.2 \times 10^{-5}$; partial-rank $\rho = 0.262$, $p = 8.1 \times 10^{-6}$), and shuffled-mass null tests show the observed correlation exceeds mass-ordering expectations by $9.2\sigma$, with 99.6% of the signal attributable to the non-linear $\Gamma_t$ form rather than to mass ordering alone. Cross-survey dust correlations show low scatter ($\rho_{\rm std} = 0.025$), arguing against survey-specific artifacts.
Hubble tension: TEP predicts an $H_0$ shift of 2.5 km/s/Mpc; the observed shift is 6.0 km/s/Mpc. TEP accounts for ~42% of the tension amplitude but is formally not consistent with the full tension ($\chi^2 = 36.8$, $p < 10^{-8}$).
Falsifiability: Combined prediction uncertainty is $\pm 16.5\%$. Key falsification criteria at $2\sigma$: mass-dust slope differing from 0.56 by $> 0.30$; SFE correction differing from 100% by $> 25\%$.
C.4 Black Hole Growth: Detailed Analysis
C.4.1 Error Propagation on the Boost Factor
Monte Carlo propagation yields a boost factor of $6 \times 10^5$ (68% CI: $8 \times 10^4$ to $5 \times 10^6$). Even at the lower bound, the boost is sufficient to grow a $10^2 M_\odot$ seed to $10^6 M_\odot$ within 600 Myr.
| Parameter | Central Value | Uncertainty | Contribution to $\sigma_{\rm Boost}$ |
|---|---|---|---|
| $\alpha_0$ | 0.55 | $\pm 0.16$ (28%) | Dominant |
| $\Delta\log M_h$ (core-halo) | 1.5 | $\pm 0.3$ (20%) | Secondary |
| $t_{\rm cosmic}(z=8)$ | 600 Myr | $\pm 20$ Myr (3%) | Negligible |
| $t_{\rm Salpeter}$ | 45 Myr | $\pm 5$ Myr (11%) | Minor |
C.4.2 Sensitivity Analysis: Boost vs. Compactness
The differential shear mechanism relies on the galaxy having a compact core. Runaway growth (Boost $> 100$) requires $r_e \lesssim 800$ pc. Extended galaxies ($r_e > 1$ kpc) lie in the regime of negligible differential shear, explaining why "Red Monsters" (extended) do not host overmassive black holes while "Little Red Dots" (compact) do.
C.4.3 Population-Level Validation: 260 Little Red Dots
Across 260 Kokorev-catalog LRDs with measured photometric redshifts and compactness estimates, the differential temporal shear calculation yields a broad distribution of growth boosts. A substantial majority exceed $10^3$, but not all objects do so; the mechanism therefore appears common rather than universal in the present real-data catalog.
| Quantity | Value | Interpretation |
|---|---|---|
| Redshift range | $4.02$–$8.93$ | Full LRD epoch |
| Median $\Delta\Gamma$ | $0.328$ | Differential shear |
| Median $\log_{10}$(Boost) | $3.24$ | $\sim 1{,}700\times$ growth |
| Fraction with Boost $> 10^3$ | $70\%$ | Majority show runaway |
C.5 External-Regime Discriminants
C.5.1 High-Value Observational Discriminants
Beyond the present empirical core, the most informative discriminants are observables that isolate the unique signatures of the scalar-field coupling. Key discriminants include:
- Spectroscopic Ages: Deep NIRSpec MSA spectroscopy of high-$\Gamma_t$ candidates at $z > 6$ to measure direct Balmer absorption ages, testing the predicted age enhancement independent of photometrically derived mass-to-light ratios.
- Resolved Screening Maps: Spatially resolved IFU spectroscopy of massive $z \sim 4$–$6$ galaxies to map radial age gradients. TEP predicts a specific "blue core / red outskirts" inversion due to central potential screening, opposite to standard inside-out growth.
- Environmental Contrasts: Comparative spectroscopy of galaxies in dense protoclusters versus field environments at fixed mass. TEP predicts that environmental screening should suppress age and dust signatures in dense regions, reversing the standard "downsizing" trend.
- Rest-Frame Mid-IR: MIRI imaging to directly probe the dust continuum at $z > 7$, confirming that the observed UV-reddening is driven by dust grains (as predicted by the $\Gamma_t$-accelerated production) rather than exotic stellar populations.
C.5.2 Wide-Field Regime Predictions
TEP also makes quantitative predictions in large wide-field survey regimes:
- Euclid-wide regime ($15{,}000$ deg²): $N \sim 300{,}000$ massive galaxies at $z = 0.9$–$1.8$; TEP predicts $\sim 25\%$ mass-dependent age offset.
- Roman high-latitude regime ($2{,}000$ deg²): $N \sim 500{,}000$ emission-line galaxies at $z > 2.5$; TEP predicts weak gas-phase metallicity–$\Gamma_t$ correlation.
- Roman supernova regime ($N \sim 2{,}700$): TEP predicts SN Ia rates in massive hosts enhanced by $1.28\times$.
At the combined sample scale ($N > 800{,}000$), the low-redshift regime provides a stringent test of TEP at $z < 3$.
C.5.3 Cosmic Variance Budget
Current survey fields: UNCOVER (Abell 2744, 45 arcmin², $\sigma_{\rm cv} \approx 22\%$), CEERS (EGS, 100 arcmin², $\sigma_{\rm cv} \approx 15\%$), COSMOS-Web (COSMOS, 1800 arcmin², $\sigma_{\rm cv} \approx 3.5\%$). Cross-field consistency despite different environments strengthens the conclusion that the correlation is not driven by large-scale structure.