LOADLENS — Adaptive Grid Load Forecasting Platform Champlin Enterprises LLC · champlinenterprises.com

Methodology · Pre-Registered Evaluation

We measure ourselves the way a reviewer would.

LoadLens uses a pre-registered evaluation protocol — the test sets, baselines, metrics, and stratifications below were declared before model selection and tuning. The numbers on this page come directly from storage/app/eval/latest.json, written by php artisan loadlens:eval. There are no hand-edited numbers in the public chrome.

EVAL_PROTOCOL.md · Last run: 2026-05-24 03:30 GMT+0000 · Protocol: v1 · Commit: f9d6200

TEMPORAL PRIMARY

PJM Region

2,733 holdout hours · calibration window 2026-01-18 -> 2026-02-01 · weather coverage 0/82 rows

Model                        n      MAPE %   RMSE MW   MAE MW   Cov [q10,q90]
B_PERSIST_168 (baseline)     2,733    7.84    10,364    7,466   -
B_SEASONAL_NAIVE (baseline)  2,733    9.19    11,481    8,632   -
B_HOUR_DOW_MEAN (baseline)   2,733    9.17    11,477    8,615   -
B_LINEAR_TEMP (baseline)         0       -         -        -   -
ENSEMBLE_LIVE                2,733   18.55    22,347   17,651   0.094
ENSEMBLE_ADVANCED            2,733   14.63    18,197   13,918   0.101

Coverage target: 0.80 (the [q10, q90] interval should contain 80% of realized loads). Calibrated via split-conformal prediction on the warmup-tail residuals.
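A minimal sketch of how a split-conformal band and its empirical coverage could be computed. It assumes a symmetric band around a point forecast, built from absolute calibration residuals. The function names and the symmetric construction are illustrative assumptions; the LoadLens engine may calibrate q10 and q90 separately.

```python
import math


def conformal_halfwidth(calib_residuals, alpha=0.20):
    """Half-width of a (1 - alpha) band from held-out calibration residuals.

    Assumption: a symmetric band around the point forecast, scored by
    absolute residual. This is one common split-conformal variant.
    """
    scores = sorted(abs(r) for r in calib_residuals)
    n = len(scores)
    # Finite-sample rank ceil((n + 1) * (1 - alpha)), capped at n.
    rank = min(n, math.ceil((n + 1) * (1 - alpha)))
    return scores[rank - 1]


def empirical_coverage(actuals, forecasts, halfwidth):
    """Fraction of realized loads that fall inside forecast +/- halfwidth."""
    hits = sum(1 for y, f in zip(actuals, forecasts) if abs(y - f) <= halfwidth)
    return hits / len(actuals)
```

If calibration and holdout residuals come from the same distribution, `empirical_coverage` on the holdout should land near 1 - alpha. A value far below that, as in the table above, points to a shift between the two windows.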

MAPE by regime

Regime     B_PERSIST_168     B_HOUR_DOW_MEAN   ENSEMBLE_LIVE     ENSEMBLE_ADVANCED
BASELINE    8.01% (n=1896)    8.64% (n=1896)   18.14% (n=1896)   14.24% (n=1896)
HEAT_DOME  -                 -                 -                 -
COLD_SNAP  -                 -                 -                 -
WEEKEND     6.91% (n=813)    10.00% (n=813)    19.58% (n=813)    15.59% (n=813)
HOLIDAY    25.28% (n=24)     22.23% (n=24)     16.13% (n=24)     12.99% (n=24)
RAMP        7.62% (n=684)     9.03% (n=684)     7.25% (n=684)     6.60% (n=684)
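The stratified table can be read as a per-tag average of absolute percentage error. The sketch below assumes each holdout hour carries a set of regime tags, so an hour can count toward more than one row. That is an inference from the counts: BASELINE, WEEKEND, and HOLIDAY sum to 2,733, so RAMP appears to overlap them. The row layout and function name are hypothetical.

```python
from collections import defaultdict


def mape_by_regime(rows):
    """Return {regime: (mape_percent, n)}.

    rows: iterable of (actual, forecast, tags), where tags is a set of
    regime labels. Assumption: tags may overlap, so one hour can
    contribute to several strata.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for actual, forecast, tags in rows:
        ape = abs(actual - forecast) / abs(actual)
        for tag in tags:
            sums[tag] += ape
            counts[tag] += 1
    return {tag: (100.0 * sums[tag] / counts[tag], counts[tag]) for tag in counts}
```

A regime with no tagged hours is simply absent from the result, which matches the empty HEAT_DOME and COLD_SNAP rows while weather data is missing.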

What these numbers mean

An honest read

On PJM regional load, simple naive baselines such as persistence-168 (the load from the same hour one week earlier) achieve roughly 8-9% MAPE (7.84% for persistence-168 on this holdout). PJM aggregates millions of customers across thirteen states; at that scale weekly periodicity is overwhelmingly stable, and any model has to clear a high bar to add value.
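For reference, a persistence-168 baseline and the three point metrics in the table can be sketched in a few lines. This assumes a gap-free hourly series held in a plain list; the function names are illustrative and are not taken from the LoadLens codebase.

```python
import math


def persist_168(load, lag=168):
    """Pair each hour's actual load with the load observed `lag` hours earlier.

    Returns a list of (actual, forecast) tuples, starting at index `lag`.
    """
    return [(load[t], load[t - lag]) for t in range(lag, len(load))]


def score(pairs):
    """Compute n, MAPE (percent), RMSE, and MAE over (actual, forecast) pairs."""
    n = len(pairs)
    mape = 100.0 * sum(abs(y - f) / abs(y) for y, f in pairs) / n
    rmse = math.sqrt(sum((y - f) ** 2 for y, f in pairs) / n)
    mae = sum(abs(y - f) for y, f in pairs) / n
    return {"n": n, "mape_pct": mape, "rmse": rmse, "mae": mae}
```

On a perfectly weekly-periodic series this baseline scores zero error, which is why it is hard to beat on a heavily aggregated signal.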

Our adaptive ensembles currently sit at 14.63% (ENSEMBLE_ADVANCED) and 18.55% (ENSEMBLE_LIVE) MAPE on the same data. That gap is real and we're not papering over it. The ensembles were tuned on smaller, noisier load profiles: exactly the rural cooperative / distribution-level signals where weekly persistence breaks down. The pre-registered eval here exposes that the demonstration data is too easy for baselines and too misaligned with the production target.

The probabilistic story is currently mixed. Earlier short-window evals showed the advanced engine's split-conformal intervals well-calibrated near the 0.80 nominal target. The longer Q1–Q2 2026 holdout above shows severe under-coverage (0.094 for ENSEMBLE_LIVE and 0.101 for ENSEMBLE_ADVANCED against the 0.80 target) with a systematic asymmetry (pinball loss heavily skewed at q10). That is clear evidence of distribution shift between the 14-day calibration window and the 90+ day holdout. Static split-conformal can't absorb that, and we're not going to pretend it does. v2 will replace it with rolling / online conformal that re-fits as the operating regime drifts; that's the directly addressable next step.
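One possible reading of "rolling / online conformal" is to recompute the band at every step from a sliding window of recent residuals. The sketch below illustrates that idea only. The window length, warmup, symmetric band, and function name are assumptions, and it should not be read as the planned v2 design.

```python
import math
from collections import deque


def rolling_conformal(actuals, forecasts, window=336, alpha=0.20, warmup=24):
    """Yield-style list of (lower, upper, covered) using only past residuals.

    The band for hour t is calibrated on the absolute residuals of the
    previous `window` hours, so it widens or narrows as errors drift.
    """
    scores = deque(maxlen=window)  # oldest residuals fall out automatically
    out = []
    for y, f in zip(actuals, forecasts):
        if len(scores) >= warmup:
            ranked = sorted(scores)
            n = len(ranked)
            rank = min(n, math.ceil((n + 1) * (1 - alpha)))
            hw = ranked[rank - 1]
            out.append((f - hw, f + hw, abs(y - f) <= hw))
        # Update only after the interval for this hour has been issued.
        scores.append(abs(y - f))
    return out
```

After an abrupt jump in error size the band misses for a few hours and then catches up once the window fills with post-shift residuals. A static 14-day calibration never makes that adjustment.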

The grant claim is not "we beat industry baselines on transmission-level data." It is "we ship a falsifiable eval, surface our own failure modes openly (including this one), publish a reproducible pipeline, and will demonstrate the adaptive advantage on real cooperative AMI data once a pilot is signed." Every cell on this page is from a single command; no number on it has been hand-edited.

Regime detection · receipts

CUSUM change-points in real history

Sweep of the trailing 209 days of PJM Region load history at threshold 4. Each row is a statistically significant change-point in the load signal. The "before / after" columns score 48-hour windows on either side of the detection so a reviewer can see what the regime change actually changed.

Detection          CUSUM    Δ load   Kind          MAPE before   MAPE after
2026-05-20 02:00   178.41   +32.9%   peak_demand   35.65%        32.75%
2026-02-12 07:00    95.73   -13.6%   reduced       35.98%        28.48%
2026-03-19 10:00    90.92   +17.3%   elevated      32.02%        27.57%
2026-02-16 07:00    87.30   -16.8%   reduced       28.48%        29.18%
2025-12-03 04:00    82.13   +14.3%   elevated      43.26%        37.71%

Generated 2026-05-30 04:15 GMT+0000 by php artisan loadlens:find-regime over 5,035 hours of history.
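For readers unfamiliar with CUSUM, here is a textbook two-sided version on standardized values. It is a sketch under stated assumptions and is not the loadlens:find-regime implementation. The reported statistics (for example 178.41) are on a different scale from the threshold of 4, so the production statistic is evidently defined differently. The slack `k`, the fixed reference window, and the function name are all hypothetical.

```python
import statistics


def cusum_changepoints(series, baseline_len, k=0.5, threshold=4.0):
    """Two-sided tabular CUSUM against a fixed reference window.

    Returns a list of (index, statistic, direction) detections.
    """
    ref = series[:baseline_len]
    mu = statistics.fmean(ref)
    sigma = statistics.stdev(ref)
    pos = neg = 0.0
    hits = []
    for t in range(baseline_len, len(series)):
        z = (series[t] - mu) / sigma
        pos = max(0.0, pos + z - k)  # accumulates upward drift
        neg = max(0.0, neg - z - k)  # accumulates downward drift
        if pos > threshold or neg > threshold:
            hits.append((t, max(pos, neg), "up" if pos > neg else "down"))
            # Restart accumulation. A production detector would also
            # re-estimate mu and sigma for the new regime.
            pos = neg = 0.0
    return hits
```

Because the reference window stays fixed here, a sustained shift keeps firing after the first detection. Re-baselining after each hit is what turns repeated alarms into one change-point per regime.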

Known issues · roadmap

What is currently broken or missing

Listed here so a reviewer can audit them before we publish a Phase I claim.

  • [1] NOAA weather ingestion is not populating temperature_f — all rows currently NULL, which collapses the weather-aware regression baseline to zero predictions and disables HEAT_DOME / COLD_SNAP regime stratification. Fix is in flight; the eval reports show weather coverage explicitly so this stays visible.
  • [2] Demonstration data is transmission-scale (PJM Interconnection), not cooperative-scale. Ensembles are tuned for the noisier distribution-level signal where weekly persistence is weaker. Cooperative-scale evaluation requires a pilot AMI feed (in active outreach).
  • [3] Cross-region (geographic) holdout uses the same ISO until a non-PJM dataset is wired. The eval slot is in place; the data is not.
  • [4] Operational dollar-impact metric not yet computed. $/MWh saved under simulated dispatch is the headline value claim for the SBIR application; until it is computed, the calibrated probabilistic-forecast story stands in as the placeholder.
  • [5] Static split-conformal intervals don't survive the multi-month distribution shift between calibration window and holdout window — visible as under-coverage and asymmetric pinball loss in the table above. v2 will replace static calibration with a rolling / online conformal layer that re-fits as the operating regime drifts.
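The asymmetric pinball loss mentioned in item [5] can be checked with the standard quantile-loss definition. The sketch below uses made-up numbers, not LoadLens output. It shows one pattern that produces a q10-heavy skew: a lower bound that sits above the realized load. The function name is illustrative.

```python
def pinball_loss(actuals, quantile_forecasts, tau):
    """Mean pinball (quantile) loss for quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(actuals, quantile_forecasts):
        diff = y - q
        # Under-forecasts are weighted by tau, over-forecasts by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(actuals)


# Illustrative values only: both bounds sit above the realized load.
actual = [100.0, 100.0, 100.0]
q10 = [105.0, 110.0, 108.0]
q90 = [120.0, 125.0, 130.0]
loss_q10 = pinball_loss(actual, q10, tau=0.10)
loss_q90 = pinball_loss(actual, q90, tau=0.90)
```

In this toy case the q10 loss exceeds the q90 loss even though q90 is further from the actuals, because overshooting a low quantile is penalized at weight 0.9.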

Reproducibility

Run it yourself

ssh ce-prod "cd /var/www/vhosts/champlinenterprises.com/loadlens.champlinenterprises.com \
  && /opt/plesk/php/8.4/bin/php artisan loadlens:eval --pre-registered=v1"

The command refuses to run if the on-disk protocol version drifts from --pre-registered=. This page renders whatever storage/app/eval/latest.json contains.
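The refusal described above amounts to a version guard that runs before any scoring. A minimal sketch of that pattern, with hypothetical names (`ProtocolDriftError`, `require_protocol`) and no claim about how the artisan command reads the on-disk version:

```python
class ProtocolDriftError(RuntimeError):
    """Raised when the requested protocol version differs from the one on disk."""


def require_protocol(requested: str, on_disk: str) -> None:
    """Abort before any evaluation work if the two versions disagree."""
    if requested.strip() != on_disk.strip():
        raise ProtocolDriftError(
            f"requested protocol {requested!r} but on-disk protocol is "
            f"{on_disk!r}; refusing to run"
        )
```

Failing before the run, rather than warning after it, means a results file can never be written under a protocol other than the one named on the command line.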