
How to generate synthetic curves with noise and fit with Poisson weights

Synthetic data generation is essential for:

  • Validating that a fitting pipeline recovers known parameters.
  • Benchmarking different model variants.
  • Studying the effect of noise on parameter uncertainty.

This guide shows how to generate a multi-peak synthetic TL curve, corrupt it with Poisson (shot) noise, and fit it back using Poisson-weighted residuals.


Step 1 — Generate a clean multi-peak curve

import numpy as np
import tldecpy as tl

# Temperature grid
T = np.linspace(300.0, 620.0, 500)   # kelvin, 500 points
beta = 1.0                            # K/s

# Ground-truth parameters for three peaks
TRUTH = [
    {"model": "fo_rq", "Im": 800.0,  "E": 1.10, "Tm": 400.0},
    {"model": "fo_rq", "Im": 2500.0, "E": 1.35, "Tm": 460.0},
    {"model": "go_kg", "Im": 1200.0, "E": 1.55, "Tm": 520.0, "b": 1.6},
]

# Evaluate each model and sum them
I_clean = np.zeros_like(T)
for p in TRUTH:
    fn = tl.get_model(p["model"])
    args = {k: v for k, v in p.items() if k != "model"}
    I_clean += fn(T, **args)

print(f"Peak intensity : {I_clean.max():.1f}")
print(f"Baseline       : {I_clean.min():.3f}")

Step 2 — Add Poisson noise

Photon-counting detectors produce noise whose variance equals the expected count in each channel: \(\text{Var}(I_i) = I_i\), so \(\sigma_i = \sqrt{I_i}\).

rng = np.random.default_rng(0)

# Poisson noise: draw integer counts then convert back to float
I_poisson = rng.poisson(np.maximum(I_clean, 0)).astype(float)

# Signal-to-noise ratio at the main peak
snr = I_clean.max() / np.sqrt(I_clean.max())
print(f"SNR at apex: {snr:.1f}")

Gaussian approximation

For high count rates (\(I_i \gtrsim 20\) counts/channel) the Poisson distribution is well approximated by a normal distribution \(\mathcal{N}(I_i,\, I_i)\), with mean and variance both equal to \(I_i\). Below that threshold, use the exact Poisson draw shown above.
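The approximation is easy to verify numerically with plain NumPy (no tldecpy needed): at a constant rate of 500 counts, an exact Poisson draw and its Gaussian surrogate have matching sample mean and variance.

```python
import numpy as np

rng = np.random.default_rng(7)
lam = np.full(20000, 500.0)   # constant rate, well above the ~20-count threshold

exact  = rng.poisson(lam).astype(float)        # exact Poisson draw
approx = lam + rng.normal(0.0, np.sqrt(lam))   # N(I, I) approximation

# Both samples have mean ~500 and variance ~500
print(f"exact : mean={exact.mean():.1f}  var={exact.var():.1f}")
print(f"approx: mean={approx.mean():.1f}  var={approx.var():.1f}")
```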


Step 3 — Compare fitting with and without Poisson weights

# Build PeakSpec list from ground truth (slightly offset init)
peaks = [
    tl.PeakSpec(
        name=f"P{i+1}",
        model=p["model"],
        init={k: v * 0.95 for k, v in p.items() if k != "model"},
    )
    for i, p in enumerate(TRUTH)
]

# --- Fit A: ordinary least squares (no weights) ---
result_ols = tl.fit_multi(
    T, I_poisson,
    peaks=peaks, bg=None, beta=beta,
    robust=tl.RobustOptions(loss="linear", weights="none"),
)

# --- Fit B: Poisson-weighted soft-l1 ---
result_poi = tl.fit_multi(
    T, I_poisson,
    peaks=peaks, bg=None, beta=beta,
    robust=tl.RobustOptions(
        loss="soft_l1",
        f_scale=np.sqrt(I_poisson.max()) * 0.5,
        weights="poisson",
    ),
)

print(f"OLS  — FOM: {result_ols.metrics.FOM:.3f} %   R²: {result_ols.metrics.R2:.6f}")
print(f"Poisson — FOM: {result_poi.metrics.FOM:.3f} %   R²: {result_poi.metrics.R2:.6f}")

With Poisson noise the OLS fit is dominated by the high-intensity apex, where the absolute noise \(\sqrt{I_i}\) is largest, while the Poisson-weighted fit divides each residual by \(\sigma_i = \sqrt{I_i}\) and so balances contributions from shoulders and apex.
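The same effect can be reproduced with plain scipy on a toy Gaussian peak. This is a generic illustration, not tldecpy internals: curve_fit's sigma argument plays the role of weights="poisson" by scaling each residual by \(1/\sqrt{I_i}\).

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss(x, A, mu, sig):
    """Single Gaussian peak (stand-in for a TL glow peak)."""
    return A * np.exp(-0.5 * ((x - mu) / sig) ** 2)

rng = np.random.default_rng(42)
x = np.linspace(-5.0, 5.0, 200)
y = rng.poisson(gauss(x, 1000.0, 0.0, 1.0)).astype(float)

p0 = (800.0, 0.2, 1.2)                    # deliberately offset start
p_ols, _ = curve_fit(gauss, x, y, p0=p0)  # unweighted least squares

# Poisson weighting: sigma_i = sqrt(I_i), floored at 1 to avoid zero counts
sigma = np.sqrt(np.maximum(y, 1.0))
p_poi, _ = curve_fit(gauss, x, y, p0=p0, sigma=sigma, absolute_sigma=True)

print("OLS    :", np.round(p_ols, 3))
print("Poisson:", np.round(p_poi, 3))
```

Both fits recover the truth here because a single clean peak is easy; the weighting matters most when weak shoulders sit next to a strong apex, as in the multi-peak curve above.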


Step 4 — Quantify parameter recovery

print(f"\n{'Peak':>4}  {'Param':>6}  {'Truth':>8}  {'OLS':>8}  {'Poisson':>8}")
print("-" * 42)
for i, p in enumerate(TRUTH):
    for key in ("Im", "E", "Tm"):
        truth_v = p.get(key, float("nan"))
        ols_v   = result_ols.peaks[i].params.get(key, float("nan"))
        poi_v   = result_poi.peaks[i].params.get(key, float("nan"))
        print(f"  P{i+1}  {key:>6}  {truth_v:>8.4g}  {ols_v:>8.4g}  {poi_v:>8.4g}")
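To turn the table above into a single recovery metric, a small helper (not part of tldecpy) can report the signed percent error of each fitted parameter against its ground-truth value:

```python
import numpy as np

def percent_error(fit, truth):
    """Signed recovery error in percent: positive means overestimate."""
    fit, truth = np.asarray(fit, float), np.asarray(truth, float)
    return 100.0 * (fit - truth) / truth

# e.g. a fitted E of 1.12 eV against a true 1.10 eV,
# and a fitted Tm of 398.5 K against a true 400.0 K
err = percent_error([1.12, 398.5], [1.10, 400.0])
print(err)   # ≈ +1.82 % and −0.375 %
```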

Step 5 — Add Gaussian noise instead

For systems with dominant electronic readout noise (flat noise floor):

noise_sigma = 10.0   # counts — independent of intensity

rng2 = np.random.default_rng(1)
I_gaussian = I_clean + rng2.normal(0.0, noise_sigma, size=T.shape)
I_gaussian = np.clip(I_gaussian, 0.0, None)

result_gauss = tl.fit_multi(
    T, I_gaussian,
    peaks=peaks, bg=None, beta=beta,
    robust=tl.RobustOptions(
        loss="soft_l1",
        f_scale=noise_sigma * 2,
        weights="none",             # flat noise → no Poisson weighting
    ),
)
print(f"Gaussian noise fit FOM: {result_gauss.metrics.FOM:.3f} %")

Summary

Noise model                     Recommended loss        Recommended weights   f_scale guide
Photon counting (Poisson)       "soft_l1"               "poisson"             ~0.5 × \(\sqrt{I_\max}\)
Electronic readout (Gaussian)   "soft_l1" or "linear"   "none"                ~2 × \(\sigma_\text{noise}\)
Mixed                           "huber"                 "poisson"             ~ noise floor
Clean synthetic data            "linear"                "none"                (any)
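One way to keep these rules of thumb in code is a small lookup table. The names below (ROBUST_PRESETS, pick_f_scale) are hypothetical helpers, not part of tldecpy; each preset holds keyword arguments that could be passed on to tl.RobustOptions.

```python
# Hypothetical presets mirroring the summary table above (not a tldecpy API)
ROBUST_PRESETS = {
    "poisson":  {"loss": "soft_l1", "weights": "poisson"},
    "gaussian": {"loss": "soft_l1", "weights": "none"},
    "mixed":    {"loss": "huber",   "weights": "poisson"},
    "clean":    {"loss": "linear",  "weights": "none"},
}

def pick_f_scale(noise_model, I_max=0.0, sigma_noise=0.0):
    """Rule-of-thumb f_scale from the summary table."""
    if noise_model == "poisson":
        return 0.5 * I_max ** 0.5        # ~0.5 × sqrt(I_max)
    if noise_model == "gaussian":
        return 2.0 * sigma_noise         # ~2 × sigma_noise
    return sigma_noise or 1.0            # mixed/clean: use noise floor if known

print(pick_f_scale("poisson", I_max=2500.0))   # 25.0
```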

Alternative: ODE-based simulation with tl.simulate

The approach above evaluates the analytical model expressions directly via tl.get_model(). For physically grounded simulations that integrate the full trap-kinetics ODE system, use tl.simulate:

import tldecpy as tl
from tldecpy.simulate.fo import ode_fo, infer_n0_s

T0, T_end, beta = 300.0, 600.0, 1.0
E, Tm = 1.2, 450.0

n0, s = infer_n0_s(Im=1500.0, Tm=Tm, E=E, beta=beta)

ode_result = tl.simulate(
    ode_fo, params=(s, E),
    T0=T0, T_end=T_end, beta=beta,
    y0=[n0], state_keys=["n"],
    noise_config={"mode": "poisson", "seed": 0},
)

print(f"T shape : {ode_result.T.shape}")
print(f"I max   : {ode_result.I.max():.1f}")
print(f"States  : {list(ode_result.states.keys())}")

tl.simulate integrates the ODE using scipy.integrate.solve_ivp and returns a SimulationResult with T, I, states, and time. The ODE helper functions (ode_fo, infer_n0_s, etc.) live in tldecpy.simulate sub-modules and are not part of the public tl.* namespace.

Use tl.simulate when you need access to the trap-population trajectory n(T) or want to study the effect of different initial occupancies.