pathint_torch.PathIntegralSampler
- class pathint_torch.PathIntegralSampler(get_log_mu, x_size, T, dt, method='euler', adaptive=False, rtol=1e-05, atol=0.0001, dt_min=1e-05, device=device(type='cpu'), dtype=torch.float32)
Bases: object
Class defining loss and sampling functions for the path integral sampler.
This approach consists of a training objective and a sampling procedure for optimal control of the stochastic process
\[\mathrm{d}\mathbf{x}_t = \mathbf{u}_t \mathrm{d}t + \mathrm{d}\mathbf{w}_t ,\]
where \(\mathbf{w}_t\) is a Wiener process. When a network representing the control policy \(\mathbf{u}_t(t, \mathbf{x})\) is trained to minimize the loss function, the process above yields samples at time \(T\) distributed according to the prespecified distribution \(\mu(\cdot)\). (Distributions and quantities at time \(t=T\) are often referred to as “terminal”.) The procedure also yields importance sampling weights \(w\).
The undocumented attributes are keyword arguments passed to torchsde.sdeint.
Notes
As explained in the paper, the control policy network is trained by constructing an SDE augmented by the trajectory’s cost. This implementation uses a similar trick to simultaneously sample and compute importance sampling weights using any SDE solver.
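The augmentation trick can be illustrated with a short, plain-Python sketch. This is not the library's code: it is a one-dimensional Euler-Maruyama loop written only for illustration, and the names `simulate` and `log_normal` are hypothetical. It assumes the process starts at \(x = 0\), so that the uncontrolled terminal density is \(\mathcal{N}(0, T)\), and it assumes the log weight is \(-\left(\int \frac{1}{2} u^2 \mathrm{d}t + \int u \, \mathrm{d}w + \Psi(x_T)\right)\) with \(\Psi = \log \mu_0 - \log \mu\), following the usual path integral sampler formulation.

```python
import math
import random


def log_normal(x, var):
    """Log density of a zero-mean 1D Gaussian with variance `var`."""
    return -0.5 * (x * x / var + math.log(2.0 * math.pi * var))


def simulate(u, log_mu, T=1.0, n_steps=100, rng=None):
    """Euler-Maruyama on the augmented state (x, cost), starting from x = 0.

    Returns the terminal position x_T and its log importance weight.
    """
    rng = rng or random.Random()
    dt = T / n_steps
    x, cost = 0.0, 0.0
    for k in range(n_steps):
        t = k * dt
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        u_t = u(t, x)
        # Extra component of the augmented SDE: accumulated cost of the control.
        cost += 0.5 * u_t * u_t * dt + u_t * dw
        # Original SDE: dx = u dt + dw.
        x += u_t * dt + dw
    # Terminal cost (assumed form): log-ratio of the uncontrolled terminal
    # density N(0, T) to the unnormalized target density.
    psi = log_normal(x, T) - log_mu(x)
    return x, -(cost + psi)


# Zero control with the target equal to N(0, T): the log weight vanishes.
x_T, log_w = simulate(
    lambda t, x: 0.0, lambda x: log_normal(x, 1.0), T=1.0, rng=random.Random(0)
)
```

Because the cost is carried as an extra state component, one pass of the solver yields both the sample and its weight, which is the point of the trick described in the note.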
- get_log_mu: Callable[[Tensor], Tensor]
\(\log \mu(x)\), the log of the (unnormalized) terminal density to be sampled from.
- sample(model, batch_size, n_intermediate=0, entropy=None)
Generates a batch of batch_size samples.
- Parameters
model (Module) – control policy network taking t and x as arguments.
batch_size (int) – batch size.
n_intermediate (int) – number of intermediate timesteps at which to save the trajectory.
entropy (Optional[int]) – seed for Brownian motion.
- Returns
xs: sample paths at \(t = 0\), \(t = T\) and n_intermediate times in between.
log_w: corresponding log importance sampling weights.
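The log weights are typically used for self-normalized importance sampling. The sketch below shows one way to do that in plain Python, assuming the terminal samples and their log weights have already been converted to lists of floats; the helper names are hypothetical and not part of pathint_torch.

```python
import math


def normalized_weights(log_w):
    """Turn log importance weights into weights that sum to one."""
    m = max(log_w)  # subtract the max before exponentiating, for stability
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def weighted_mean(values, log_w):
    """Self-normalized importance sampling estimate of the mean of `values`."""
    return sum(wi * v for wi, v in zip(normalized_weights(log_w), values))


def effective_sample_size(log_w):
    """Kish effective sample size; equals len(log_w) for uniform weights."""
    return 1.0 / sum(wi * wi for wi in normalized_weights(log_w))
```

An effective sample size much smaller than the batch size suggests the control policy is still far from optimal, since a perfectly trained policy would give equal weights.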
- sample_loss(model, batch_size, entropy=None)
Gets loss for a single trajectory.
- Parameters
model (Module) – control policy network taking t and x as arguments.
batch_size (int) – batch size.
entropy (Optional[int]) – seed for Brownian motion.
- Returns
cost: approximation to \(\int_0^T \mathrm{d}t \, \frac{1}{2} \|\mathbf{u}_t(t, \mathbf{x}_t ; \theta)\|^2 + \Psi(\mathbf{x}_T)\), where the second term is the terminal cost specified by the training procedure.
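For a fixed-step Euler scheme (the signature's defaults are method='euler' and adaptive=False), the cost integral reduces to a Riemann sum along the simulated path. The following is a sketch of that discretization, assuming the running cost is half the squared norm of the control. The symbols \(N\) (number of steps), \(\Delta t\) (step size, corresponding to dt) and \(\boldsymbol{\xi}_k\) (independent standard normal draws) are notation introduced here; the exact scheme used depends on the options passed to torchsde.sdeint.

```latex
\[
\mathbf{x}_{t_{k+1}} = \mathbf{x}_{t_k}
  + \mathbf{u}_{t_k}(t_k, \mathbf{x}_{t_k}; \theta) \, \Delta t
  + \sqrt{\Delta t} \, \boldsymbol{\xi}_k ,
\qquad t_k = k \, \Delta t , \quad N \, \Delta t = T ,
\]
\[
\int_0^T \mathrm{d}t \, \frac{1}{2} \|\mathbf{u}_t(t, \mathbf{x}_t; \theta)\|^2 + \Psi(\mathbf{x}_T)
\;\approx\;
\sum_{k=0}^{N-1} \frac{1}{2} \|\mathbf{u}_{t_k}(t_k, \mathbf{x}_{t_k}; \theta)\|^2 \, \Delta t
  + \Psi(\mathbf{x}_{t_N}) .
\]
```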