Remote Execution (Dask) — Experimental¶
A remote execution framework that processes large simulation outputs on HPC compute nodes and returns only small results (plot images or 2D slices) to the login node (your Jupyter session).
How It Works¶
Login node (Jupyter)                               Compute node (SLURM worker)
emout server start                              →  Scheduler + Worker start
                    ↕ InfiniBand high-speed comm
rdata = emout.Emout("dir").remote()
with remote_scope():
    with remote_figure():
        rdata.phisp[-1,:,100,:].plot()          →  HDF5 load + rendering on server
        plt.xlabel("custom")                    →  (recorded as commands)
                                                ←  only PNG bytes (~50 KB)
    with remote_figure(savefilepath="figure.png"):
        rdata.phisp[-1,:,100,:].plot()          →  render on server
                                                ←  save the image to a file
data = emout.Emout("dir")
data.phisp[-1,:,100,:].plot()                   →  HDF5 load → 2D slice → transfer
                                                ←  small array (few KB)
plt.xlabel("custom")                            ←  local matplotlib rendering
Setup¶
On Python 3.10+, pip install emout automatically includes Dask and the
TLS dependency used by emout server. No extra setup is needed.
1. Start the server (once, in a terminal)¶
emout server start --partition gr20001a --memory 60G --walltime 03:00:00
The InfiniBand IP is auto-detected. emout also generates per-user TLS
credentials automatically, stores them with user-only permissions, and
mirrors the active session to ~/.emout/server.json.
Session: default
Scheduler running at tls://10.10.64.2:8786
Detected IP: 10.10.64.2
Workers: 1
By default, only one active server session is allowed per user. To run an additional session intentionally, give it a name:
emout server start --allow-multiple --name batch2 --memory 120G
emout server status --all
emout server stop --name batch2
2. Use from scripts¶
For new code, start from Emout.remote() plus remote_scope().
That is the most natural workflow for worker-side reuse and cleanup:
import emout
from emout.distributed import remote_figure, remote_scope
rdata = emout.Emout("output_dir").remote()
with remote_scope():
    ymid = int(rdata.inp.ny // 2)
    with remote_figure():
        rdata.phisp[-1, :, ymid, :].plot()
If the active session is saved, existing code still works through the compatibility mode. The compat mode only follows the active/default session. For additional named sessions, connect explicitly. The compat workflow is summarized later under “Data-transfer mode”.
3. Stop the server¶
emout server stop
Additional named sessions can be stopped with emout server stop --name <session>
or all at once with emout server stop --all.
If a worker job is cancelled with scancel or ends when its walltime
expires, the next emout server start / auto-connect treats that session
as stale and clears the saved state automatically. Remote execution fails
fast instead of waiting forever: compatibility mode falls back to local
execution, while explicit remote usage asks you to restart the server.
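If you want a similar fail-fast check in your own scripts before launching expensive work, a short TCP probe is one option. The sketch below uses only the standard library; scheduler_reachable is a hypothetical helper, not part of emout, and it only checks that something is listening at the address (no TLS handshake is attempted).

```python
import socket
from urllib.parse import urlsplit


def scheduler_reachable(address: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the scheduler address succeeds.

    Hypothetical helper (not part of emout). It only verifies that a
    listener exists on host:port; it does not authenticate or speak TLS.
    """
    parts = urlsplit(address)  # e.g. "tls://10.10.64.2:8786"
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"expected scheme://host:port, got {address!r}")
    try:
        with socket.create_connection((parts.hostname, parts.port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False
```

A script could call scheduler_reachable("tls://10.10.64.2:8786") with the address printed by emout server start and skip the remote path when it returns False.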
Usage Modes¶
Recommended mode (Emout.remote() + remote_scope())¶
This keeps worker-side objects alive as RemoteRef proxies while letting
you write code close to normal emout / numpy style. Expressions such as
-ref, ref1 + ref2, np.abs(ref), and int(ref) stay remote until you
explicitly fetch them.
import matplotlib.pyplot as plt
import emout
from emout.distributed import remote_figure, remote_scope
rdata = emout.Emout("output_dir").remote()
with remote_scope():
    ymid = int(rdata.inp.ny // 2)
    with remote_figure():
        plt.figure(figsize=(18, 16))
        rdata.phisp[-1, 180:400, ymid, :].plot()
        (-rdata.exz[-1, 180:400, ymid, :]).plot()
        plt.title("remote expression example")
Objects created inside remote_scope() are automatically drop()-ed when
the context exits, so you can reuse intermediate remote results many times
within the block without having to manage worker-side cleanup yourself.
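To make the "stays a proxy until you fetch" idea concrete, here is a toy model in plain Python. LazyRef is a hypothetical class written for this illustration; it is not emout's RemoteRef, and it says nothing about when the worker actually evaluates an expression. It only shows how operators can keep returning proxy objects and how fetch() turns one into a plain value.

```python
class LazyRef:
    """Toy stand-in for a remote reference (hypothetical, not emout's RemoteRef).

    Each operator returns a new LazyRef that remembers how to compute its
    value; nothing is evaluated until fetch() is called.
    """

    def __init__(self, compute):
        self._compute = compute  # zero-argument callable producing the value

    @classmethod
    def of(cls, value):
        return cls(lambda: value)

    def __neg__(self):
        return LazyRef(lambda: -self._compute())

    def __add__(self, other):
        rhs = other._compute if isinstance(other, LazyRef) else (lambda: other)
        return LazyRef(lambda: self._compute() + rhs())

    def __abs__(self):
        return LazyRef(lambda: abs(self._compute()))

    def fetch(self):
        """Evaluate the deferred expression and return a plain value."""
        return self._compute()


a = LazyRef.of(3)
b = LazyRef.of(-5)
expr = abs(-a + b)            # still a LazyRef, nothing computed yet
print(type(expr).__name__)    # LazyRef
print(expr.fetch())           # 8
```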
open() / close() — explicit form for Jupyter¶
If you do not want to indent a whole cell under with, call open()
and close() directly. The scope survives across cells, so you can
keep rdata and every registered ref alive for as long as you need:
from emout.distributed import remote_scope
scope = remote_scope()
scope.open()
rdata = data.remote()
ref = rdata.phisp[-1, :, 100, :]
ref.plot()
# ...continue working in other cells with rdata / ref...
scope.close() # drops every registered ref in one go
close() is idempotent, so you rarely need a try/finally — calling
it twice is a no-op.
clear() — manual GC while the scope stays open¶
In loops that create many intermediate refs, clear() drops every
registered ref without leaving the scope:
scope = remote_scope()
scope.open()
rdata = data.remote()
for t in range(100):
    ref = rdata.phisp[t, :, 100, :]
    arr = ref.fetch()
    # ... work with arr ...
    scope.clear()  # release this iteration's refs, keep the scope
scope.close()
After clear() the scope is still active, so refs created afterwards
continue to be tracked by the same scope. This is the tool of choice
for long-running sessions where you would otherwise see worker memory
grow monotonically.
Nesting scopes¶
remote_scope behaves like a stack. You can open an outer scope and
create another one inside it; every new ref is registered to the
innermost active scope, so closing the inner scope drops only its
refs and leaves the outer scope running:
# open/open/close/close
scope1 = remote_scope()
scope1.open()
scope2 = remote_scope()
scope2.open()
ref_inner = rdata.phisp[-1, :, 100, :] # tracked by scope2
scope2.close() # drops ref_inner only
ref_outer = rdata.exz[-1] # tracked by scope1
scope1.close() # drops ref_outer
Mixing explicit open() with a with block nests cleanly too:
scope1 = remote_scope()
scope1.open()
with remote_scope() as scope2:
    ref_inner = rdata.phisp[-1, :, 100, :]  # tracked by scope2
# scope2 auto-drops here; scope1 is still open
ref_outer = rdata.exz[-1] # tracked by scope1
scope1.close()
Foot-gun: never use the same scope instance with both open() and a with block. The snippet below looks fine but breaks: with scope: calls __exit__ on the instance, so scope is already closed by the time the block returns. Subsequent refs are tracked by nothing, and scope.close() becomes a no-op:
scope = remote_scope()
scope.open()
with scope:  # ← do NOT hand the same scope to `with`
    ref = rdata.phisp[-1]
# scope is already closed here
leaked = rdata.phisp[-2]  # ← not tracked by any scope!
scope.close()  # ← no-op
When you need both styles, open a new remote_scope() inside the outer one (see the earlier example that uses with remote_scope() as scope2:).
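The stack behaviour and the foot-gun can be reproduced with a small toy model. ToyScope and make_ref below are hypothetical names written for this illustration; they mimic the documented semantics (innermost scope tracks new refs, close() is idempotent, clear() keeps the scope open) and are not emout's implementation.

```python
class ToyScope:
    """Toy model of a stack-based scope (hypothetical, not emout's remote_scope)."""

    _stack = []  # innermost active scope is last

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.is_open = False

    def open(self):
        if not self.is_open:
            self.is_open = True
            ToyScope._stack.append(self)
        return self

    def clear(self):
        """Drop tracked refs but keep the scope active."""
        dropped, self.refs = self.refs, []
        return dropped

    def close(self):
        if not self.is_open:  # idempotent: a second close() is a no-op
            return []
        self.is_open = False
        ToyScope._stack.remove(self)
        return self.clear()

    def __enter__(self):
        return self.open()

    def __exit__(self, *exc_info):
        self.close()
        return False


def make_ref(label):
    """Register a ref with the innermost open scope; return that scope's name."""
    if ToyScope._stack:
        ToyScope._stack[-1].refs.append(label)
        return ToyScope._stack[-1].name
    return None  # tracked by nothing


outer = ToyScope("outer").open()
with ToyScope("inner"):
    print(make_ref("ref_inner"))  # inner
print(make_ref("ref_outer"))      # outer

with outer:                       # foot-gun: same instance handed to `with`
    pass                          # __exit__ closes `outer` here
print(make_ref("leaked"))         # None
print(outer.close())              # []
```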
Image mode (remote_figure)¶
All matplotlib operations run on the server; only PNG bytes come back. Use when you want minimal local memory usage.
import matplotlib.pyplot as plt
from emout.distributed import remote_figure
with remote_figure():
    data.phisp[-1, :, 100, :].plot()
    plt.axhline(y=50, color="red")  # ← runs on server
    plt.xlabel("x [m]")
    plt.title("Custom title")
# ← PNG displayed in Jupyter here
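The "record on the client, replay on the server" idea behind this mode can be sketched in a few lines of plain Python. Everything below (CommandRecorder, replay, FakeAxes) is a hypothetical toy written for this illustration; it is not how emout patches matplotlib, only a minimal instance of the record/replay technique.

```python
class CommandRecorder:
    """Record method calls instead of executing them (toy model, not emout code)."""

    def __init__(self):
        self.commands = []  # list of (method_name, args, kwargs)

    def __getattr__(self, name):
        # Only reached for attributes that do not exist, i.e. the proxied calls.
        def record(*args, **kwargs):
            self.commands.append((name, args, kwargs))
        return record


def replay(commands, target):
    """Apply the recorded calls, in order, to a real object."""
    for name, args, kwargs in commands:
        getattr(target, name)(*args, **kwargs)


class FakeAxes:
    """Stand-in for the plotting backend that lives on the server side."""

    def __init__(self):
        self.labels = {}

    def xlabel(self, text):
        self.labels["x"] = text

    def title(self, text):
        self.labels["title"] = text


# "Client" side: nothing is drawn, the calls are only recorded.
plt_proxy = CommandRecorder()
plt_proxy.xlabel("x [m]")
plt_proxy.title("Custom title")

# "Server" side: the same calls are replayed against the real target.
axes = FakeAxes()
replay(plt_proxy.commands, axes)
print(axes.labels)  # {'x': 'x [m]', 'title': 'Custom title'}
```

Because only the small command list crosses the wire in this model, the client never needs the underlying array, which matches the low local memory use described above.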
Receiving a FigureProxy via as fig¶
remote_figure(...) yields a FigureProxy bound to the Figure that
will be constructed on the worker, so you can grab it with as fig and
call fig.add_axes(...) directly. This skips plt.figure() entirely
and is convenient when you need a multi-axes layout — for example, a 3D
plot with a dedicated colorbar axes:
with remote_figure(figsize=(13, 6), dpi=300) as fig:
    ax = fig.add_axes([0.13, 0.11, 0.57, 0.78], projection="3d")
    cax = fig.add_axes([0.74, 0.12, 0.025, 0.76])
    data.phisp[-1].plot_surfaces(ax=ax, surfaces=data.boundaries)
    ax.view_init(elev=36, azim=-110)
    plt.colorbar(cax=cax, label=r"$\phi$ (V)")
fig is bound to a FigureProxy even when figsize is omitted.
open() / close() style¶
When adding with blocks to existing code is cumbersome, use RemoteFigure
with explicit open() / close():
from emout.distributed import RemoteFigure
rf = RemoteFigure()
rf.open()
data.phisp[-1, :, 100, :].plot()
plt.xlabel("x [m]")
rf.close() # ← commands replayed on server, PNG displayed
RemoteFigure also works as a context manager (with RemoteFigure() as rf: ...).
Note: If you forget to call close(), matplotlib stays monkey-patched and a ResourceWarning is emitted at garbage collection.
Jupyter cell magic (%%remote_figure)¶
Register the magic once per session, then use %%remote_figure at the top
of any cell:
# Register (once)
%load_ext emout.distributed.remote_figure
# or: from emout.distributed import register_magics; register_magics()
%%remote_figure
data.phisp[-1, :, 100, :].plot()
plt.xlabel("x [m]")
Options can be passed on the magic line:
%%remote_figure --dpi 300 --fmt svg --figsize 12,6
data.phisp[-1, :, 100, :].plot()
| Option | Short | Description | Default |
|---|---|---|---|
| --dpi | | Output resolution | |
| --fmt | | Image format | |
| --figsize | | Figure size | matplotlib default |
| | | Emout directory for session lookup | auto |
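For readers curious how a magic line such as --dpi 300 --fmt svg --figsize 12,6 could be turned into typed values, here is a minimal sketch with argparse and shlex. parse_magic_line and parse_figsize are hypothetical helpers, not emout's parser; only the three options shown in the example above are handled, and the defaults are left as None because they are not stated here.

```python
import argparse
import shlex


def parse_figsize(text: str) -> tuple[float, float]:
    """Parse "12,6" into (12.0, 6.0)."""
    width, height = (float(part) for part in text.split(","))
    return width, height


def parse_magic_line(line: str) -> argparse.Namespace:
    """Parse the option part of a %%remote_figure line (hypothetical parser)."""
    parser = argparse.ArgumentParser(prog="remote_figure", add_help=False)
    parser.add_argument("--dpi", type=int, default=None)
    parser.add_argument("--fmt", default=None)
    parser.add_argument("--figsize", type=parse_figsize, default=None)
    return parser.parse_args(shlex.split(line))


opts = parse_magic_line("--dpi 300 --fmt svg --figsize 12,6")
print(opts.dpi, opts.fmt, opts.figsize)  # 300 svg (12.0, 6.0)
```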
Data-transfer mode (compatibility mode)¶
This is the compatibility mode for existing plot()-centric code.
The worker extracts the slice and transfers it locally; matplotlib runs
on the client. For new code, prefer Emout.remote() / remote_scope(),
and keep this mode mainly for low-friction migration of older scripts.
data.phisp[-1, :, 100, :].plot()
plt.axhline(y=50, color="red") # ← local matplotlib
plt.xlabel("x [m]")
plt.title("Custom title")
plt.savefig("output.png")
Only a 2D slice (KB–MB) is transferred; the full 3D array stays on the worker.
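A quick back-of-the-envelope calculation shows why the slice is so much cheaper to move. The grid size below (512 cells per axis, double precision) is a hypothetical example, and the (t, z, y, x) axis order is inferred from the indexing used elsewhere on this page.

```python
def nbytes(shape, itemsize=8):
    """Size in bytes of a dense array with the given shape (float64 by default)."""
    total = itemsize
    for n in shape:
        total *= n
    return total


# Hypothetical grid: nz = ny = nx = 512, double precision.
nz, ny, nx = 512, 512, 512

full_3d = nbytes((nz, ny, nx))   # one time step of phisp
slice_2d = nbytes((nz, nx))      # phisp[-1, :, 100, :]

print(full_3d // 2**20, "MiB")   # 1024 MiB
print(slice_2d // 2**20, "MiB")  # 2 MiB
print(full_3d // slice_2d)       # 512
```

For this grid the transferred slice is 512 times smaller than a single 3D time step, and the gap grows linearly with the size of the sliced axis.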
Backtrace integration¶
Heavy particle-backtrace computations run once on the server; the result stays in worker memory. Re-render with different visualisation parameters without recomputing.
# Computation (runs on server, result cached in worker memory)
result = data.backtrace.get_probabilities(
    x, y, z, vx_range, vy_center, vz_range, ispec=0,
)
# Visualise repeatedly using the same result (no recomputation)
with remote_figure():
    result.vxvz.plot(cmap="viridis")
    plt.title("Velocity distribution (vx-vz)")
with remote_figure():
    result.plot_energy_spectrum(scale="log")
    plt.xlabel("Energy [eV]")
# Free worker memory when done
result.drop()
Both data.backtrace... and data.remote().backtrace... return the same
dedicated proxies (RemoteProbabilityResult / RemoteBacktraceResult).
Use the former when you want to keep existing code almost unchanged, and
the latter when you want one explicit-remote workflow across fields,
boundaries, and backtrace results:
with remote_scope():
    rdata = data.remote()
    bt = rdata.backtrace.get_backtrace(position, velocity, ispec=0)
    result = rdata.backtrace.get_probabilities(
        x, y, z, vx_range, vy_center, vz_range, ispec=0,
    )
    with remote_figure():
        bt.tx.plot()
        result.vxvz.plot(cmap="viridis")
For the backtrace API itself (BacktraceResult / MultiBacktraceResult /
ProbabilityResult, shorthand attribute access, axis lists), see the
dedicated backtrace guide.
Local customisation with fetch()¶
If you need full matplotlib control (e.g. custom annotations, shared colour bars),
use fetch() to pull the small result arrays back to the client:
heatmap = result.vxvz.fetch() # → local HeatmapData
fig, ax = plt.subplots()
heatmap.plot(ax=ax, cmap="plasma")
ax.axhline(y=0, color="red", linestyle="--")
ax.set_title("Custom annotation")
Boundary meshes¶
# Boundary shapes only (lightweight, always local)
data.boundaries.plot()
# Overlay on field (3D array slice-transferred from server)
data.phisp[-1].plot_surfaces(ax=ax, surfaces=data.boundaries)
ax.set_xlabel("x [m]")
Animations (gifplot)¶
gifplot() runs end-to-end on the worker as well: frame generation and
encoding stay on the worker, and only the inline HTML or GIF bytes come
back to the client.
rdata = emout.Emout("output_dir").remote()
with remote_scope():
    rdata.phisp[:, 100, :, :].gifplot()  # inline HTML
    rdata.phisp[:, 100, :, :].gifplot(action="save", filename="out.gif")  # shared FS path
    gif = rdata.phisp[:, 100, :, :].gifplot(action="bytes")  # raw bytes
See the “Remote execution” section of the animations guide for the full options.
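When you take the raw bytes back to the client, you may want to write them to a local file yourself. The helper below is a small sketch for that; save_gif_bytes is a hypothetical function, not part of emout, and it assumes that action="bytes" returns a GIF-encoded byte string, which it checks via the standard 6-byte GIF signature.

```python
from pathlib import Path

GIF_SIGNATURES = (b"GIF87a", b"GIF89a")


def save_gif_bytes(data: bytes, path: str) -> Path:
    """Write GIF bytes to `path` after checking the 6-byte GIF signature.

    Hypothetical helper: assumes `data` is a complete GIF file in memory.
    """
    if not data.startswith(GIF_SIGNATURES):
        raise ValueError("data does not start with a GIF signature")
    target = Path(path)
    target.write_bytes(data)
    return target
```

With the gif variable from the example above, save_gif_bytes(gif, "phisp.gif") would store the animation next to your notebook.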
Explicit connection¶
To connect manually instead of auto-connecting:
from emout.distributed import connect
client = connect() # active/default session
client = connect(name="batch2") # additional named session
client = connect("tls://10.10.64.2:8786", name="batch2") # explicit address + saved credentials
Environment variables¶
| Variable | Description | Default |
|---|---|---|
| | Scheduler IP (overrides auto-detection) | InfiniBand auto |
| EMOUT_DASK_SCHED_PORT | Scheduler port | 10000 + (UID % 50000) |
| | SLURM partition | |
| | Worker cores | |
| | Worker memory | |
| | Job wall time | |
Port selection¶
The scheduler port defaults to 10000 + (UID % 50000), so each user on
the same login node gets a different port automatically (e.g. UID 36291
→ port 46291). If that port is already in use, up to 20 consecutive
ports are probed until a free one is found. Set
EMOUT_DASK_SCHED_PORT to override.
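The selection rule can be sketched with the standard library as follows. default_port, port_is_free, and pick_port are hypothetical helpers, not emout functions; the sketch assumes that the 20 probed ports include the default one and that a port counts as free when it can be bound on the local host.

```python
import os
import socket


def default_port(uid: int) -> int:
    """Per-user default port: 10000 + (UID % 50000)."""
    return 10000 + (uid % 50000)


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if `port` can be bound on `host` right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def pick_port(uid: int, attempts: int = 20) -> int:
    """Honour EMOUT_DASK_SCHED_PORT, else probe `attempts` consecutive ports."""
    override = os.environ.get("EMOUT_DASK_SCHED_PORT")
    if override:
        return int(override)
    start = default_port(uid)
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    raise RuntimeError(f"no free port in {start}..{start + attempts - 1}")


print(default_port(36291))  # 46291
```

On a real login node you would pass os.getuid() as uid; it is kept as a parameter here so the arithmetic can be checked for any user ID.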
Limitations¶
Python >= 3.10 with dask and distributed installed.
All simulation directories must be accessible from the worker node (shared filesystem required).
Worker memory grows with each loaded Emout instance. For very large campaigns, call result.drop() to free cached computation results.