# Remote Execution (Dask) — Experimental A remote execution framework that processes large simulation outputs on HPC compute nodes and returns only plot images to the login node (your Jupyter). ## How It Works ``` Login node (Jupyter) Compute node (SLURM worker) emout server start → Scheduler + Worker start ↕ InfiniBand high-speed comm rdata = emout.Emout("dir").remote() with remote_scope(): with remote_figure(): rdata.phisp[-1,:,100,:].plot() → HDF5 load + rendering on server plt.xlabel("custom") → (recorded as commands) ← only PNG bytes (~50 KB) with remote_figure(savefilepath="figure.png"): rdata.phisp[-1,:,100,:].plot() → render on server ← save the image to a file data = emout.Emout("dir") data.phisp[-1,:,100,:].plot() → HDF5 load → 2D slice → transfer ← small array (few KB) plt.xlabel("custom") ← local matplotlib rendering ``` ### Shared Session Architecture A single `RemoteSession` Dask Actor manages all Emout instances on one worker. When you access data from different simulations, the session lazily loads each `Emout` instance on first use and caches it for subsequent calls. This means **results from different simulations can be freely mixed** in the same `remote_figure()` block: ```python from emout.distributed import remote_scope data_a = emout.Emout("/path/to/sim_a").remote() data_b = emout.Emout("/path/to/sim_b").remote() result_a = data_a.backtrace.get_probabilities(...) result_b = data_b.backtrace.get_probabilities(...) with remote_scope(): with remote_figure(figsize=(12, 5)): plt.subplot(1, 2, 1) data_a.phisp[-1, :, 100, :].plot() plt.title("Sim A: potential") plt.subplot(1, 2, 2) result_b.vxvz.plot(cmap="plasma") plt.title("Sim B: backtrace") ``` All commands are replayed on the same worker — no data is transferred to the client. If `savefilepath` is provided, the rendered image can be saved directly in CLI / batch workflows. When the path has an extension, the output format is inferred from it. ## Setup On Python 3.10+, `pip install emout` automatically includes Dask and the TLS dependency used by `emout server`. No extra setup is needed. ### 1. Start the server (once, in a terminal) ```bash emout server start --partition gr20001a --memory 60G --walltime 03:00:00 ``` The InfiniBand IP is auto-detected. `emout` also generates per-user TLS credentials automatically, stores them with user-only permissions, and mirrors the active session to `~/.emout/server.json`. ``` Session: default Scheduler running at tls://10.10.64.2:8786 Detected IP: 10.10.64.2 Workers: 1 ``` By default, only one active server session is allowed per user. To run an additional session intentionally, give it a name: ```bash emout server start --allow-multiple --name batch2 --memory 120G emout server status --all emout server stop --name batch2 ``` ### 2. Use from scripts For new code, start from `Emout.remote()` plus `remote_scope()`. That is the most natural workflow for worker-side reuse and cleanup: ```python import emout from emout.distributed import remote_figure, remote_scope rdata = emout.Emout("output_dir").remote() with remote_scope(): ymid = int(rdata.inp.ny // 2) with remote_figure(): rdata.phisp[-1, :, ymid, :].plot() ``` If the active session is saved, existing code still works through the compatibility mode. The compat mode only follows the active/default session. For additional named sessions, connect explicitly. The compat workflow is summarized later under "Data-transfer mode". ### 3. Stop the server ```bash emout server stop ``` Additional named sessions can be stopped with `emout server stop --name ` or all at once with `emout server stop --all`. If a worker job is cancelled with `scancel` or disappears after walltime timeout, the next `emout server start` / auto-connect treats that session as stale and clears the saved state automatically. Remote execution fails fast instead of waiting forever: compatibility mode falls back to local execution, while explicit remote usage asks you to restart the server. ## Usage Modes ### Recommended mode (`Emout.remote()` + `remote_scope()`) This keeps worker-side objects alive as `RemoteRef` proxies while letting you write code close to normal `emout` / `numpy` style. Expressions such as `-ref`, `ref1 + ref2`, `np.abs(ref)`, and `int(ref)` stay remote until you explicitly fetch them. ```python import matplotlib.pyplot as plt import emout from emout.distributed import remote_figure, remote_scope rdata = emout.Emout("output_dir").remote() with remote_scope(): ymid = int(rdata.inp.ny // 2) with remote_figure(): plt.figure(figsize=(18, 16)) rdata.phisp[-1, 180:400, ymid, :].plot() (-rdata.exz[-1, 180:400, ymid, :]).plot() plt.title("remote expression example") ``` Objects created inside `remote_scope()` are automatically `drop()`-ed when the context exits, so you can reuse intermediate remote results many times within the block without having to manage worker-side cleanup yourself. #### `open()` / `close()` — explicit form for Jupyter If you do not want to indent a whole cell under `with`, call `open()` and `close()` directly. The scope survives across cells, so you can keep `rdata` and every registered ref alive for as long as you need: ```python from emout.distributed import remote_scope scope = remote_scope() scope.open() rdata = data.remote() ref = rdata.phisp[-1, :, 100, :] ref.plot() # ...continue working in other cells with rdata / ref... scope.close() # drops every registered ref in one go ``` `close()` is idempotent, so you rarely need a `try/finally` — calling it twice is a no-op. #### `clear()` — manual GC while the scope stays open In loops that create many intermediate refs, `clear()` drops every registered ref **without** leaving the scope: ```python scope = remote_scope() scope.open() rdata = data.remote() for t in range(100): ref = rdata.phisp[t, :, 100, :] arr = ref.fetch() # ... work with arr ... scope.clear() # release this iteration's refs, keep the scope scope.close() ``` After `clear()` the scope is still active, so refs created afterwards continue to be tracked by the same scope. This is the tool of choice for long-running sessions where you would otherwise see worker memory grow monotonically. #### Nesting scopes `remote_scope` behaves like a stack. You can open an outer scope and create another one inside it; every new ref is registered to **the innermost active scope**, so closing the inner scope drops only its refs and leaves the outer scope running: ```python # open/open/close/close scope1 = remote_scope() scope1.open() scope2 = remote_scope() scope2.open() ref_inner = rdata.phisp[-1, :, 100, :] # tracked by scope2 scope2.close() # drops ref_inner only ref_outer = rdata.exz[-1] # tracked by scope1 scope1.close() # drops ref_outer ``` Mixing explicit `open()` with a ``with`` block nests cleanly too: ```python scope1 = remote_scope() scope1.open() with remote_scope() as scope2: ref_inner = rdata.phisp[-1, :, 100, :] # tracked by scope2 # scope2 auto-drops here; scope1 is still open ref_outer = rdata.exz[-1] # tracked by scope1 scope1.close() ``` > **Foot-gun: never use the same scope instance with both ``open()`` and > a ``with`` block.** The snippet below looks fine but breaks — ``with > scope:`` calls ``__exit__`` on the instance, so ``scope`` is already > closed by the time the block returns. Subsequent refs are tracked by > **nothing**, and ``scope.close()`` becomes a no-op: > > ```python > scope = remote_scope() > scope.open() > with scope: # ← do NOT hand the same scope to `with` > ref = rdata.phisp[-1] > # scope is already closed here > leaked = rdata.phisp[-2] # ← not tracked by any scope! > scope.close() # ← no-op > ``` > > When you need both styles, open **a new** ``remote_scope()`` inside > the outer one (see the example above with ``with remote_scope() as scope2:``). ### Image mode (`remote_figure`) **All matplotlib operations run on the server**; only PNG bytes come back. Use when you want minimal local memory usage. ```python from emout.distributed import remote_figure with remote_figure(): data.phisp[-1, :, 100, :].plot() plt.axhline(y=50, color="red") # ← runs on server plt.xlabel("x [m]") plt.title("Custom title") # ← PNG displayed in Jupyter here ``` #### Receiving a FigureProxy via `as fig` `remote_figure(...)` yields a `FigureProxy` bound to the `Figure` that will be constructed on the worker, so you can grab it with `as fig` and call `fig.add_axes(...)` directly. This skips `plt.figure()` entirely and is convenient when you need a multi-axes layout — for example, a 3D plot with a dedicated colorbar axes: ```python with remote_figure(figsize=(13, 6), dpi=300) as fig: ax = fig.add_axes([0.13, 0.11, 0.57, 0.78], projection="3d") cax = fig.add_axes([0.74, 0.12, 0.025, 0.76]) data.phisp[-1].plot_surfaces(ax=ax, surfaces=data.boundaries) ax.view_init(elev=36, azim=-110) plt.colorbar(cax=cax, label=r"$\phi$ (V)") ``` `fig` is bound to a `FigureProxy` even when `figsize` is omitted. #### `open()` / `close()` style When adding `with` blocks to existing code is cumbersome, use `RemoteFigure` with explicit `open()` / `close()`: ```python from emout.distributed import RemoteFigure rf = RemoteFigure() rf.open() data.phisp[-1, :, 100, :].plot() plt.xlabel("x [m]") rf.close() # ← commands replayed on server, PNG displayed ``` `RemoteFigure` also works as a context manager (`with RemoteFigure() as rf: ...`). > **Note:** If you forget to call `close()`, matplotlib stays monkey-patched > and a `ResourceWarning` is emitted at garbage collection. #### Jupyter cell magic (`%%remote_figure`) Register the magic once per session, then use `%%remote_figure` at the top of any cell: ```python # Register (once) %load_ext emout.distributed.remote_figure # or: from emout.distributed import register_magics; register_magics() ``` ```python %%remote_figure data.phisp[-1, :, 100, :].plot() plt.xlabel("x [m]") ``` Options can be passed on the magic line: ```python %%remote_figure --dpi 300 --fmt svg --figsize 12,6 data.phisp[-1, :, 100, :].plot() ``` | Option | Short | Description | Default | | --- | --- | --- | --- | | `--dpi` | `-d` | Output resolution | `150` | | `--fmt` | `-f` | Image format (`png`, `svg`, …) | `png` | | `--figsize` | | `width,height` | matplotlib default | | `--emout-dir` | | Emout directory for session lookup | auto | ### Data-transfer mode (compatibility mode) This is the compatibility mode for existing `plot()`-centric code. The worker extracts the slice and transfers it locally; matplotlib runs on the client. **For new code, prefer `Emout.remote()` / `remote_scope()`**, and keep this mode mainly for low-friction migration of older scripts. ```python data.phisp[-1, :, 100, :].plot() plt.axhline(y=50, color="red") # ← local matplotlib plt.xlabel("x [m]") plt.title("Custom title") plt.savefig("output.png") ``` Only a 2D slice (KB–MB) is transferred; the full 3D array stays on the worker. ### Backtrace integration Heavy particle-backtrace computations run once on the server; the result stays in worker memory. Re-render with different visualisation parameters without recomputing. ```python # Computation (runs on server, result cached in worker memory) result = data.backtrace.get_probabilities( x, y, z, vx_range, vy_center, vz_range, ispec=0, ) # Visualise repeatedly using the same result (no recomputation) with remote_figure(): result.vxvz.plot(cmap="viridis") plt.title("Velocity distribution (vx-vz)") with remote_figure(): result.plot_energy_spectrum(scale="log") plt.xlabel("Energy [eV]") # Free worker memory when done result.drop() ``` Both `data.backtrace...` and `data.remote().backtrace...` return the same dedicated proxies (`RemoteProbabilityResult` / `RemoteBacktraceResult`). Use the former when you want to keep existing code almost unchanged, and the latter when you want one explicit-remote workflow across fields, boundaries, and backtrace results: ```python with remote_scope(): rdata = data.remote() bt = rdata.backtrace.get_backtrace(position, velocity, ispec=0) result = rdata.backtrace.get_probabilities( x, y, z, vx_range, vy_center, vz_range, ispec=0, ) with remote_figure(): bt.tx.plot() result.vxvz.plot(cmap="viridis") ``` For the backtrace API itself (`BacktraceResult` / `MultiBacktraceResult` / `ProbabilityResult`, shorthand attribute access, axis lists), see the dedicated [backtrace guide](backtrace.md). #### Local customisation with fetch() If you need full matplotlib control (e.g. custom annotations, shared colour bars), use `fetch()` to pull the small result arrays back to the client: ```python heatmap = result.vxvz.fetch() # → local HeatmapData fig, ax = plt.subplots() heatmap.plot(ax=ax, cmap="plasma") ax.axhline(y=0, color="red", linestyle="--") ax.set_title("Custom annotation") ``` ### Boundary meshes ```python # Boundary shapes only (lightweight, always local) data.boundaries.plot() # Overlay on field (3D array slice-transferred from server) data.phisp[-1].plot_surfaces(ax=ax, surfaces=data.boundaries) ax.set_xlabel("x [m]") ``` ### Animations (`gifplot`) `gifplot()` runs end-to-end on the worker as well: frame generation and encoding stay on the worker, and only the inline HTML or GIF bytes come back to the client. ```python rdata = emout.Emout("output_dir").remote() with remote_scope(): rdata.phisp[:, 100, :, :].gifplot() # inline HTML rdata.phisp[:, 100, :, :].gifplot(action="save", filename="out.gif") # shared FS path gif = rdata.phisp[:, 100, :, :].gifplot(action="bytes") # raw bytes ``` See the "Remote execution" section of the [animations guide](animation.md) for the full options. ## Explicit connection To connect manually instead of auto-connecting: ```python from emout.distributed import connect client = connect() # active/default session client = connect(name="batch2") # additional named session client = connect("tls://10.10.64.2:8786", name="batch2") # explicit address + saved credentials ``` ## Environment variables | Variable | Description | Default | | --- | --- | --- | | `EMOUT_DASK_SCHED_IP` | Scheduler IP (overrides auto-detection) | InfiniBand auto | | `EMOUT_DASK_SCHED_PORT` | Scheduler port | `10000 + (UID % 50000)` | | `EMOUT_DASK_PARTITION` | SLURM partition | `gr20001a` | | `EMOUT_DASK_CORES` | Worker cores | `60` | | `EMOUT_DASK_MEMORY` | Worker memory | `60G` | | `EMOUT_DASK_WALLTIME` | Job wall time | `03:00:00` | ### Port selection The scheduler port defaults to `10000 + (UID % 50000)`, so each user on the same login node gets a different port automatically (e.g. UID 36291 → port 46291). If that port is already in use, up to 20 consecutive ports are probed until a free one is found. Set `EMOUT_DASK_SCHED_PORT` to override. ## Limitations - Python >= 3.10 with `dask` and `distributed` installed. - All simulation directories must be accessible from the worker node (shared filesystem required). - Worker memory grows with each loaded Emout instance. For very large campaigns, call `result.drop()` to free cached computation results.