Step 5: Configuration File Construction Logic
This notebook explains how to build SimulationConfig objects, from a minimal three-field case
to complex multi-prototype tissue architectures.
Config object hierarchy
SimulationConfig                  ← main config (required for SpotlessSimulator)
  └── celltype_morphology         ← dict[str, MorphologySpec] (optional)
  └── default_morphology          ← MorphologySpec (optional)

PrototypeSpecInstance             ← one placed pattern element
  └── spec: PrototypeSpec         ← template (pattern, composition, morphology)

PrototypeSceneInstance            ← higher-level: groups specs sharing a center
  └── scene: PrototypeScene       ← template for a whole micro-environment
        └── specs: List[PrototypeSpec]
You can use either the low-level PrototypeSpecInstance list approach (full control)
or the higher-level PrototypeScene approach (reuse the same template at multiple positions).
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from STpuppeteer.simulation import SimulationConfig, SpotlessSimulator
from STpuppeteer.simulation.config import (
    MorphologySpec,
    PrototypeSpec,
    PrototypeSpecInstance,
    PrototypeScene,
    PrototypeSceneInstance,
)
CT_PALETTE = {"ct_0": "#49997c", "ct_1": "#1ebecd", "ct_2": "#ae3918"}
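# One colour per prototype_id; three scene instances (ids 1 to 3) are placed later,
# so id 3 needs its own entry or it falls back to the grey of id 0.
PROTO_PALETTE = {0: "#cccccc", 1: "#e377c2", 2: "#17becf", 3: "#bcbd22"}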
plt.rcParams.update({"figure.dpi": 110, "axes.spines.top": False, "axes.spines.right": False})
print("Imports OK")
Imports OK
5.1 Minimal configuration
Only three fields are required: n_cells, n_celltype, and n_genes. Everything else falls back to a default, and summary() prints the resolved values.
minimal_config = SimulationConfig(
    n_cells=300,
    n_celltype=3,
    n_genes=200,
)
print(minimal_config.summary())
print("\nDerived values:")
print(f" celltype_names : {minimal_config.celltype_names}")
print(f" n_housekeeping : {minimal_config.n_housekeeping_genes}")
print(f" default_morpho : {minimal_config.default_morphology}")
============================================================
STpuppeteer Simulation Configuration
============================================================
Random Seed: 42
Cell/Tissue Structure:
Cells: 300
Cell types: 3
Cell type distribution: {'ct_0': 0.4, 'ct_1': 0.3, 'ct_2': 0.3}
Spatial continuity: 0.2
Boundary fuzziness: 0.05
Default morphology: MorphologySpec(mean_area=24.0, cv_area=0.5, elongation=1.2, boundary_noise_apt=0.1, n_vertices=12, orientation_mode='random', orientation_noise=0.1, orientation_axis=None, expansion_ratio=1.8)
Gene Expression:
Total genes: 200
Marker genes per type: {'ct_0': 50, 'ct_1': 50, 'ct_2': 50}
Housekeeping genes: 50
Marker expression: µ=0.75, CV=0.6
Silence expression: µ=0.05, CV=1.0
Theta prior: α=2.0, rate=1.0
Transcript Behavior:
Leakage by celltype: {'ct_0': 0.1, 'ct_1': 0.1, 'ct_2': 0.1}
Leakage distance factor: 1.0
============================================================
Derived values:
celltype_names : ['ct_0', 'ct_1', 'ct_2']
n_housekeeping : 50
default_morpho : MorphologySpec(mean_area=24.0, cv_area=0.5, elongation=1.2, boundary_noise_apt=0.1, n_vertices=12, orientation_mode='random', orientation_noise=0.1, orientation_axis=None, expansion_ratio=1.8)
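The gene counts in the summary fit a simple budget: whatever is left after each cell type receives its marker genes becomes housekeeping. The sketch below checks that reading against the printed numbers. The helper `housekeeping_count` is hypothetical, and the formula is inferred from the summaries in this notebook rather than taken from the library source.

```python
def housekeeping_count(n_genes: int, n_celltype: int, n_markers: int) -> int:
    """Genes left over after every cell type gets n_markers marker genes.

    Assumption: inferred from the printed summaries, not from STpuppeteer itself.
    """
    remaining = n_genes - n_celltype * n_markers
    if remaining < 0:
        raise ValueError("marker genes exceed the total gene budget")
    return remaining


# Minimal config: 200 genes, 3 cell types, 50 markers per type
print(housekeeping_count(200, 3, 50))  # 50
```

The same arithmetic gives 60 for the full configuration in section 5.6 (150 genes, 30 markers per type), which matches its summary.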
5.2 Per-cell-type morphology
Every MorphologySpec field defaults to None, which means the value is inherited from default_morphology or, failing that, from the built-in defaults listed in the table.
Set only the fields you want to override.
| Field | Description | Default |
|---|---|---|
| mean_area | Mean nucleus area (µm²) | 24.0 |
| cv_area | Area variability (CV) | 0.5 |
| elongation | Axis ratio (1 = round, >1 = elongated) | 1.2 |
| orientation_mode | "random", "radial", "tangential", "aligned" | "random" |
| orientation_noise | Standard deviation in orientation (rad) | 0.1 |
| boundary_noise_apt | Outline irregularity [0, 1) | 0.1 |
| n_vertices | Polygon resolution | 12 |
| expansion_ratio | Cell radius / nucleus radius | 1.8 |
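To make the orientation modes concrete, here is a small geometric sketch. It assumes "radial" aligns a cell's long axis with the line joining the cell to the pattern centre and "tangential" turns that axis by 90 degrees, which is how the code comments in this section describe them. The function `orientation_angle` is hypothetical and is not part of STpuppeteer.

```python
import math


def orientation_angle(cell_xy, center_xy, mode):
    """Return a long-axis angle in radians for a cell relative to a pattern centre.

    Hypothetical helper: 'radial' follows the line to the centre and
    'tangential' is perpendicular to it. Angles are folded into [0, pi)
    because an axis has no head or tail.
    """
    dx = cell_xy[0] - center_xy[0]
    dy = cell_xy[1] - center_xy[1]
    radial = math.atan2(dy, dx)
    if mode == "radial":
        angle = radial
    elif mode == "tangential":
        angle = radial + math.pi / 2
    else:
        raise ValueError(f"unsupported mode: {mode}")
    return angle % math.pi


# A cell directly to the right of the centre
print(orientation_angle((10.0, 0.0), (0.0, 0.0), "radial"))      # 0.0
print(orientation_angle((10.0, 0.0), (0.0, 0.0), "tangential"))  # pi / 2
```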
# Each cell type gets a distinct morphological identity
ct_morphology = {
    "ct_0": MorphologySpec(
        mean_area=30.0, elongation=1.1,
        orientation_mode="random",
        expansion_ratio=2.0,            # large cytoplasm
    ),
    "ct_1": MorphologySpec(
        mean_area=18.0, elongation=1.8,
        orientation_mode="radial",      # elongated, pointing inward
        expansion_ratio=1.5,
    ),
    "ct_2": MorphologySpec(
        mean_area=14.0, elongation=2.5,
        orientation_mode="tangential",  # stretched parallel to ring
        expansion_ratio=1.2,
    ),
}
# Partial specs — fields not set inherit from default_morphology or hard-coded defaults
partial = MorphologySpec(mean_area=45.0) # only mean_area set
resolved = partial.resolve(default=MorphologySpec.default())
print(f"Partial spec: mean_area={partial.mean_area}, elongation={partial.elongation}")
print(f"Resolved spec: mean_area={resolved.mean_area}, elongation={resolved.elongation}")
print(f" orientation_mode={resolved.orientation_mode}")
Partial spec: mean_area=45.0, elongation=None
Resolved spec: mean_area=45.0, elongation=1.2
orientation_mode=random
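The inheritance rule can be pictured as "keep what was set, fill the rest from the default". The sketch below reproduces that behaviour with a hypothetical three-field dataclass, `MiniSpec`. It illustrates the idea only and makes no claim about how MorphologySpec.resolve() is implemented.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class MiniSpec:
    """Hypothetical stand-in for MorphologySpec with three of its fields."""
    mean_area: Optional[float] = None
    elongation: Optional[float] = None
    orientation_mode: Optional[str] = None

    def resolve(self, default: "MiniSpec") -> "MiniSpec":
        # Keep every field that was set; take the unset ones from `default`.
        filled = {
            f.name: getattr(default, f.name)
            for f in fields(self)
            if getattr(self, f.name) is None
        }
        return replace(self, **filled)


BUILT_IN = MiniSpec(mean_area=24.0, elongation=1.2, orientation_mode="random")
partial_spec = MiniSpec(mean_area=45.0)
print(partial_spec.resolve(BUILT_IN))
# MiniSpec(mean_area=45.0, elongation=1.2, orientation_mode='random')
```

Returning a new object instead of mutating the partial spec keeps the original override visible, which mirrors the printed output above where the partial spec still reports elongation=None.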
5.3 Scalar vs list vs dict parameters
Several SimulationConfig parameters support three input formats:
- Scalar — same value for all cell types
- List — one value per cell type (in order ct_0, ct_1, …)
- Dict — explicit per-celltype mapping
# These three configurations are equivalent for n_celltype=3:
cfg_scalar = SimulationConfig(
    n_cells=100, n_celltype=3, n_genes=200,
    leakage_by_celltype=0.1,                                      # scalar
)
cfg_list = SimulationConfig(
    n_cells=100, n_celltype=3, n_genes=200,
    leakage_by_celltype=[0.1, 0.1, 0.1],                          # list
)
cfg_dict = SimulationConfig(
    n_cells=100, n_celltype=3, n_genes=200,
    leakage_by_celltype={"ct_0": 0.1, "ct_1": 0.1, "ct_2": 0.1},  # dict
)
print("Scalar leakage:", cfg_scalar.leakage_by_celltype)
print("List leakage:", cfg_list.leakage_by_celltype)
print("Dict leakage:", cfg_dict.leakage_by_celltype)
# Per-celltype differentiation (shown here with a dict):
cfg_diff = SimulationConfig(
    n_cells=100, n_celltype=3, n_genes=200,
    leakage_by_celltype={"ct_0": 0.05, "ct_1": 0.20, "ct_2": 0.02},
)
print("\nDifferentiated leakage:", cfg_diff.leakage_by_celltype)
Scalar leakage: {'ct_0': 0.1, 'ct_1': 0.1, 'ct_2': 0.1}
List leakage: {'ct_0': 0.1, 'ct_1': 0.1, 'ct_2': 0.1}
Dict leakage: {'ct_0': 0.1, 'ct_1': 0.1, 'ct_2': 0.1}
Differentiated leakage: {'ct_0': 0.05, 'ct_1': 0.2, 'ct_2': 0.02}
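All three input formats end up as the same per-celltype dict in the output above. A minimal sketch of such a normalisation step follows. The function `normalise_per_celltype` and its validation rules are assumptions made for illustration; SimulationConfig may handle edge cases differently.

```python
from collections.abc import Mapping


def normalise_per_celltype(value, names):
    """Expand a scalar, list, or dict into {celltype_name: value}.

    Hypothetical helper; the real SimulationConfig may validate differently.
    """
    if isinstance(value, Mapping):
        missing = set(names) - set(value)
        if missing:
            raise ValueError(f"missing cell types: {sorted(missing)}")
        return {name: float(value[name]) for name in names}
    if isinstance(value, (int, float)):
        # Scalar: broadcast the same value to every cell type.
        return {name: float(value) for name in names}
    values = list(value)
    if len(values) != len(names):
        raise ValueError("list length must equal the number of cell types")
    # List: values are matched to names by position (ct_0, ct_1, ...).
    return dict(zip(names, map(float, values)))


names = ["ct_0", "ct_1", "ct_2"]
print(normalise_per_celltype(0.1, names))
print(normalise_per_celltype([0.05, 0.20, 0.02], names))
```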
5.4 Building a PrototypeSpec
A PrototypeSpec is a template with no position yet. It defines:
- pattern: which spatial primitive to use ("cluster", "ring", "chain")
- pattern_params: primitive-specific sizing parameters
- cell_type_composition: fraction of each cell type within this element (must sum to 1)
- morphology: MorphologySpec for nuclei in this element
- n_cells: how many cells to place
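The sum-to-1 rule for cell_type_composition is easy to check before building a spec. The sketch below validates a composition and reports the expected number of cells per type, using the tumour cluster values from this section (0.85 / 0.15, 55 cells). The helper `expected_counts` is hypothetical and returns expectations, not the integer counts a sampler would draw.

```python
import math


def expected_counts(composition: dict, n_cells: int) -> dict:
    """Check that fractions sum to 1 and return the expected cells per type.

    Hypothetical helper: returns fraction * n_cells, not sampled integers.
    """
    total = sum(composition.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"composition sums to {total}, expected 1.0")
    return {name: frac * n_cells for name, frac in composition.items()}


print(expected_counts({"ct_0": 0.85, "ct_1": 0.15}, n_cells=55))
```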
# ── Tumour cluster spec ──────────────────────────────────────────────────────
tumour_spec = PrototypeSpec(
    pattern="cluster",
    pattern_params={
        "radius": 30.0,
        "density_profile": "Gaussian",  # peaked centre
    },
    cell_type_composition={"ct_0": 0.85, "ct_1": 0.15},
    morphology=MorphologySpec(
        mean_area=40.0, elongation=1.2,
        orientation_mode="random",
    ),
    n_cells=55,
)
# ── Immune ring spec (surrounds the cluster) ──────────────────────────────────
immune_ring_spec = PrototypeSpec(
    pattern="ring",
    pattern_params={"inner_radius": 35.0, "outer_radius": 58.0},
    cell_type_composition={"ct_1": 0.75, "ct_2": 0.25},
    morphology=MorphologySpec(
        mean_area=16.0, elongation=2.0,
        orientation_mode="radial",  # cells point toward cluster centre
    ),
    n_cells=75,
)
print("tumour_spec :", tumour_spec)
print("\nimmune_ring_spec:", immune_ring_spec)
tumour_spec : PrototypeSpec(pattern='cluster', pattern_params={'radius': 30.0, 'density_profile': 'Gaussian'}, cell_type_composition={'ct_0': 0.85, 'ct_1': 0.15}, morphology=MorphologySpec(mean_area=40.0, cv_area=None, elongation=1.2, boundary_noise_apt=None, n_vertices=None, orientation_mode='random', orientation_noise=None, orientation_axis=None, expansion_ratio=None), n_cells=55, spatial_bias=None)
immune_ring_spec: PrototypeSpec(pattern='ring', pattern_params={'inner_radius': 35.0, 'outer_radius': 58.0}, cell_type_composition={'ct_1': 0.75, 'ct_2': 0.25}, morphology=MorphologySpec(mean_area=16.0, cv_area=None, elongation=2.0, boundary_noise_apt=None, n_vertices=None, orientation_mode='radial', orientation_noise=None, orientation_axis=None, expansion_ratio=None), n_cells=75, spatial_bias=None)
Approach A: PrototypeSpecInstance list (full control)
Each PrototypeSpecInstance places one spec at a given center with its own scale, prototype_id, and seed.
proto_instances = [
    # Micro-environment 1: upper-left
    PrototypeSpecInstance(spec=tumour_spec, center=np.array([100.0, 250.0]),
                          scale=1.0, prototype_id=1, seed=10),
    PrototypeSpecInstance(spec=immune_ring_spec, center=np.array([100.0, 250.0]),
                          scale=1.0, prototype_id=1, seed=11),
    # Micro-environment 2: lower-right (slightly larger via scale=1.3)
    PrototypeSpecInstance(spec=tumour_spec, center=np.array([280.0, 120.0]),
                          scale=1.3, prototype_id=2, seed=20),
    PrototypeSpecInstance(spec=immune_ring_spec, center=np.array([280.0, 120.0]),
                          scale=1.3, prototype_id=2, seed=21),
]
for p in proto_instances:
    print(f" id={p.prototype_id} pattern={p.spec.pattern:8s} n_cells={p.spec.n_cells} "
          f"scale={p.scale} center={p.center}")
 id=1 pattern=cluster  n_cells=55 scale=1.0 center=[100. 250.]
 id=1 pattern=ring     n_cells=75 scale=1.0 center=[100. 250.]
 id=2 pattern=cluster  n_cells=55 scale=1.3 center=[280. 120.]
 id=2 pattern=ring     n_cells=75 scale=1.3 center=[280. 120.]
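To build intuition for what a placed "ring" element covers, the sketch below draws points uniformly by area inside an annulus. It assumes that scale multiplies both inner_radius and outer_radius, which the notebook does not state explicitly. `sample_ring` is a hypothetical function and not the library's placement routine.

```python
import math
import random


def sample_ring(center, inner_radius, outer_radius, n_cells, scale=1.0, seed=None):
    """Draw n_cells points uniformly by area inside a scaled annulus.

    Hypothetical sketch: assumes `scale` multiplies both radii.
    """
    rng = random.Random(seed)
    r_in, r_out = inner_radius * scale, outer_radius * scale
    points = []
    for _ in range(n_cells):
        # Sampling the squared radius uniformly gives a uniform area density.
        r = math.sqrt(rng.uniform(r_in ** 2, r_out ** 2))
        theta = rng.uniform(0.0, 2.0 * math.pi)
        points.append((center[0] + r * math.cos(theta),
                       center[1] + r * math.sin(theta)))
    return points


# Values taken from the second immune ring instance above
ring = sample_ring((280.0, 120.0), 35.0, 58.0, n_cells=75, scale=1.3, seed=21)
radii = [math.hypot(x - 280.0, y - 120.0) for x, y in ring]
print(f"radii span {min(radii):.1f} to {max(radii):.1f}")
```

Under this assumption, every point lies between 45.5 and 75.4 from the centre (35 × 1.3 and 58 × 1.3).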
Approach B: PrototypeScene (reuse the same template)
When you want to scatter many copies of the same micro-environment across the tissue,
PrototypeScene.sample_instance(center) draws a random scale and orientation for each copy from scale_range and orientation_range.
# Group specs into a reusable scene
tumour_scene = PrototypeScene(
    name="tumour_with_immune_ring",
    specs=[tumour_spec, immune_ring_spec],
    scale_range=(0.7, 1.4),            # each instance will be randomly scaled
    orientation_range=(0, 2 * np.pi),  # each instance will be randomly rotated
)
# Place it at three positions
centers = [np.array([100., 250.]), np.array([280., 120.]), np.array([200., 380.])]
scene_instances = [
    tumour_scene.sample_instance(center=c, seed=i * 10)
    for i, c in enumerate(centers)
]
# Convert to spec instances (the API that initialize_cells() accepts)
all_spec_instances = []
for i, si in enumerate(scene_instances):
    si.prototype_id = i + 1  # assign unique prototype_id
    all_spec_instances.extend(si.to_spec_instances())
print(f"Scene instances : {len(scene_instances)}")
print(f"Spec instances : {len(all_spec_instances)}")
for p in all_spec_instances:
    print(f" id={p.prototype_id} pattern={p.spec.pattern:8s} "
          f"scale={p.scale:.2f} center={np.round(p.center, 1)}")
Scene instances : 3
Spec instances : 6
 id=1 pattern=cluster  scale=1.15 center=[100. 250.]
 id=1 pattern=ring     scale=1.15 center=[100. 250.]
 id=2 pattern=cluster  scale=1.37 center=[280. 120.]
 id=2 pattern=ring     scale=1.37 center=[280. 120.]
 id=3 pattern=cluster  scale=0.90 center=[200. 380.]
 id=3 pattern=ring     scale=0.90 center=[200. 380.]
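The per-instance randomisation can be pictured as one seeded draw per range. The sketch below assumes uniform sampling, which the notebook does not confirm. `sample_scale_and_rotation` is a hypothetical helper, and its numbers will not match the library output above because the random generator and draw order differ.

```python
import math
import random


def sample_scale_and_rotation(scale_range, orientation_range, seed=None):
    """Draw one (scale, rotation) pair uniformly from the two ranges.

    Hypothetical helper; uniform sampling is an assumption.
    """
    rng = random.Random(seed)
    scale = rng.uniform(*scale_range)
    rotation = rng.uniform(*orientation_range)
    return scale, rotation


for i in range(3):
    scale, rotation = sample_scale_and_rotation(
        (0.7, 1.4), (0.0, 2 * math.pi), seed=i * 10
    )
    print(f"instance {i + 1}: scale={scale:.2f} rotation={rotation:.2f} rad")
```

Passing an explicit seed per instance, as the notebook does with seed=i * 10, makes each placement reproducible on its own, independent of how many other instances are sampled.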
5.6 Full configuration assembly and simulation
full_config = SimulationConfig(
    seed=42,
    canvas_size=450.0,
    # ── Cell structure ────────────────────────────────────────────────────────
    n_cells=1300,
    n_celltype=3,
    celltype_proportion=[0.5, 0.3, 0.2],
    continuity=0.2,
    fuzziness=0.05,
    # ── Per-celltype morphology ──────────────────────────────────────────────
    celltype_morphology=ct_morphology,  # defined earlier in section 5.2
    # ── Gene expression ──────────────────────────────────────────────────────
    n_genes=150, n_markers=30,
    marker_mu=0.75, marker_cv=0.6,
    silence_mu=0.05, theta_alpha=2.0,
    # ── Leakage (per celltype) ───────────────────────────────────────────────
    leakage_by_celltype={"ct_0": 0.05, "ct_1": 0.15, "ct_2": 0.08},
    leak_dist_factor=1.0,
)
print(full_config.summary())
============================================================
STpuppeteer Simulation Configuration
============================================================
Random Seed: 42
Cell/Tissue Structure:
Cells: 1300
Cell types: 3
Cell type distribution: {'ct_0': 0.5, 'ct_1': 0.3, 'ct_2': 0.2}
Spatial continuity: 0.2
Boundary fuzziness: 0.05
Default morphology: MorphologySpec(mean_area=24.0, cv_area=0.5, elongation=1.2, boundary_noise_apt=0.1, n_vertices=12, orientation_mode='random', orientation_noise=0.1, orientation_axis=None, expansion_ratio=1.8)
Gene Expression:
Total genes: 150
Marker genes per type: {'ct_0': 30, 'ct_1': 30, 'ct_2': 30}
Housekeeping genes: 60
Marker expression: µ=0.75, CV=0.6
Silence expression: µ=0.05, CV=1.0
Theta prior: α=2.0, rate=1.0
Transcript Behavior:
Leakage by celltype: {'ct_0': 0.05, 'ct_1': 0.15, 'ct_2': 0.08}
Leakage distance factor: 1.0
============================================================
sim = SpotlessSimulator(full_config)
sim.generate_gene_parameters()
sim.initialize_cells(prototype_instances=all_spec_instances)
cell_gdf = sim.cell_gdf
cell_gdf["_ct_color"] = cell_gdf["celltype"].map(CT_PALETTE)
cell_gdf["_proto_color"] = cell_gdf["prototype_id"].apply(
    lambda x: PROTO_PALETTE.get(x, PROTO_PALETTE[0])
)
print(f"Total cells: {len(cell_gdf)}")
print(cell_gdf.groupby(["prototype_id", "celltype"]).size().rename("n").to_string())
fig, axs = plt.subplots(1, 2, figsize=(14, 7))
for ax, col, title, palette, col_key in [
    (axs[0], "_ct_color", "Coloured by cell type", CT_PALETTE, "celltype"),
    (axs[1], "_proto_color", "Coloured by prototype origin", PROTO_PALETTE, "prototype_id"),
]:
    cell_gdf.set_geometry("cell_geometry").plot(
        ax=ax, color=cell_gdf[col], alpha=0.3, edgecolor="k", linewidth=0.25)
    cell_gdf.set_geometry("nucleus_geometry").plot(
        ax=ax, color=cell_gdf[col], edgecolor="none", alpha=0.75)
    ax.set_aspect("equal")
    ax.set_title(title)
    ax.set_xlabel("x (µm)")
axs[0].set_ylabel("y (µm)")
ct_handles = [mpatches.Patch(facecolor=c, label=k) for k, c in CT_PALETTE.items()]
pt_handles = [mpatches.Patch(facecolor=PROTO_PALETTE[k], label=f"proto {k}")
              for k in sorted(PROTO_PALETTE)]
axs[0].legend(handles=ct_handles, title="Cell type")
axs[1].legend(handles=pt_handles, title="Origin")
fig.suptitle("Full config — 3 PrototypeScene instances + background", fontsize=12)
fig.tight_layout()
plt.show()
Total cells: 1300
prototype_id celltype
0 ct_0 415
ct_1 245
ct_2 165
1 ct_0 60
ct_1 76
ct_2 22
2 ct_0 92
ct_1 88
ct_2 26
3 ct_0 42
ct_1 59
ct_2 10
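A quick arithmetic check on the table above confirms that the prototype cells and the remaining cells together account for all 1300 cells. The counts are copied from the printed output. Treating prototype_id 0 as background is an assumption based on the plot title and the palette fallback.

```python
# Cell counts copied from the groupby output: {prototype_id: {celltype: n}}
counts = {
    0: {"ct_0": 415, "ct_1": 245, "ct_2": 165},  # assumed to be background
    1: {"ct_0": 60, "ct_1": 76, "ct_2": 22},
    2: {"ct_0": 92, "ct_1": 88, "ct_2": 26},
    3: {"ct_0": 42, "ct_1": 59, "ct_2": 10},
}

per_prototype = {pid: sum(by_type.values()) for pid, by_type in counts.items()}
total = sum(per_prototype.values())
print(per_prototype)  # {0: 825, 1: 158, 2: 206, 3: 111}
print(total)          # 1300

# Cell type fractions among the id-0 cells
fractions = {ct: round(n / per_prototype[0], 3) for ct, n in counts[0].items()}
print(fractions)      # {'ct_0': 0.503, 'ct_1': 0.297, 'ct_2': 0.2}
```

The id-0 fractions sit close to the configured celltype_proportion of [0.5, 0.3, 0.2].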
5.7 Config serialisation
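config.to_dict() converts the parameters to a dictionary, which is useful for logging,
reproducibility, or passing a config over a network. Nested objects such as MorphologySpec may still need a custom serialiser before the dictionary can be written as JSON.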
import json
import dataclasses
cfg_dict = full_config.to_dict()
# Some fields (like MorphologySpec objects) need custom serialisation for JSON
def default_serializer(obj):
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        return dataclasses.asdict(obj)
    return str(obj)
cfg_json = json.dumps(cfg_dict, default=default_serializer, indent=2)
print("Config as JSON (first 800 chars):")
print(cfg_json[:800], "...")
Config as JSON (first 800 chars):
{
"seed": 42,
"canvas_size": 450.0,
"n_cells": 1300,
"n_celltype": 3,
"celltype_proportion": {
"ct_0": 0.5,
"ct_1": 0.3,
"ct_2": 0.2
},
"continuity": 0.2,
"fuzziness": 0.05,
"celltype_morphology": {
"ct_0": {
"mean_area": 30.0,
"cv_area": null,
"elongation": 1.1,
"boundary_noise_apt": null,
"n_vertices": null,
"orientation_mode": "random",
"orientation_noise": null,
"orientation_axis": null,
"expansion_ratio": 2.0
},
"ct_1": {
"mean_area": 18.0,
"cv_area": null,
"elongation": 1.8,
"boundary_noise_apt": null,
"n_vertices": null,
"orientation_mode": "radial",
"orientation_noise": null,
"orientation_axis": null,
"expansion_ratio": 1.5
},
...
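The `default=` hook of json.dumps is called only for objects the encoder cannot handle natively, which is why the nested specs appear as plain JSON objects above. The sketch below shows the same mechanism end to end with a hypothetical two-field dataclass, `MiniSpec`, standing in for MorphologySpec, including the round trip back through json.loads.

```python
import dataclasses
import json
from typing import Optional


@dataclasses.dataclass
class MiniSpec:
    """Hypothetical stand-in for a nested config object."""
    mean_area: Optional[float] = None
    elongation: Optional[float] = None


def to_jsonable(obj):
    # json.dumps calls this only for objects it cannot serialise natively.
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        return dataclasses.asdict(obj)
    return str(obj)


payload = {"seed": 42, "celltype_morphology": {"ct_0": MiniSpec(mean_area=30.0)}}
restored = json.loads(json.dumps(payload, default=to_jsonable))
print(restored["celltype_morphology"]["ct_0"])
# {'mean_area': 30.0, 'elongation': None}
```

After the round trip the nested values are plain dicts, so rebuilding typed objects takes an explicit step such as MiniSpec(**restored["celltype_morphology"]["ct_0"]).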
Summary: config construction checklist
| Step | Object | Required? |
|---|---|---|
| 1. Cell counts & types | SimulationConfig(n_cells, n_celltype, n_genes) | Yes |
| 2. Spatial organisation | continuity, fuzziness | Optional (defaults work) |
| 3. Per-type morphology | MorphologySpec → celltype_morphology dict | Optional |
| 4. Leakage | leakage_by_celltype, leak_dist_factor | Optional |
| 5. Prototype specs | PrototypeSpec (pattern, composition, morphology, n_cells) | Optional |
| 6. Instantiate | PrototypeSpecInstance or PrototypeScene.sample_instance() | Only with prototypes |
| 7. Run | SpotlessSimulator(config).initialize_cells(prototype_instances=…) | Yes |
Every simulation follows the same path: SimulationConfig → SpotlessSimulator → results.