Vault workflow: produce, validate, designate, push
A short, end-to-end counterpart to the Vacua Storage tutorial: take one class of vacua (ISD+) for a single model, validate them, designate them as permanent records, and push them to the HuggingFace vault. Designation and the HF directory layout are explained inline.
Step 0: Setup
Sandbox the vault and cache so the demo does not touch your real vault.
import os, tempfile, warnings
import numpy as np
import jax
jax.config.update("jax_enable_x64", True)
import jax.numpy as jnp
from scipy.optimize import root
import jaxvacua as jvc
import stringforge as sj
from stringforge import LCSDatabase
from stringforge import vacuavault as vv # server-side validation layer
warnings.filterwarnings("ignore")
tmpdir = tempfile.mkdtemp(prefix="vault_workflow_")
sj.set_vault_dir(os.path.join(tmpdir, "vault")) # must be set before designate
db = LCSDatabase.from_local(tmpdir)
print("stringforge :", sj.__version__)
print("vault dir :", os.environ["STRINGFORGE_VAULT"])
stringforge : 0.1.0
vault dir : /var/folders/p7/f3t072gd3rbfgb00y3hj7dwc0000gn/T/vault_workflow_wmtejz76/vault
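The sandboxing idea can be sketched without stringforge at all. The snippet below is a minimal illustration, assuming (as the printout above suggests) that the vault location is communicated through the STRINGFORGE_VAULT environment variable. The helper name sandboxed_vault is hypothetical and is not part of the stringforge API.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def sandboxed_vault(env_var="STRINGFORGE_VAULT", prefix="vault_workflow_"):
    """Point env_var at a throwaway directory, then restore the old state.

    Hypothetical helper for illustration; not a stringforge function.
    """
    previous = os.environ.get(env_var)
    tmpdir = tempfile.mkdtemp(prefix=prefix)
    vault = os.path.join(tmpdir, "vault")
    os.makedirs(vault)
    os.environ[env_var] = vault
    try:
        yield vault
    finally:
        # Restore whatever was set before, then delete the sandbox.
        if previous is None:
            os.environ.pop(env_var, None)
        else:
            os.environ[env_var] = previous
        shutil.rmtree(tmpdir, ignore_errors=True)


with sandboxed_vault() as vault:
    inside = os.environ["STRINGFORGE_VAULT"]
    existed = os.path.isdir(vault)
```

Wrapping the override in a context manager means the cleanup runs even if a later cell raises, which the explicit rmtree at the end of a notebook does not guarantee.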
Step 1: Produce one class of vacua (ISD+)
We use a single local jaxvacua model — a Kreuzer–Skarke geometry addressed
by h12 + model_ID — and solve the SUSY conditions \(D_I W = 0\) for fluxes
drawn in the ISD+ (imaginary-self-dual, “+” branch) mode. The converged
solutions are written as session vacua (Tier 1) via vacua_writer.
model_A = jvc.FluxVacuaFinder(h12=2, model_ID=1, model_type="KS", maximum_degree=2)
model_A.lcs_tree.a_matrix = jnp.array([[4.5, 1.5], [1.5, 0.0]]) # demo geometry
sampler = jvc.data_sampler(model_A, flux_bounds=[-5, 5])
def solve_from_guesses(model, z0, tau0, fluxes0, tol=1e-10, residual_tol=1e-6):
    """Solve D_I W = 0 for each (z, tau, flux) guess; keep converged SUSY vacua."""
    moduli, taus, fluxes = [], [], []
    for i in range(len(z0)):
        x0 = model._convert_complex_to_real(
            z0[i], jnp.conj(z0[i]), tau0[i], jnp.conj(tau0[i]))
        r = root(model.DW_x, x0, args=(fluxes0[i],),
                 jac=model.dDW_x, method="hybr", tol=tol)
        # Accept a solution only if the residual itself is small.
        if float(np.max(np.abs(r.fun))) < residual_tol:
            m, _, t, _ = model._convert_real_to_complex(r.x)
            if not np.any(np.isnan(np.append(m, t))):
                moduli.append(m)
                taus.append(t)
                fluxes.append(fluxes0[i])
    return moduli, taus, fluxes

z0, tau0, f0 = sampler.initial_guesses_ISD(
    N=40, Nmax=int(model_A.D3_tadpole), mode="ISD+",
    moduli_sampling_mode="cone", ISD_oversample_factor=2,
    print_progress=False, filter_moduli=True)
moduli, taus, fluxes = solve_from_guesses(model_A, z0, tau0, f0)

with db.vacua_writer(model=model_A, method="ISD+") as w:
    w.append_batch(jnp.array(moduli), jnp.array(taus), jnp.array(fluxes),
                   is_susy=np.ones(len(moduli), dtype=bool))
    run_id = w._run_id
print(f"ISD+: {len(moduli)}/{len(z0)} converged; stored {w.count} session vacua "
      f"(run {run_id[:8]}...)")
ISD+: 34/40 converged; stored 34 session vacua (run 5c3460e3...)
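The keep-only-converged pattern in solve_from_guesses can be illustrated on a toy problem. The sketch below is not the flux-vacua system: it swaps \(D_I W = 0\) for the scalar equation \(z^2 = c\) and scipy's hybr solver for a hand-written Newton iteration, purely to show why the residual test matters. All names here are illustrative.

```python
import cmath


def newton(f, df, z0, tol=1e-10, max_iter=50):
    """Plain Newton iteration on a complex scalar; returns the last iterate."""
    z = z0
    for _ in range(max_iter):
        slope = df(z)
        if slope == 0:          # stationary point: no Newton step is possible
            break
        step = f(z) / slope
        z = z - step
        if abs(step) < tol:
            break
    return z


def solve_from_guesses(guesses, targets, residual_tol=1e-6):
    """Solve z**2 = c per (guess, c) pair; keep only converged, finite roots."""
    kept = []
    for z0, c in zip(guesses, targets):
        z = newton(lambda w: w * w - c, lambda w: 2 * w, z0)
        residual = abs(z * z - c)
        if residual < residual_tol and not cmath.isnan(z):
            kept.append((z, c))
    return kept


guesses = [1.2 + 0.8j, 0j]       # the second guess sits where the slope vanishes
targets = [2j, 1 + 0j]
solutions = solve_from_guesses(guesses, targets)
print(f"{len(solutions)}/{len(guesses)} converged")
```

The second guess never moves, so the solver returns a point that is not a root. Filtering on the residual rather than on the solver's own status catches this case, which is the same reason the tutorial code inspects r.fun directly.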
Step 2: What designation means
The writer above stored session vacua (Tier 1): fast, append-only, keyed by
a random run_id. Designation promotes a curated subset to permanent
records (Tier 2). designate_vacua does five things:
1. Validates each solution: \(|D_I W| <\) F_term_tol and schema conformance.
2. Deduplicates against existing designated records by flux vector.
3. Computes derived quantities (superpotential W, F-terms, tadpole N_flux, string coupling g_s, and the mass spectrum mass2, m_gravitino) and stores them on the record.
4. Records provenance: label, committed_by, timestamp, notes, jaxvacua_version, and the source run_id.
5. Assigns a unique integer designated_id.
Designated vacua live under STRINGFORGE_VAULT/ and survive clear_cache().
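As a rough mental model of the deduplication, provenance and ID-assignment steps described above, consider the toy function below. It is a sketch under stated assumptions, not stringforge's implementation: records are plain dicts, duplicates are detected by exact flux-vector equality, and IDs are assumed to count up from the largest existing one. The function name designate and the record layout are hypothetical.

```python
from datetime import datetime, timezone


def designate(candidates, existing, label, committed_by, notes=""):
    """Toy designation: skip known flux vectors, stamp provenance, assign IDs.

    `candidates` and `existing` are lists of dicts with a "flux" entry.
    New records are appended to `existing`; the new integer IDs are returned.
    """
    seen = {tuple(rec["flux"]) for rec in existing}
    next_id = max((rec["designated_id"] for rec in existing), default=-1) + 1
    new_ids = []
    for cand in candidates:
        key = tuple(cand["flux"])
        if key in seen:              # duplicate flux vector: do not store twice
            continue
        record = dict(cand)
        record.update(
            designated_id=next_id,
            label=label,
            committed_by=committed_by,
            notes=notes,
            timestamp=datetime.now(timezone.utc).isoformat(),
        )
        existing.append(record)
        seen.add(key)
        new_ids.append(next_id)
        next_id += 1
    return new_ids


vault = []
batch = [{"flux": [1, 0, -2, 3]}, {"flux": [0, 4, 1, -1]}, {"flux": [1, 0, -2, 3]}]
first = designate(batch, vault, "ISDplus_demo", "tutorial@example.com")
second = designate(batch, vault, "ISDplus_demo", "tutorial@example.com")
```

Designating the same batch twice adds nothing the second time, which is the property that makes repeated runs of a notebook safe.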
Mass units.
mass2 and m_gravitino are in the FluxEFT no-scale normalisation, not in \(M_\mathrm{Pl}\): the overall CY volume is not stabilised at this level, so both carry an unfixed volume factor. Only the ratio \(|m|/m_{3/2}\) (in which the volume cancels) is volume-independent.
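To see why only the ratio is meaningful, the sketch below rescales a toy spectrum by an arbitrary factor standing in for the unfixed volume dependence. It assumes that mass2 holds squared masses in the same normalisation as m_gravitino, so that \(|m| = \sqrt{|\mathrm{mass2}|}\), and that masses and the gravitino mass pick up the same factor. The numbers are illustrative only and do not come from a real vacuum.

```python
import math


def mass_ratios(mass2, m_gravitino):
    """Return |m| / m_{3/2} for each squared mass."""
    return [math.sqrt(abs(m2)) / m_gravitino for m2 in mass2]


# Toy spectrum (illustrative numbers only).
mass2 = [4.0, 9.0, 100.0]
m_gravitino = 2.0

# Rescale by an arbitrary factor: masses pick up lam, squared masses lam**2.
lam = 0.037
scaled_mass2 = [lam**2 * m2 for m2 in mass2]
scaled_m_gravitino = lam * m_gravitino

before = mass_ratios(mass2, m_gravitino)
after = mass_ratios(scaled_mass2, scaled_m_gravitino)
```

The two lists agree up to floating-point rounding, whereas the individual entries of mass2 change by a factor of lam squared.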
# Validation can be run standalone (designate_vacua also runs it internally).
session = db.load_vacua(run_id=run_id)
report = db.validate_vacua(session, model=model_A)
n_pass = sum(1 for r in report if r["passed"])
print(f"validate_vacua: {n_pass}/{len(report)} rows pass F-term + schema checks")
# The server-side vacuavault layer applies the same checks to uploaded parquets
# via vv.validate_parquet_file(path, db=..., physics_checks="explicit").
validate_vacua: 34/34 rows pass F-term + schema checks
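The report consumed above is a sequence of per-row results carrying a passed flag. A minimal stand-in, assuming each row is a dict and the two checks are a required-key schema test and an F-term tolerance, might look like the following. The names validate_rows and REQUIRED_KEYS, the key names, and the default tolerance are all hypothetical and are not taken from stringforge.

```python
REQUIRED_KEYS = ("moduli", "tau", "flux", "F_terms")   # hypothetical schema


def validate_rows(rows, f_term_tol=1e-6):
    """Return one report entry per row: {"index", "passed", "reasons"}."""
    report = []
    for i, row in enumerate(rows):
        reasons = []
        missing = [k for k in REQUIRED_KEYS if k not in row]
        if missing:
            reasons.append(f"missing keys: {missing}")
        else:
            worst = max(abs(f) for f in row["F_terms"])
            if worst >= f_term_tol:
                reasons.append(f"max |D_I W| = {worst:.3e} exceeds tolerance")
        report.append({"index": i, "passed": not reasons, "reasons": reasons})
    return report


rows = [
    {"moduli": [0.1 + 2j], "tau": 3j, "flux": [1, 0], "F_terms": [1e-12, 3e-11j]},
    {"moduli": [0.2 + 1j], "tau": 2j, "flux": [0, 1], "F_terms": [1e-3, 0.0]},
    {"moduli": [0.3 + 1j], "tau": 2j},                  # schema violation
]
report = validate_rows(rows)
n_pass = sum(1 for r in report if r["passed"])
print(f"{n_pass}/{len(report)} rows pass")
```

Keeping the failure reasons next to the flag makes it easy to tell a physics failure from a malformed row before deciding what to designate.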
designated_ids = db.designate_vacua(
    session.head(5),
    label="ISDplus_demo",
    committed_by="tutorial@example.com",
    model=model_A,
    notes="Vault-workflow tutorial: ISD+ class.",
)
print(f"Designated {len(designated_ids)} vacua -> designated_id = {designated_ids}")
# The designated records carry the derived quantities, incl. the mass spectrum.
designated = db.load_designated(h12=2)
cols = [c for c in ("designated_id", "is_susy", "g_s", "m_gravitino", "mass2")
        if c in designated.columns]
print(designated[cols].head().to_string())
Designated 5 vacua -> designated_id = [0, 1, 2, 3, 4]
designated_id is_susy g_s m_gravitino mass2
0 0 True 0.210754 10.867611 [61.91154144887241, 61.91154144890059, 112.22550541257384, 112.22550541257682, 77779.30650471494, 77779.30650471509]
1 1 True 0.479169 7.212083 [6.425015231616918, 124.11413515460403, 141.35790210319186, 536.1416321558296, 653.5599049121203, 1412.1737268171164]
2 2 True 0.145370 5.081123 [8.05024347339861, 53.67241846783432, 83.20555920324412, 371.81098639263524, 732468.0310313946, 732468.0310313954]
3 3 True 0.118041 8.049246 [6.984435847901165, 24.963176652552953, 31.363983914866616, 123.25828461530197, 351.23630274468866, 470.83995172722484]
4 4 True 0.190678 2.155204 [0.21567492239794464, 14.79320220991362, 608.9696732271706, 840.7558080025154, 23081.02036721397, 23081.020367214023]
Step 3: Pushing to the HuggingFace vault
push_vacua_to_hub uploads vacua to the shared aschachner/vacua_vault
dataset as a pull request, under a per-model directory derived from the model’s
identity. The same call therefore works for two kinds of model:
- A model from the HF CY-database (LCSDatabase(dataset="tdf"), addressed by ks_id + triang_id) → tdf/h12_{h12}/ks_{ks}_tri_{tri}/community/….
- A local jaxvacua model (addressed by h12 + model_ID, like model_A) → local/h12_{h12}/model_{model_ID}/community/….
Both require HuggingFace credentials (huggingface-cli login or HF_TOKEN), so
the cells below upload only when a token is present.
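The routing rule can be written down as a small pure function. This is only a sketch: it combines the two directory patterns listed above with the <user>_<label>.parquet filename that the cells below print, and it is not the code push_vacua_to_hub actually runs. The function name vault_path and the user name "alice" are hypothetical.

```python
def vault_path(user, label, h12, *, ks_id=None, triang_id=None, model_ID=None):
    """Build the community parquet path from a model's identity.

    CY-database models are identified by (ks_id, triang_id), local models
    by model_ID. Other model families are deliberately not handled here.
    """
    if ks_id is not None and triang_id is not None:
        model_dir = f"tdf/h12_{h12}/ks_{ks_id}_tri_{triang_id}"
    elif model_ID is not None:
        model_dir = f"local/h12_{h12}/model_{model_ID}"
    else:
        raise ValueError("need either (ks_id, triang_id) or model_ID")
    return f"{model_dir}/community/{user}_{label}.parquet"


local_path = vault_path("alice", "ISDplus_demo", 2, model_ID=1)
tdf_path = vault_path("alice", "ISDplus_demo", 2, ks_id=29, triang_id=1)
print(local_path)
print(tdf_path)
```

Deriving the path from the model's identity rather than from a user-supplied string is what lets the same push call serve both kinds of model.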
# (a) Local model (h12 + model_ID) -> local/h12_2/model_1/community/...
if os.environ.get("HF_TOKEN"):
    info = db.push_vacua_to_hub(
        designated, label="ISDplus_demo",
        committed_by="tutorial@example.com",
        model=model_A, create_pr=True)
    print("local push ->", info["file_path"], "| PR:", info.get("pr_url"))
else:
    print("Set HF_TOKEN to push. The local model routes to:")
    print(" local/h12_2/model_1/community/<user>_ISDplus_demo.parquet")
Set HF_TOKEN to push. The local model routes to:
local/h12_2/model_1/community/<user>_ISDplus_demo.parquet
# (b) A model already in the HF CY-database -> tdf/h12_2/ks_29_tri_1/community/...
if os.environ.get("HF_TOKEN"):
    tdf_db = LCSDatabase(dataset="tdf")
    tdf_model = tdf_db.load_model(ks_id=29, triang_id=1)
    # Produce + designate vacua for tdf_model exactly as above, then:
    # tdf_db.push_vacua_to_hub(vacua_df, label=..., committed_by=...,
    #                          model=tdf_model, create_pr=True)
    print("Loaded TDF model; push routes to tdf/h12_2/ks_29_tri_1/community/...")
else:
    print("Set HF_TOKEN (and network) to load a cy-database model; it routes to:")
    print(" tdf/h12_{h12}/ks_{ks}_tri_{tri}/community/<user>_<label>.parquet")
Set HF_TOKEN (and network) to load a cy-database model; it routes to:
tdf/h12_{h12}/ks_{ks}_tri_{tri}/community/<user>_<label>.parquet
Summary
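| Step | Call |
|---|---|
| Produce | db.vacua_writer(model=..., method=...) with w.append_batch(...) |
| Validate | db.validate_vacua(session, model=...) |
| Designate | db.designate_vacua(...) |
| Push | db.push_vacua_to_hub(..., create_pr=True) |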
The model’s identity decides the vault directory: tdf/… for CY-database
models, local/… for local (h12, model_ID) models, cicy/… for CICY.
import shutil
shutil.rmtree(tmpdir, ignore_errors=True)
sj.set_vault_dir(None) # clears the STRINGFORGE_VAULT override
print("cleaned up sandbox")
cleaned up sandbox