Vault workflow: produce, validate, designate, push

A short, end-to-end counterpart to the Vacua Storage tutorial: produce one class of vacua (ISD+) for a single model, validate them, designate them as permanent records, and push them to the HuggingFace vault. Designation and the HF directory layout are explained inline.

Step 0: Setup

Sandbox the vault and cache so the demo does not touch your real vault.

import os, tempfile, warnings
import numpy as np
import jax
jax.config.update("jax_enable_x64", True)
import jax.numpy as jnp
from scipy.optimize import root

import jaxvacua as jvc
import stringforge as sj
from stringforge import LCSDatabase
from stringforge import vacuavault as vv          # server-side validation layer

warnings.filterwarnings("ignore")

tmpdir = tempfile.mkdtemp(prefix="vault_workflow_")
sj.set_vault_dir(os.path.join(tmpdir, "vault"))   # must be set before designate
db = LCSDatabase.from_local(tmpdir)

print("stringforge :", sj.__version__)
print("vault dir   :", os.environ["STRINGFORGE_VAULT"])
stringforge : 0.1.0
vault dir   : /var/folders/p7/f3t072gd3rbfgb00y3hj7dwc0000gn/T/vault_workflow_wmtejz76/vault
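
The sandboxing idea can also be wrapped in a context manager so the override is always undone, even if a cell raises. The sketch below is a minimal illustration using only the standard library: `sandboxed_vault` is a hypothetical helper, not part of stringforge, and it assumes the library reads the `STRINGFORGE_VAULT` environment variable when it resolves the vault path (which the printout above suggests, but which is not confirmed here).

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def sandboxed_vault(env_var="STRINGFORGE_VAULT"):
    """Point env_var at a throwaway directory, then restore the old value.

    Hypothetical helper: it only manipulates the environment variable and
    assumes the library consults that variable to find the vault.
    """
    previous = os.environ.get(env_var)
    with tempfile.TemporaryDirectory(prefix="vault_workflow_") as tmp:
        vault = os.path.join(tmp, "vault")
        os.makedirs(vault)
        os.environ[env_var] = vault
        try:
            yield vault
        finally:
            # Restore whatever was set before, even if the body raised.
            if previous is None:
                os.environ.pop(env_var, None)
            else:
                os.environ[env_var] = previous


with sandboxed_vault() as vault:
    print("sandbox vault:", vault)
```

Compared with the explicit `mkdtemp` plus manual cleanup used in this tutorial, the context manager trades a little indirection for guaranteed cleanup.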

Step 1: Produce one class of vacua (ISD+)

We use a single local jaxvacua model — a Kreuzer–Skarke geometry addressed by h12 + model_ID — and solve the SUSY conditions \(D_I W = 0\) for fluxes drawn in the ISD+ (imaginary-self-dual, “+” branch) mode. The converged solutions are written as session vacua (Tier 1) via vacua_writer.

model_A = jvc.FluxVacuaFinder(h12=2, model_ID=1, model_type="KS", maximum_degree=2)
model_A.lcs_tree.a_matrix = jnp.array([[4.5, 1.5], [1.5, 0.0]])   # demo geometry
sampler = jvc.data_sampler(model_A, flux_bounds=[-5, 5])

def solve_from_guesses(model, z0, tau0, fluxes0, tol=1e-10, residual_tol=1e-6):
    """Solve D_I W = 0 for each (z, tau, flux) guess; keep converged SUSY vacua."""
    moduli, taus, fluxes = [], [], []
    for i in range(len(z0)):
        # Pack (z, conj z, tau, conj tau) into the real vector the solver works on.
        x0 = model._convert_complex_to_real(
            z0[i], jnp.conj(z0[i]), tau0[i], jnp.conj(tau0[i]))
        r = root(model.DW_x, x0, args=(fluxes0[i],),
                 jac=model.dDW_x, method="hybr", tol=tol)
        # Accept on the actual residual rather than on the solver's status flag.
        if float(np.max(np.abs(r.fun))) < residual_tol:
            m, _, t, _ = model._convert_real_to_complex(r.x)
            # Discard any solution that contains NaNs.
            if not np.any(np.isnan(np.append(m, t))):
                moduli.append(m)
                taus.append(t)
                fluxes.append(fluxes0[i])
    return moduli, taus, fluxes

z0, tau0, f0 = sampler.initial_guesses_ISD(
    N=40, Nmax=int(model_A.D3_tadpole), mode="ISD+",
    moduli_sampling_mode="cone", ISD_oversample_factor=2,
    print_progress=False, filter_moduli=True)
moduli, taus, fluxes = solve_from_guesses(model_A, z0, tau0, f0)

with db.vacua_writer(model=model_A, method="ISD+") as w:
    w.append_batch(jnp.array(moduli), jnp.array(taus), jnp.array(fluxes),
                   is_susy=np.ones(len(moduli), dtype=bool))
run_id = w._run_id
print(f"ISD+: {len(moduli)}/{len(z0)} converged; stored {w.count} session vacua "
      f"(run {run_id[:8]}...)")
ISD+: 34/40 converged; stored 34 session vacua (run 5c3460e3...)
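
The acceptance rule used in solve_from_guesses (run the solver on every guess, then keep a result only if its residual is small) can be tried on a toy system that needs no jaxvacua model. The sketch below is only an illustration: the functions `F`, `dF` and `solve_and_filter` are made up for this example, and the toy equations have nothing to do with \(D_I W = 0\) beyond sharing the same solve-then-filter pattern.

```python
import numpy as np
from scipy.optimize import root


def F(x, c):
    """Toy residual: x0^2 + x1^2 = c and x0 = x1 (no real root if c < 0)."""
    return np.array([x[0] ** 2 + x[1] ** 2 - c, x[0] - x[1]])


def dF(x, c):
    """Analytic Jacobian of F with respect to x."""
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])


def solve_and_filter(guesses, params, residual_tol=1e-6):
    """Run the solver on every guess; keep only small-residual results."""
    kept = []
    for x0, c in zip(guesses, params):
        r = root(F, x0, args=(c,), jac=dF, method="hybr")
        # Judge convergence by the residual itself, not by r.success.
        small = float(np.max(np.abs(r.fun))) < residual_tol
        if small and np.all(np.isfinite(r.x)):
            kept.append((c, r.x))
    return kept


guesses = [np.array([1.2, 0.9]), np.array([1.2, 0.9])]
params = [2.0, -1.0]  # the second system has no real solution
kept = solve_and_filter(guesses, params)
print(f"{len(kept)}/{len(guesses)} converged")
```

Filtering on the residual means a guess with no nearby solution is dropped silently instead of raising, which is why the converged count in the output above can be smaller than the number of guesses.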

Step 2: What designation means

The writer above stored session vacua (Tier 1): fast, append-only, keyed by a random run_id. Designation promotes a curated subset to permanent records (Tier 2). designate_vacua does five things:

  1. Validates each solution: checks that \(|D_I W| <\) F_term_tol and that the row conforms to the schema.

  2. Deduplicates against existing designated records by flux vector.

  3. Computes derived quantities — superpotential W, F-terms, tadpole N_flux, string coupling g_s, and the mass spectrum (mass2, m_gravitino) — and stores them on the record.

  4. Records provenance: label, committed_by, timestamp, notes, jaxvacua_version, and the source run_id.

  5. Assigns a unique integer designated_id.
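
Steps 1, 2 and 5 of the list above can be sketched on plain dictionaries. This is a toy model of the logic, not the implementation of designate_vacua: the function `designate`, the field names `flux` and `f_term`, the default tolerance, the choice to deduplicate within the incoming batch as well, and the "next free integer" ID rule are all assumptions made for illustration.

```python
def designate(candidates, existing, f_term_tol=1e-6):
    """Toy designation: validate, deduplicate by flux, assign integer IDs.

    candidates: list of dicts with "flux" (tuple of ints) and "f_term" (float).
    existing:   already designated records (same keys plus "designated_id").
    All field names and the tolerance default are illustrative assumptions.
    """
    seen = {tuple(rec["flux"]) for rec in existing}
    next_id = max((rec["designated_id"] for rec in existing), default=-1) + 1
    accepted = []
    for cand in candidates:
        if abs(cand["f_term"]) >= f_term_tol:
            continue  # step 1: fails the F-term check
        key = tuple(cand["flux"])
        if key in seen:
            continue  # step 2: this flux vector is already designated
        seen.add(key)
        accepted.append({**cand, "designated_id": next_id})  # step 5
        next_id += 1
    return accepted


existing = [{"flux": (1, 0, -2, 3), "f_term": 1e-12, "designated_id": 0}]
candidates = [
    {"flux": (1, 0, -2, 3), "f_term": 1e-11},  # duplicate of record 0
    {"flux": (2, 1, 0, -1), "f_term": 3e-9},   # accepted
    {"flux": (0, 4, 1, 1), "f_term": 2e-3},    # F-term too large
    {"flux": (2, 1, 0, -1), "f_term": 1e-10},  # duplicate within the batch
    {"flux": (5, 5, 0, 0), "f_term": 0.0},     # accepted
]
new = designate(candidates, existing)
print([rec["designated_id"] for rec in new])  # [1, 2]
```

Keying the duplicate check on the flux vector as a tuple makes the lookup a set membership test, so the cost stays linear in the number of candidates.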

Designated vacua live under STRINGFORGE_VAULT/ and survive clear_cache().

Mass units. mass2 and m_gravitino are given in the FluxEFT no-scale normalisation, not in units of \(M_\mathrm{Pl}\): the overall CY volume is not stabilised at this level, so both carry an unfixed volume factor. Only the ratio \(|m|/m_{3/2}\) is volume-independent, because the volume factor cancels in it.
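
As a worked example of the volume-independent ratio, the sketch below computes \(|m|/m_{3/2}\) for one designated record and checks that a common rescaling drops out. It assumes that mass2 holds squared masses in the same normalisation as m_gravitino squared, so that \(|m| = \sqrt{|\mathrm{mass2}|}\); the helper `mass_ratios` and the rescaling factor `s` are hypothetical, and the numbers are copied from the first designated record in this tutorial's output.

```python
import math


def mass_ratios(mass2, m_gravitino):
    """Return |m| / m_{3/2} per entry, assuming |m| = sqrt(|mass2|)."""
    return [math.sqrt(abs(m2)) / m_gravitino for m2 in mass2]


# Values of the first designated record (designated_id = 0).
mass2 = [61.91154144887241, 61.91154144890059, 112.22550541257384,
         112.22550541257682, 77779.30650471494, 77779.30650471509]
m_gravitino = 10.867611

ratios = mass_ratios(mass2, m_gravitino)
print([round(r, 3) for r in ratios])

# A common factor s on every mass (s**2 on mass2) leaves the ratios unchanged.
s = 1e-2
rescaled = mass_ratios([s * s * m2 for m2 in mass2], s * m_gravitino)
print(all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(ratios, rescaled)))
```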

# Validation can be run standalone (designate_vacua also runs it internally).
session = db.load_vacua(run_id=run_id)
report  = db.validate_vacua(session, model=model_A)
n_pass  = sum(1 for r in report if r["passed"])
print(f"validate_vacua: {n_pass}/{len(report)} rows pass F-term + schema checks")
# The server-side vacuavault layer applies the same checks to uploaded parquets
# via vv.validate_parquet_file(path, db=..., physics_checks="explicit").
validate_vacua: 34/34 rows pass F-term + schema checks

designated_ids = db.designate_vacua(
    session.head(5),
    label="ISDplus_demo",
    committed_by="tutorial@example.com",
    model=model_A,
    notes="Vault-workflow tutorial: ISD+ class.",
)
print(f"Designated {len(designated_ids)} vacua -> designated_id = {designated_ids}")

# The designated records carry the derived quantities, incl. the mass spectrum.
designated = db.load_designated(h12=2)
cols = [c for c in ("designated_id", "is_susy", "g_s", "m_gravitino", "mass2")
        if c in designated.columns]
print(designated[cols].head().to_string())
Designated 5 vacua -> designated_id = [0, 1, 2, 3, 4]
   designated_id  is_susy       g_s  m_gravitino                                                                                                                    mass2
0              0     True  0.210754    10.867611     [61.91154144887241, 61.91154144890059, 112.22550541257384, 112.22550541257682, 77779.30650471494, 77779.30650471509]
1              1     True  0.479169     7.212083    [6.425015231616918, 124.11413515460403, 141.35790210319186, 536.1416321558296, 653.5599049121203, 1412.1737268171164]
2              2     True  0.145370     5.081123       [8.05024347339861, 53.67241846783432, 83.20555920324412, 371.81098639263524, 732468.0310313946, 732468.0310313954]
3              3     True  0.118041     8.049246  [6.984435847901165, 24.963176652552953, 31.363983914866616, 123.25828461530197, 351.23630274468866, 470.83995172722484]
4              4     True  0.190678     2.155204    [0.21567492239794464, 14.79320220991362, 608.9696732271706, 840.7558080025154, 23081.02036721397, 23081.020367214023]

Step 3: Pushing to the HuggingFace vault

push_vacua_to_hub uploads vacua to the shared aschachner/vacua_vault dataset as a pull request, under a per-model directory derived from the model’s identity. The same call therefore works for two kinds of model:

  • A model from the HF CY-database (LCSDatabase(dataset="tdf"), addressed by ks_id + triang_id) → tdf/h12_{h12}/ks_{ks}_tri_{tri}/community/….

  • A local jaxvacua model (addressed by h12 + model_ID, like model_A) → local/h12_{h12}/model_{model_ID}/community/….

Both require HuggingFace credentials (huggingface-cli login or HF_TOKEN). The cells below check only the HF_TOKEN environment variable, so they upload only when it is set.
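
The routing rule described above can be sketched as a small pure function. This is a hedged illustration of the directory layout only: `vault_path` is a hypothetical helper, a plain dict stands in for the model object, and the real push_vacua_to_hub may derive the path differently. The file name pattern `<user>_<label>.parquet` is taken from the messages printed in the cells below.

```python
def vault_path(model_id, user, label):
    """Build the vault-relative parquet path from a model's identity.

    model_id is a plain dict standing in for the model object:
      {"h12": 2, "model_ID": 1}                -> local model
      {"h12": 2, "ks_id": 29, "triang_id": 1}  -> CY-database model
    """
    h12 = model_id["h12"]
    if "ks_id" in model_id and "triang_id" in model_id:
        ks, tri = model_id["ks_id"], model_id["triang_id"]
        model_dir = f"tdf/h12_{h12}/ks_{ks}_tri_{tri}"
    elif "model_ID" in model_id:
        model_dir = f"local/h12_{h12}/model_{model_id['model_ID']}"
    else:
        raise ValueError("need ks_id + triang_id, or model_ID")
    return f"{model_dir}/community/{user}_{label}.parquet"


# "alice" is a placeholder user name.
print(vault_path({"h12": 2, "model_ID": 1}, "alice", "ISDplus_demo"))
print(vault_path({"h12": 2, "ks_id": 29, "triang_id": 1}, "alice", "ISDplus_demo"))
```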

# (a) Local model (h12 + model_ID) -> local/h12_2/model_1/community/...
if os.environ.get("HF_TOKEN"):
    info = db.push_vacua_to_hub(
        designated, label="ISDplus_demo",
        committed_by="tutorial@example.com",
        model=model_A, create_pr=True)
    print("local push ->", info["file_path"], "| PR:", info.get("pr_url"))
else:
    print("Set HF_TOKEN to push. The local model routes to:")
    print("  local/h12_2/model_1/community/<user>_ISDplus_demo.parquet")
Set HF_TOKEN to push. The local model routes to:
  local/h12_2/model_1/community/<user>_ISDplus_demo.parquet

# (b) A model already in the HF CY-database -> tdf/h12_2/ks_29_tri_1/community/...
if os.environ.get("HF_TOKEN"):
    tdf_db    = LCSDatabase(dataset="tdf")
    tdf_model = tdf_db.load_model(ks_id=29, triang_id=1)
    # Produce + designate vacua for tdf_model exactly as above, then:
    #   tdf_db.push_vacua_to_hub(vacua_df, label=..., committed_by=...,
    #                            model=tdf_model, create_pr=True)
    print("Loaded TDF model; push routes to tdf/h12_2/ks_29_tri_1/community/...")
else:
    print("Set HF_TOKEN (and network) to load a cy-database model; it routes to:")
    print("  tdf/h12_{h12}/ks_{ks}_tri_{tri}/community/<user>_<label>.parquet")
Set HF_TOKEN (and network) to load a cy-database model; it routes to:
  tdf/h12_{h12}/ks_{ks}_tri_{tri}/community/<user>_<label>.parquet

Summary

Step        Call
Produce     db.vacua_writer(model=…, method="ISD+") + w.append_batch(…)
Validate    db.validate_vacua(df, model=…) (server: vacuavault.validate_parquet_file)
Designate   db.designate_vacua(df, label=…, committed_by=…, model=…)
Push        db.push_vacua_to_hub(df, label=…, committed_by=…, model=…)

The model’s identity decides the vault directory: tdf/… for CY-database models, local/… for local (h12, model_ID) models, cicy/… for CICY models.

import shutil
shutil.rmtree(tmpdir, ignore_errors=True)
sj.set_vault_dir(None)        # clears the STRINGFORGE_VAULT override
print("cleaned up sandbox")
cleaned up sandbox