Create and Register a Custom Template

Altay Sansal

Oct 20, 2025

Warning

Most SEG-Y files correspond to standard seismic data types or field configurations. We recommend using the built-in templates from the registry whenever possible. Create a custom template only when your file is unusual and cannot be represented by existing templates. In many cases, you can simply customize the SEG-Y header byte mapping during ingestion without defining a new template.

In this tutorial we will walk through the Template Registry and show how to:

  • Discover available templates in the registry

  • Define and register your own template

  • Build a dataset model and convert it to an Xarray Dataset using your custom template

If this is your first time with MDIO, you may want to skim the Quickstart first.

What is a Template and a Template Registry?

A template defines how an MDIO dataset is structured: the names of its dimensions and coordinates, the default variable name, chunking hints, and the attributes to store. Since many seismic datasets share common structures (e.g., 3D post-stack, 2D post-stack, and pre-stack CDP or shot gathers), MDIO ships with a pre-populated template registry and APIs to fetch or register templates.

Fetching a template from the registry returns a copy of the registered instance, so you can customize it freely without affecting the registry or anyone else who fetches the same template.

from mdio.builder.template_registry import get_template
from mdio.builder.template_registry import get_template_registry
from mdio.builder.template_registry import list_templates

registry = get_template_registry()
registry  # pretty HTML in notebooks

TemplateRegistry

Template (16) | Default Var | Dimensions | Chunk Sizes | Coords
PostStack2DDepth | amplitude | cdp, depth | 1024×1024 | cdp_x, cdp_y
PostStack2DTime | amplitude | cdp, time | 1024×1024 | cdp_x, cdp_y
PostStack3DDepth | amplitude | inline, crossline, depth | 128×128×128 | cdp_x, cdp_y
PostStack3DTime | amplitude | inline, crossline, time | 128×128×128 | cdp_x, cdp_y
PreStackCdpAngleGathers2DDepth | amplitude | cdp, angle, depth | 16×64×1024 | cdp_x, cdp_y
PreStackCdpAngleGathers2DTime | amplitude | cdp, angle, time | 16×64×1024 | cdp_x, cdp_y
PreStackCdpAngleGathers3DDepth | amplitude | inline, crossline, angle, depth | 8×8×32×512 | cdp_x, cdp_y
PreStackCdpAngleGathers3DTime | amplitude | inline, crossline, angle, time | 8×8×32×512 | cdp_x, cdp_y
PreStackCdpOffsetGathers2DDepth | amplitude | cdp, offset, depth | 16×64×1024 | cdp_x, cdp_y
PreStackCdpOffsetGathers2DTime | amplitude | cdp, offset, time | 16×64×1024 | cdp_x, cdp_y
PreStackCdpOffsetGathers3DDepth | amplitude | inline, crossline, offset, depth | 8×8×32×512 | cdp_x, cdp_y
PreStackCdpOffsetGathers3DTime | amplitude | inline, crossline, offset, time | 8×8×32×512 | cdp_x, cdp_y
PreStackCocaGathers3DDepth | amplitude | inline, crossline, offset, azimuth, depth | 8×8×32×1×1024 | cdp_x, cdp_y
PreStackCocaGathers3DTime | amplitude | inline, crossline, offset, azimuth, time | 8×8×32×1×1024 | cdp_x, cdp_y
PreStackShotGathers2DTime | amplitude | shot_point, channel, time | 16×32×2048 | source_coord_x, source_coord_y, group_coord_x, group_coord_y
PreStackShotGathers3DTime | amplitude | shot_point, cable, channel, time | 8×1×128×2048 | source_coord_x, source_coord_y, group_coord_x, group_coord_y, gun
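
The copy-on-fetch behaviour described earlier can be pictured with a toy registry. This is only a sketch of the idea in plain Python: ToyTemplate and ToyRegistry are hypothetical names invented for illustration, and nothing here is MDIO's actual implementation.

```python
import copy


class ToyTemplate:
    """Stand-in for a template; it only carries a name and a chunk shape."""

    def __init__(self, name: str, chunk_shape: tuple) -> None:
        self.name = name
        self.chunk_shape = chunk_shape


class ToyRegistry:
    """Hypothetical registry that hands out deep copies, never the stored object."""

    def __init__(self) -> None:
        self._templates = {}

    def register(self, template: ToyTemplate) -> None:
        # Store a private copy so later changes by the caller do not leak in.
        self._templates[template.name] = copy.deepcopy(template)

    def get(self, name: str) -> ToyTemplate:
        # Every fetch produces an independent copy.
        return copy.deepcopy(self._templates[name])


toy_registry = ToyRegistry()
toy_registry.register(ToyTemplate("PostStack3DTime", (128, 128, 128)))

mine = toy_registry.get("PostStack3DTime")
mine.chunk_shape = (64, 64, 64)  # customize the fetched copy only

fresh = toy_registry.get("PostStack3DTime")
print(mine.chunk_shape)   # (64, 64, 64)
print(fresh.chunk_shape)  # (128, 128, 128): the stored template is untouched
```

The practical consequence is that a customization lives only on the instance you fetched; a later fetch starts again from the registered state.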

We can also get the names of all registered templates as a plain Python list.

list_templates()
['PostStack2DTime',
 'PostStack2DDepth',
 'PostStack3DTime',
 'PostStack3DDepth',
 'PreStackCdpOffsetGathers3DTime',
 'PreStackCdpOffsetGathers2DTime',
 'PreStackCdpAngleGathers3DTime',
 'PreStackCdpAngleGathers2DTime',
 'PreStackCdpOffsetGathers3DDepth',
 'PreStackCdpOffsetGathers2DDepth',
 'PreStackCdpAngleGathers3DDepth',
 'PreStackCdpAngleGathers2DDepth',
 'PreStackCocaGathers3DTime',
 'PreStackCocaGathers3DDepth',
 'PreStackShotGathers2DTime',
 'PreStackShotGathers3DTime']
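
Because the names come back as plain strings, ordinary list operations are enough to narrow down the candidates. The sketch below uses a hard-coded copy of the names shown above so that it runs without MDIO installed. It assumes the Time/Depth suffix convention visible in this list, which is an observation about these names rather than a documented guarantee.

```python
# Names copied from the list_templates() output shown above.
template_names = [
    "PostStack2DTime",
    "PostStack2DDepth",
    "PostStack3DTime",
    "PostStack3DDepth",
    "PreStackCdpOffsetGathers3DTime",
    "PreStackCdpOffsetGathers2DTime",
    "PreStackCdpAngleGathers3DTime",
    "PreStackCdpAngleGathers2DTime",
    "PreStackCdpOffsetGathers3DDepth",
    "PreStackCdpOffsetGathers2DDepth",
    "PreStackCdpAngleGathers3DDepth",
    "PreStackCdpAngleGathers2DDepth",
    "PreStackCocaGathers3DTime",
    "PreStackCocaGathers3DDepth",
    "PreStackShotGathers2DTime",
    "PreStackShotGathers3DTime",
]

# Split by data domain using the name suffix.
depth_templates = [n for n in template_names if n.endswith("Depth")]
time_templates = [n for n in template_names if n.endswith("Time")]

# Narrow further, e.g. to 3D post-stack templates.
post_stack_3d = [n for n in template_names if n.startswith("PostStack3D")]

print(len(depth_templates), len(time_templates))  # 7 9
print(post_stack_3d)  # ['PostStack3DTime', 'PostStack3DDepth']
```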

Defining a Minimal Custom Template

To define a custom template, subclass AbstractDatasetTemplate and set:

  • _name: a public name for the template

  • _dim_names: names for each axis of your data variable (the last axis is the trace/time or trace/depth axis)

  • _physical_coord_names and _logical_coord_names: optional additional coordinate variables to store along the spatial grid

  • _load_dataset_attributes(): optional attributes stored at the dataset level

Below we create a template that holds an interval velocity field together with two anisotropy parameters (epsilon and delta) for a depth-domain seismic volume.

The base class creates the dimensions, dimension coordinates, and non-dimension coordinates automatically. Because we want more than one data variable, we override _add_variables to add them.

from mdio.builder.schemas import compressors
from mdio.builder.schemas.chunk_grid import RegularChunkGrid
from mdio.builder.schemas.chunk_grid import RegularChunkShape
from mdio.builder.schemas.dtype import ScalarType
from mdio.builder.schemas.v1.variable import VariableMetadata
from mdio.builder.templates.base import AbstractDatasetTemplate


class AnisotropicVelocityTemplate(AbstractDatasetTemplate):
    """A custom template for a 3D velocity model with anisotropy parameters."""

    def __init__(self, data_domain: str = "depth") -> None:
        super().__init__(data_domain)
        # Dimension order matters; the last dimension is the trace domain (depth by default)
        self._dim_names = ("inline", "crossline", self.trace_domain)
        # Additional coordinates: these are added on top of dimension coordinates
        self._physical_coord_names = ("cdp_x", "cdp_y")
        self._var_chunk_shape = (128, 128, 128)
        self._units = {}

    @property
    def _name(self) -> str:  # public name for the registry
        return "AnisotropicVelocity3DDepth"

    @property
    def _default_variable_name(self) -> str:  # name of the default data variable
        return "velocity"

    def _load_dataset_attributes(self) -> dict:
        return {"surveyType": "3D", "gatherType": "line"}

    def _add_variables(self) -> None:
        """Add the variables including default and extra."""
        for name in ["velocity", "epsilon", "delta"]:
            chunk_grid = RegularChunkGrid(configuration=RegularChunkShape(chunk_shape=self.full_chunk_shape))
            unit = self.get_unit_by_key(name)
            self._builder.add_variable(
                name=name,
                dimensions=self._dim_names,
                data_type=ScalarType.FLOAT32,
                compressor=compressors.Blosc(cname=compressors.BloscCname.zstd),
                coordinates=self.physical_coordinate_names,
                metadata=VariableMetadata(chunk_grid=chunk_grid, units_v1=unit),
            )


AnisotropicVelocityTemplate()

AnisotropicVelocityTemplate

Template Name: AnisotropicVelocity3DDepth
Data Domain: depth
Default Variable: velocity
Default Variable Units:
Dimensions (3)
Name | Size | Chunk Sizes | Units | Spatial
inline |  | 128 |  |
crossline |  | 128 |  |
depth |  | 128 |  |
Coordinates (2)
Name | Type | Units
cdp_x | Physical |
cdp_y | Physical |
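
Why overriding a single method is enough can be pictured with a toy base class that runs its build steps in a fixed order. This is a sketch of the general pattern only. ToyBaseTemplate, ToyVelocityTemplate, build, _add_dimensions, and _add_coordinates are hypothetical names made up for illustration; only _add_variables mirrors a name used in this tutorial, and none of this is MDIO's real base class.

```python
class ToyBaseTemplate:
    """Toy base class (NOT MDIO's AbstractDatasetTemplate)."""

    def __init__(self) -> None:
        self.created = []

    def build(self) -> list:
        # The build order is fixed here; subclasses customize individual steps.
        self.created = []
        self._add_dimensions()   # hypothetical hook name
        self._add_coordinates()  # hypothetical hook name
        self._add_variables()
        return self.created

    def _add_dimensions(self) -> None:
        self.created += ["dim:inline", "dim:crossline", "dim:depth"]

    def _add_coordinates(self) -> None:
        self.created += ["coord:cdp_x", "coord:cdp_y"]

    def _add_variables(self) -> None:
        # Default behaviour: a single data variable.
        self.created.append("var:amplitude")


class ToyVelocityTemplate(ToyBaseTemplate):
    def _add_variables(self) -> None:
        # Override only this step to emit several variables instead of one.
        for name in ("velocity", "epsilon", "delta"):
            self.created.append(f"var:{name}")


built = ToyVelocityTemplate().build()
print(built)
```

The subclass inherits the dimension and coordinate steps unchanged and replaces only the variable step, which is the same division of labour the custom template above relies on.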

Registering the Custom Template

To make the template discoverable by name, register it first, then retrieve it with get_template. As with the built-in templates, the registry returns a deep copy on every fetch, so changes made to custom_template below do not alter the registered instance.

from mdio.builder.template_registry import register_template

register_template(AnisotropicVelocityTemplate())
print("Registered:", "AnisotropicVelocity3DDepth" in list_templates())

custom_template = get_template("AnisotropicVelocity3DDepth")
custom_template
Registered: True

AnisotropicVelocityTemplate

Template Name: AnisotropicVelocity3DDepth
Data Domain: depth
Default Variable: velocity
Default Variable Units:
Dimensions (3)
Name | Size | Chunk Sizes | Units | Spatial
inline |  | 128 |  |
crossline |  | 128 |  |
depth |  | 128 |  |
Coordinates (2)
Name | Type | Units
cdp_x | Physical |
cdp_y | Physical |

You can set units at any time. For this demo we use metric units. During ingestion, spatial units are normally inferred from the SEG-Y binary header, but we can override them here: ingestion honors whatever is set on the template.

from mdio.builder.schemas.v1.units import LengthUnitModel
from mdio.builder.schemas.v1.units import SpeedUnitModel

custom_template.add_units(
    {
        "depth": LengthUnitModel(length="m"),
        "cdp_x": LengthUnitModel(length="m"),
        "cdp_y": LengthUnitModel(length="m"),
        "velocity": SpeedUnitModel(speed="m/s"),
    }
)
custom_template

AnisotropicVelocityTemplate

Template Name: AnisotropicVelocity3DDepth
Data Domain: depth
Default Variable: velocity
Default Variable Units: m/s
Dimensions (3)
Name | Size | Chunk Sizes | Units | Spatial
inline |  | 128 |  |
crossline |  | 128 |  |
depth |  | 128 | m |
Coordinates (2)
Name | Type | Units
cdp_x | Physical | m
cdp_y | Physical | m

Changing the Chunk Shape on an Existing Template

Often you will want to tweak the chunking strategy for performance. You can do this in two ways:

  • When defining a subclass, set a default in the constructor (e.g., self._var_chunk_shape = (...)).

  • On an existing template instance, assign to the full_chunk_shape property once you know your final dataset sizes (the tuple length must match the number of data dimensions).

Below is a tiny demo showing how to modify the chunk shape on a fetched template. We first build a dataset from the template with known sizes to satisfy validation, then update full_chunk_shape.

Note

In the SEG-Y to MDIO conversion workflow, MDIO infers the final grid shape from the SEG-Y headers. It’s common to set or adjust full_chunk_shape right before calling segy_to_mdio, using the same sizes you expect for the final array.

mdio_ds = custom_template.build_dataset(name="demo-only", sizes=(300, 500, 1001))
# pick smaller chunks than the full array for better parallelism and IO
custom_template.full_chunk_shape = (64, 64, 64)
print("Chunk shape set to:", custom_template.full_chunk_shape)

custom_template
Chunk shape set to: (64, 64, 64)

AnisotropicVelocityTemplate

Template Name: AnisotropicVelocity3DDepth
Data Domain: depth
Default Variable: velocity
Default Variable Units: m/s
Dimensions (3)
Name | Size | Chunk Sizes | Units | Spatial
inline | 300 | 64 |  |
crossline | 500 | 64 |  |
depth | 1,001 | 64 | m |
Coordinates (2)
Name | Type | Units
cdp_x | Physical | m
cdp_y | Physical | m
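
To judge whether a chunk shape is reasonable, it helps to count the chunks it produces and the uncompressed size of each one. The sketch below does that arithmetic in plain Python for the sizes and chunk shape used above. It assumes float32 samples (4 bytes each, as in the template's variables) and makes no MDIO calls.

```python
import math

sizes = (300, 500, 1001)    # sizes passed to build_dataset above
chunk_shape = (64, 64, 64)  # value assigned to full_chunk_shape above
itemsize = 4                # bytes per float32 sample

# A chunk shape needs exactly one entry per data dimension.
if len(chunk_shape) != len(sizes):
    raise ValueError("chunk shape and sizes must have the same length")

# Partial chunks at the edges still count, hence the ceiling division.
chunks_per_dim = tuple(math.ceil(s / c) for s, c in zip(sizes, chunk_shape))
total_chunks = math.prod(chunks_per_dim)
chunk_bytes = math.prod(chunk_shape) * itemsize

print(chunks_per_dim)  # (5, 8, 16)
print(total_chunks)    # 640
print(chunk_bytes)     # 1048576 bytes, i.e. 1 MiB before compression
```

Smaller chunks mean more, lighter pieces to read and write in parallel; larger chunks mean fewer objects but more data touched per access.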

Making a Dummy Xarray Dataset

We can now take the MDIO Dataset model and convert it to Xarray with our configuration. If ingesting from SEG-Y, this step gets executed automatically by the converter before populating the data.

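Note that every variable holds only fill values at this point. For example, the int32 dimension coordinates in the output below all show 2147483647, the largest int32 value.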

from mdio.builder.xarray_builder import to_xarray_dataset

to_xarray_dataset(mdio_ds)
<xarray.Dataset> Size: 2GB
Dimensions:     (inline: 300, crossline: 500, depth: 1001)
Coordinates:
  * inline      (inline) int32 1kB 2147483647 2147483647 ... 2147483647
  * crossline   (crossline) int32 2kB 2147483647 2147483647 ... 2147483647
  * depth       (depth) int32 4kB 2147483647 2147483647 ... 2147483647
    cdp_x       (inline, crossline) float64 1MB dask.array<chunksize=(300, 500), meta=np.ndarray>
    cdp_y       (inline, crossline) float64 1MB dask.array<chunksize=(300, 500), meta=np.ndarray>
Data variables:
    velocity    (inline, crossline, depth) float32 601MB dask.array<chunksize=(300, 256, 256), meta=np.ndarray>
    epsilon     (inline, crossline, depth) float32 601MB dask.array<chunksize=(300, 256, 256), meta=np.ndarray>
    delta       (inline, crossline, depth) float32 601MB dask.array<chunksize=(300, 256, 256), meta=np.ndarray>
    trace_mask  (inline, crossline) bool 150kB dask.array<chunksize=(300, 500), meta=np.ndarray>
Attributes:
    apiVersion:  1.0.8
    createdOn:   2025-10-20 15:51:50.947182+00:00
    name:        demo-only
    attributes:  {'surveyType': '3D', 'gatherType': 'line', 'defaultVariableN...
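
The sizes reported in the output above can be checked by hand from the dataset shape and the data types shown for each variable. The following is plain arithmetic with no MDIO or Xarray calls; the byte widths are those of the dtypes listed in the output (float32, float64, bool, int32).

```python
import math

shape = (300, 500, 1001)  # inline, crossline, depth

n_samples = math.prod(shape)    # samples in one 3D variable
n_traces = shape[0] * shape[1]  # positions on the inline/crossline grid

volume_bytes = n_samples * 4  # float32: velocity, epsilon, delta
coord_bytes = n_traces * 8    # float64: cdp_x, cdp_y
mask_bytes = n_traces * 1     # bool: trace_mask
dim_bytes = sum(shape) * 4    # int32 dimension coordinates

total = 3 * volume_bytes + 2 * coord_bytes + mask_bytes + dim_bytes

print(volume_bytes)  # 600600000, shown as 601MB
print(coord_bytes)   # 1200000, shown as 1MB
print(mask_bytes)    # 150000, shown as 150kB
print(total)         # 1804357204, shown as 2GB
```

Adding two extra variables to the template roughly triples the uncompressed footprint compared with a single-variable volume of the same shape.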

Recap: Key APIs Used

  • Template registry helpers: get_template_registry, list_templates, register_template, get_template

  • Base template to subclass: AbstractDatasetTemplate

  • Converting an MDIO dataset model to an Xarray Dataset: to_xarray_dataset

With these pieces, you can standardize how your seismic data is represented in MDIO and keep ingestion code concise and repeatable.