Design Choices
==============

This page captures core design decisions in Grilly and their tradeoffs.

1. Vulkan-first backend
-----------------------

Choice:
  Use Vulkan compute shaders as the primary acceleration layer.

Why:
  Vulkan enables cross-vendor GPU support (AMD, NVIDIA, Intel) and avoids
  hard coupling to a single vendor-specific runtime.

Tradeoff:
  Kernel development and debugging are lower-level than in CUDA-only
  frameworks.

2. PyTorch-like UX with explicit internals
------------------------------------------

Choice:
  Expose familiar module/functional/optimizer APIs while keeping backend
  controls visible.

Why:
  Lowers migration cost for users and keeps performance-critical internals
  accessible for research and profiling.

Tradeoff:
  Some flows are more explicit than in end-to-end autograd frameworks (for
  example, certain training paths require passing output gradients manually).
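
To illustrate what "manual output gradients" means in practice, here is a
minimal NumPy sketch. The ``Linear`` class and its ``forward``/``backward``
methods are hypothetical names made up for this example and are not Grilly's
actual API. The point is only that the caller computes the gradient of the
loss with respect to the output and passes it in explicitly.

.. code-block:: python

   import numpy as np


   class Linear:
       """Hypothetical layer used only for illustration (not Grilly's API)."""

       def __init__(self, weight):
           self.weight = weight
           self.grad_weight = None
           self._x = None

       def forward(self, x):
           # Cache the input; backward needs it to compute the weight gradient.
           self._x = x
           return x @ self.weight

       def backward(self, grad_output):
           # The caller supplies dLoss/dOutput; no autograd graph is involved.
           self.grad_weight = self._x.T @ grad_output
           return grad_output @ self.weight.T


   layer = Linear(np.array([[1.0], [2.0]]))
   x = np.array([[1.0, 1.0]])
   target = np.array([[2.0]])

   pred = layer.forward(x)

   # Manual output gradient of the mean squared error: d/dpred mean((pred - t)^2)
   grad_output = 2.0 * (pred - target) / pred.size

   grad_input = layer.backward(grad_output)

   # Plain SGD step with a learning rate of 0.1.
   layer.weight -= 0.1 * layer.grad_weight

In an end-to-end autograd framework the ``grad_output`` line would be implicit;
here it is one extra step that the training loop owns.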

3. CPU fallback paths
---------------------

Choice:
  Implement CPU fallbacks when shaders or Vulkan features are unavailable.

Why:
  Improves portability, simplifies development, and keeps tests runnable in
  constrained environments.

Tradeoff:
  Depending on which path runs, execution can be slower and results can differ
  slightly numerically.
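
The fallback pattern can be sketched in plain Python as follows. All names
here (``BackendUnavailable``, ``relu_gpu``, ``relu_cpu``,
``run_with_fallback``) are hypothetical and chosen for illustration; the GPU
function is a stand-in that always fails, so the example runs without any
Vulkan device.

.. code-block:: python

   class BackendUnavailable(RuntimeError):
       """Raised by the (hypothetical) GPU path when it cannot run."""


   def relu_gpu(values):
       # Stand-in for a shader dispatch. In this sketch it always reports
       # that no device is available, which forces the fallback.
       raise BackendUnavailable("no Vulkan device found")


   def relu_cpu(values):
       # Reference CPU implementation of the same operation.
       return [v if v > 0.0 else 0.0 for v in values]


   def run_with_fallback(gpu_fn, cpu_fn, values):
       """Try the GPU path first and return (result, name_of_path_used)."""
       try:
           return gpu_fn(values), "gpu"
       except BackendUnavailable:
           return cpu_fn(values), "cpu"


   result, path = run_with_fallback(relu_gpu, relu_cpu, [-1.0, 0.5, 2.0])

Returning the name of the path that ran makes it easier for tests and
benchmarks to record which implementation produced a given result, which
matters when the two paths are not bit-identical.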

4. NumPy as primary tensor interchange
--------------------------------------

Choice:
  Standardize external API boundaries around NumPy arrays.

Why:
  NumPy is ubiquitous, simple for integration, and predictable for serialization
  and experiment tooling.

Tradeoff:
  Interop with other tensor runtimes sometimes requires conversion steps.
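
As a rough sketch of what such a boundary conversion can look like, the helper
below normalizes array-like input to a C-contiguous ``float32`` NumPy array.
The function name ``to_interchange`` and the choice of ``float32`` are
assumptions made for this example, not documented Grilly behavior.

.. code-block:: python

   import numpy as np


   def to_interchange(data, dtype=np.float32):
       """Return data as a C-contiguous NumPy array of the given dtype."""
       return np.ascontiguousarray(data, dtype=dtype)


   # Nested Python lists become a 2 x 3 float32 array.
   arr = to_interchange([[1, 2, 3], [4, 5, 6]])

   # A transposed view shares memory with arr and is not C-contiguous,
   # so converting it produces a contiguous copy.
   view = arr.T
   fixed = to_interchange(view)

Other runtimes typically ship their own bridge functions to and from NumPy
(for example, ``torch.from_numpy`` in PyTorch), so the conversion step is
usually a single call at the boundary.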

5. Specialized subsystems in one framework
------------------------------------------

Choice:
  Include SNN, cognitive, VSA, multimodal, and retrieval subsystems together.

Why:
  Enables hybrid research workflows without glue code across many libraries.

Tradeoff:
  Broader surface area increases documentation and maintenance complexity.

6. Determinism tools for experimental pipelines
-----------------------------------------------

Choice:
  Add stable hashing and compressed ingestion checkpoints in ``utils``.

Why:
  Reproducibility and resumability are critical for long-running language and
  cognition experiments.

Tradeoff:
  Adds the responsibility of managing formats and versions.
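
The sketch below shows one way to combine these ideas using only the Python
standard library. It is not the code in ``utils``: the function names, the
JSON-in-gzip layout, and the ``FORMAT_VERSION`` field are assumptions made for
illustration. SHA-256 is used because Python's built-in ``hash()`` for strings
is randomized per process by default and is therefore not stable across runs.

.. code-block:: python

   import gzip
   import hashlib
   import json
   import os
   import tempfile

   FORMAT_VERSION = 1  # hypothetical version tag for this sketch


   def stable_hash(text):
       """Return a hex digest that is identical across processes and machines."""
       return hashlib.sha256(text.encode("utf-8")).hexdigest()


   def save_checkpoint(path, state):
       """Write state as gzip-compressed JSON with a version field."""
       payload = {"version": FORMAT_VERSION, "state": state}
       with gzip.open(path, "wt", encoding="utf-8") as fh:
           json.dump(payload, fh, sort_keys=True)


   def load_checkpoint(path):
       """Read a checkpoint and reject unknown format versions."""
       with gzip.open(path, "rt", encoding="utf-8") as fh:
           payload = json.load(fh)
       if payload.get("version") != FORMAT_VERSION:
           raise ValueError("unsupported checkpoint version")
       return payload["state"]


   state = {"doc_hash": stable_hash("first document"), "next_index": 1}

   with tempfile.TemporaryDirectory() as tmp:
       ckpt = os.path.join(tmp, "ingest.json.gz")
       save_checkpoint(ckpt, state)
       restored = load_checkpoint(ckpt)

The explicit version check is where the tradeoff shows up: once checkpoints
exist on disk, any change to the layout needs either a migration step or a
clear failure like the one raised here.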