A remote GPU that feels local.

One command to get an H100. Your folder syncs both ways. When you're done, deploy as a production endpoint.

GPU fleet: H100/A100

Provisioning: <30s

Capacity routing: 4 clouds

Billing: per-second

terminal

$ cassian up --gpu h100
Provisioning... ready (28s)
$ cassian ssh
root@cassian-h100-01:/workspace$ python train.py
[Epoch 12/12] loss: 0.0023 | 14m 32s
root@cassian-h100-01:/workspace$ exit
$ cassian deploy
https://api.cassian.cloud/v1/my-model

Everything that sucks about GPU cloud, fixed.

No quota tickets. No SSH sprawl. No lost work. No paying while nothing runs.

Instant provisioning

One command gets you an H100 in under 30 seconds. No quota requests, no waiting days for approval.

Bidirectional sync

One working tree. Your local folder stays in sync with the remote GPU. No SCP, no manual uploads.

Develop from any machine

Mac, Windows, Linux. Edit locally, run on a GPU. We handle SSH, sync, and environment setup.

Persistent volumes

If a spot instance goes down, the volume reattaches on a new machine. Your work survives across sessions.

Multi-cloud routing

Picks the cheapest available GPU across CoreWeave, GCP, AWS, and Azure. You never file a quota ticket.

One-command deploy

Training done? Run one command to get an HTTPS endpoint with TLS, autoscaling, and custom domains.
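To make the multi-cloud routing idea concrete, here is a minimal sketch of "pick the cheapest available GPU". It is not Cassian's actual router: the `Offer` type, the `pick_cheapest` function, and every price below are hypothetical and exist only to illustrate the selection rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Offer:
    provider: str        # e.g. "CoreWeave", "GCP", "AWS", "Azure"
    gpu: str             # e.g. "h100"
    usd_per_hour: float  # placeholder price, not a real quote
    available: bool      # whether capacity exists right now


def pick_cheapest(offers: list[Offer], gpu: str) -> Optional[Offer]:
    """Return the cheapest offer that has capacity for the requested GPU, or None."""
    candidates = [o for o in offers if o.gpu == gpu and o.available]
    return min(candidates, key=lambda o: o.usd_per_hour, default=None)


# Made-up numbers, purely for illustration.
offers = [
    Offer("CoreWeave", "h100", 2.50, True),
    Offer("GCP", "h100", 3.10, True),
    Offer("AWS", "h100", 2.20, False),  # cheapest on paper, but no capacity
    Offer("Azure", "h100", 2.90, True),
]

best = pick_cheapest(offers, "h100")
print(best.provider if best else "no capacity")
```

In this sketch an offer without capacity is filtered out before prices are compared, which is why the lowest listed price does not always win.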

Three commands. That's it.

Step 1

Define your environment

One YAML file. GPU type, CUDA version, disk, dependencies, what to sync. Replaces Dockerfiles, shell scripts, and setup docs.

cassian.yml

name: llama-finetune
gpu: h100
cuda: "12.4"
disk: 500GB
python: "3.11"
sync:
  local: ./
  remote: /workspace
  exclude: [".git", "data/raw"]
pip:
  - torch>=2.3
  - transformers
  - accelerate
  - wandb
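One way to read the `exclude` list in this config is as a set of path prefixes that sync skips. The sketch below assumes exactly that; the page does not say whether Cassian matches by prefix, glob, or something else, and the `is_excluded` helper is hypothetical.

```python
from pathlib import PurePosixPath


def is_excluded(rel_path: str, exclude: list[str]) -> bool:
    """True if rel_path equals an excluded entry or lies beneath one."""
    path = PurePosixPath(rel_path)
    for entry in exclude:
        prefix = PurePosixPath(entry)
        # Compare whole path components, so "data/rawfiles" is not caught by "data/raw".
        if path == prefix or prefix in path.parents:
            return True
    return False


exclude = [".git", "data/raw"]  # values from the cassian.yml example

for candidate in ["train.py", ".git/config", "data/raw/shard-0.bin", "data/clean/a.csv"]:
    print(candidate, "skip" if is_excluded(candidate, exclude) else "sync")
```

Under this assumption, `train.py` and `data/clean/a.csv` would sync, while anything under `.git` or `data/raw` would stay local.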

Step 2

Start working

Your folder is on an H100 in under 30 seconds. Files sync both ways. Use SSH, VS Code, or Cursor to edit and run.

terminal

$ cassian up
Provisioning... ready (28s)
Sync: ./ ↔ /workspace
$ cassian ssh
root@cassian-h100-01:/workspace$ python train.py
[Epoch 1/12] loss: 2.341
[Epoch 2/12] loss: 1.847
...
[Epoch 12/12] loss: 0.003

Step 3

Ship it

Training done. One more command turns your model into a production inference endpoint with TLS and autoscaling.

terminal

$ cassian deploy --domain api.mycompany.com
Deploying...
✓ Live
https://api.mycompany.com/v1/llama-ft
$ curl https://api.mycompany.com/v1/llama-ft \
    -H "Content-Type: application/json" \
    -d '{"prompt": "Explain transformers"}'
{"response": "Transformers are a neural..."}
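The same call can be made from application code. The sketch below uses only the Python standard library and assumes the endpoint accepts a JSON POST with a `prompt` field and replies with a `response` field, as the sample session suggests. The page does not mention authentication, so none is shown; `build_request` and `generate` are hypothetical helper names.

```python
import json
import urllib.request


def build_request(url: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the endpoint."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed; body is JSON
        method="POST",
    )


def generate(url: str, prompt: str, timeout: float = 30.0) -> str:
    """Send the request and return the "response" field of the JSON reply."""
    with urllib.request.urlopen(build_request(url, prompt), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Building the request needs no network access; generate() does.
req = build_request("https://api.mycompany.com/v1/llama-ft", "Explain transformers")
print(req.get_method(), req.full_url)
```

Calling `generate(...)` with the same arguments would perform the actual request once an endpoint is live at that URL.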

Stop paying for GPUs that don't work.

We're working with teams who are tired of pods that crash, uploads that fail, and credits that disappear.

Talk to us