Usage

The intended way to take advantage of Cedana is to use it for cloud arbitrage. For use cases directly related to the checkpoint/migrate functionality, refer to the client code repo.

Install

You can either pull from source and build with:

$ git clone git@github.com:cedana/cedana-cli.git
$ cd cedana-cli
$ go build

or install using our brew or apt sources.

Ubuntu

$ curl -s https://packagecloud.io/install/repositories/nravic/cedana-cli/script.deb.sh | sudo bash
$ sudo apt install cedana-cli

macOS

$ brew tap cedana/cedana-cli
$ brew install cedana-cli

Both sources are updated frequently, so run your package manager's upgrade command regularly to keep the software up to date.

Getting Started

The user experience is still a work in progress, but it is a top priority. We want Cedana to be as painless to use as possible, and much of the friction involved in setting up providers should go away as development continues. Expect frequent improvements.

Currently, Cedana supports running instances and migrating work between AWS and Paperspace (with more clouds to be added).

The fastest way to get started is to run

$ cedana-cli bootstrap

which guides you through setting up a cloud provider of your choice locally. This generates a configuration file at ~/.cedana/cedana_config.json. A sample configuration file with all the bells and whistles looks like this:

{
   "self_serve": true,
   "cedana_managed": true,
   "checkpoint": {
      "heartbeat_enabled": true,
      "heartbeat_interval_seconds": 60
   },
   "enabled_providers": [
      "aws",
      "paperspace"
   ],
   "shared_storage": {
      "mount_point": "/home/ubuntu/.cedana/",
      "dump_storage_dir": "/home/ubuntu/.cedana/"
   },
   "aws": {
      "enabled_regions": ["us-east-1", "us-west-1"],
      "enabled_instance_families": ["t2"],
      "ssh_key_path": "/Users/nravichandra/.cedana/test.pem",
      "launch_template": "caltech-test"
   },
   "paperspace": {
      "api_key": "someapikey",
      "ssh_key_path": "/home/nravic/.ssh/id_ed25519",
      "enabled_regions": ["East Coast (NY2)"]
   },
   "connection": {
      "nats_url": "0.0.0.0",
      "nats_port": 4222,
      "auth_token": "test"
   }
}

See Configuration for more details.
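
Before launching anything, it can help to sanity-check the generated file. The following is a minimal sketch in Python, not part of cedana-cli: the function name check_config and the rules it applies (every enabled provider has its own block with enabled_regions and ssh_key_path, and the connection block has all three keys) are assumptions inferred from the sample configuration above, not Cedana's own validation.

```python
import json
from pathlib import Path

# Default location named in this guide.
CONFIG_PATH = Path.home() / ".cedana" / "cedana_config.json"


def check_config(config: dict) -> list:
    """Return a list of human-readable problems found in a config dict.

    The rules are inferred from the sample config in this guide; they are
    an assumption, not Cedana's actual schema.
    """
    problems = []
    for provider in config.get("enabled_providers", []):
        block = config.get(provider)
        if not isinstance(block, dict):
            problems.append(f"provider '{provider}' is enabled but has no config block")
            continue
        if not block.get("enabled_regions"):
            problems.append(f"provider '{provider}' lists no enabled_regions")
        if not block.get("ssh_key_path"):
            problems.append(f"provider '{provider}' has no ssh_key_path")
    connection = config.get("connection", {})
    for key in ("nats_url", "nats_port", "auth_token"):
        if key not in connection:
            problems.append(f"connection block is missing '{key}'")
    return problems


if __name__ == "__main__":
    if CONFIG_PATH.exists():
        for problem in check_config(json.loads(CONFIG_PATH.read_text())):
            print("warning:", problem)
    else:
        print(f"no config found at {CONFIG_PATH}; run cedana-cli bootstrap first")
```

An empty list means the file passes these (assumed) checks; it does not guarantee that the credentials or keys themselves are valid.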

Note

If you’re having issues setting up cloud providers, take a look at Providers, which has instructions for setting them up manually.

To deploy work onto an instance, you’ll need a job and a NATS JetStream deployment. The NATS deployment coordinates the orchestrator and the runner, and acts as a simple message broker.

Deploying a NATS JetStream image is as simple as running:

$ docker run -p 4222:4222 nats -js --auth YOURAUTHTOKEN

on a pre-provisioned instance. If you are enrolled in our managed service, you can use our NATS server instead. The NATS documentation is a good resource for more information on deploying your own NATS cluster.

After it’s deployed, update the connection block in cedana_config.json to:

{
   "connection": {
      "nats_url": "INSTANCE_URL",
      "nats_port": 4222,
      "auth_token": "YOURAUTHTOKEN"
   }
}
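
A quick way to confirm that the connection block points at a live server is to try opening a TCP connection to it. The sketch below assumes only the nats_url and nats_port keys shown above; nats_reachable is a hypothetical helper, not part of cedana-cli, and it checks reachability only. It does not speak the NATS protocol or verify the auth token.

```python
import socket


def nats_reachable(connection: dict, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to nats_url:nats_port succeeds.

    `connection` is the "connection" block from cedana_config.json.
    Only reachability is tested; authentication is not.
    """
    host = connection["nats_url"]
    port = int(connection["nats_port"])
    try:
        # create_connection resolves the host and closes the socket on exit.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, after loading the configuration file with json.load, calling nats_reachable(config["connection"]) returns False if the instance is down or if a firewall blocks port 4222.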

A sample job definition (for example, job.yml) is given below:

instance_specs:
   memory_gb: 12
   cpu_cores: 2
   max_price_usd_hour: 0.2
work_dir: "/path/to/work/dir"
setup:
   run:
      - "sudo apt-get --yes install jupyter git binutils"
task:
   run:
      - "echo foo"
restored_task:
   run:
      - ""

The definition lays out the flow of work in Cedana: instance requirements, setup commands, the task itself, and a restored task. A point to highlight is the restored_task field, which lets you use any internal or application-level checkpointing, in addition to Cedana process checkpoints, to restore state after a failure or migration.
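
If you generate job definitions programmatically, the layout is simple enough to render by hand. This is a minimal sketch, assuming only the fields that appear in the sample job definition; the JobSpec class and its render method are hypothetical helpers, not part of cedana-cli. Strings are quoted with json.dumps, which produces double-quoted scalars that are also valid YAML.

```python
import json
from dataclasses import dataclass, field


@dataclass
class JobSpec:
    """Hypothetical model of the job fields shown in this guide."""
    memory_gb: int
    cpu_cores: int
    max_price_usd_hour: float
    work_dir: str
    setup: list = field(default_factory=list)
    task: list = field(default_factory=list)
    restored_task: list = field(default_factory=list)

    def render(self) -> str:
        """Render the spec as YAML text in the same layout as the sample."""
        lines = [
            "instance_specs:",
            f"   memory_gb: {self.memory_gb}",
            f"   cpu_cores: {self.cpu_cores}",
            f"   max_price_usd_hour: {self.max_price_usd_hour}",
            f"work_dir: {json.dumps(self.work_dir)}",
        ]
        for section in ("setup", "task", "restored_task"):
            commands = getattr(self, section)
            lines.append(f"{section}:")
            if not commands:
                lines.append("   run: []")  # explicit empty list
                continue
            lines.append("   run:")
            for command in commands:
                lines.append(f"      - {json.dumps(command)}")
        return "\n".join(lines) + "\n"


spec = JobSpec(
    memory_gb=12,
    cpu_cores=2,
    max_price_usd_hour=0.2,
    work_dir="/path/to/work/dir",
    setup=["sudo apt-get --yes install jupyter git binutils"],
    task=["echo foo"],
    restored_task=[""],
)
print(spec.render())
```

Writing the rendered text to job.yml gives a file with the same fields and values as the sample.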

To run your work, point cedana-cli at your job spec:

$ cedana-cli run job.yml

Cedana then walks through the following steps:

  • Looking across the selected providers and optimizing for machines that match the provided specifications

  • Launching a worker and an associated orchestrator daemon (running locally if self_serve is true) to coordinate migrations

  • Setting up the instances

  • Running the work
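
To make the first step concrete, the selection can be pictured as filtering offers by the job's instance_specs and taking the cheapest match. The sketch below is an illustration only, not Cedana's optimizer: the Offer type, the pick_offer function, and every instance name and price in SAMPLE_OFFERS are made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Offer:
    """A hypothetical priced instance offer from one provider."""
    provider: str
    instance_type: str
    memory_gb: float
    cpu_cores: int
    price_usd_hour: float


def pick_offer(
    offers: Iterable[Offer],
    enabled_providers: set,
    memory_gb: float,
    cpu_cores: int,
    max_price_usd_hour: float,
) -> Optional[Offer]:
    """Return the cheapest offer that satisfies the specs, or None."""
    candidates = [
        o for o in offers
        if o.provider in enabled_providers
        and o.memory_gb >= memory_gb
        and o.cpu_cores >= cpu_cores
        and o.price_usd_hour <= max_price_usd_hour
    ]
    return min(candidates, key=lambda o: o.price_usd_hour, default=None)


# Made-up offers; names and prices are illustrative, not real quotes.
SAMPLE_OFFERS = [
    Offer("aws", "example-a", 16, 4, 0.15),
    Offer("paperspace", "example-b", 16, 4, 0.12),
    Offer("aws", "example-c", 8, 2, 0.05),      # too little memory
    Offer("othercloud", "example-d", 32, 8, 0.01),  # provider not enabled
]

best = pick_offer(SAMPLE_OFFERS, {"aws", "paperspace"}, 12, 2, 0.2)
print(best)
```

With the sample job's requirements (12 GB, 2 cores, at most $0.20 per hour), this picks the paperspace offer example-b. A real optimizer would likely weigh more than price, such as region or spot availability.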

Below, you can see a quick demo of everything in action (including checkpoint/restore across machines).

[Demo: Cedana Launch Demo]