4_cardano_node.org

This document captures standing up Cardano nodes using divnix and terraform

  • This document belongs with https://github.com/bernokl/nix-ops-node
  • We are going to explore standing up cardano nodes using our existing divnix and terraform pattern
  • The end goal is autonomous deployment of iohk/cardano-node flakes to give us relays and then block producers.

Stand up relay node

  • Here are the steps I aim to follow:
- Deploy nixos ec2 instance in aws using terraform
- Clone the cardano-node repository.
- In the cardano-node repository, create a new file called configuration.nix.
- In the configuration.nix file, add the following code:
       {
         imports = [
           "github:input-output-hk/cardano-node?ref=master"
         ];
       }
- Run the following command to build the cardano node:
       nix build github:input-output-hk/cardano-node?ref=master
- Once the cardano node is built, you can start it by running the following command:
       nix run github:input-output-hk/cardano-node?ref=master run
- Here are some additional details about the instructions above:
  - The imports section of the configuration.nix file specifies the Nix flakes that the cardano node depends on.
  - The nix build command builds the cardano node from the Nix flakes that are specified in the imports section.
  - The nix run command starts the cardano node.
  - The cardano-cli tool is used to interact with the cardano node.
  • In my repo I am going to copy terraform/cache-server to make my relay-node folder.
  • I am going to strip the terraform down to give me just an aws instance.
  • I enable envrc with:
direnv allow .
  • That loads my aws keys into env
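  • For reference, the .envrc presumably holds something like this (illustrative placeholders only, not the real values):
# hypothetical .envrc contents - the real file holds the actual credentials
export AWS_ACCESS_KEY_ID=<access-key-id>
export AWS_SECRET_ACCESS_KEY=<secret-access-key>
export AWS_DEFAULT_REGION=ap-southeast-2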
  • TODO: Ongoing reminder that we need to think about credentials.
  • Run init
aws_terraform_init
  • Apply:
aws_terraform_apply
  • Grab the ip from the aws console and ssh in
ssh -i id_rsa.pem root@xx.xx.xx.xx
  • OK, lets run:
nix build github:input-output-hk/cardano-node?ref=master
  • Build was less than 3 minutes, lets try the run
  • Lets pass in run
nix run github:input-output-hk/cardano-node?ref=master run
  • New error! OO this is why they want you to clone the repo first, lets go look
InvalidYaml (Just (YamlException "Yaml file not found: configuration/cardano/mainnet-config.json"))

cardano-node: YAML exception:
Yaml file not found: configuration/cardano/mainnet-config.json
  • Clone repo to our server
git clone https://github.com/input-output-hk/cardano-node.git
  • cd
cd cardano-node
  • Lets create the configuration.nix
{
  imports = [
    "github:input-output-hk/cardano-node?ref=master"
  ];
}

  • Run from inside repo
nix run github:input-output-hk/cardano-node?ref=master run
  • Boom, we have a running relay.
  • The above should be very easy to add to user_data in terraform.
  • Lets strip user_data out to file and:
    • Clone repo
    • Add our configuration.nix
    • Build
    • Run
  • Lets update the main.tf to have this:
user_data = "${file("start_node.sh")}"
  • And lets go create a start_node.sh
#!/usr/bin/env bash
set -xe
git clone https://github.com/input-output-hk/cardano-node.git &&
cd cardano-node
cat << 'EOF' > configuration.nix
{
  imports = [
    "github:input-output-hk/cardano-node?ref=master"
  ];
}
EOF
yes | nix build github:input-output-hk/cardano-node?ref=master && 
echo node_done_building > /tmp/outNix
yes | nix run github:input-output-hk/cardano-node?ref=master run
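  • A quick way to check how far user_data got on a fresh host (the /tmp/outNix marker is the one start_node.sh writes above):
ls -l /tmp/outNix 2>/dev/null || echo "build marker not written yet"
pgrep -af nix | head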
  • Now we destroy the host and start the apply again so that start_node.sh can run.
  • We can manually build and start a cardano relay by running: aws_terraform_apply
  • Right now the node is up, and I see a process: nix build github:input-output-hk/cardano-node?ref=master
  • If I strace there is activity, also we are steadily using more disk space.
  • I do not understand why my manual build was so much quicker.
  • It has been building for almost exactly 2 hours. I do see the load is 12 on 4 cores meaning the cpu is not nearly keeping up.
  • I think I might have scaled to 2xlarge or even 4xlarge for the build phase in diypool, might have to consider doing the same here.
  • Will leave it to run for now, I wish I had a sense of % done
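  • A rough progress sketch in the meantime: no real percentage is available, but watching nix store growth and load gives a feel for it, and adding -L (--print-build-logs) to nix build at least streams the per-derivation logs
watch -n 60 'df -h /nix/store; uptime'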
  • It took a couple of hours of google and play, but I came up with:
nix build --accept-flake-config github:input-output-hk/cardano-node?ref=master
  • I got it to work, but still have some unexpected behaviour.
  • Let me destroy and rebuild with everything vanilla, then try to run nix build --accept-flake-config on a clean os
  • YAS, running this manually builds a node I can run in 5 minutes
nix build --accept-flake-config github:input-output-hk/cardano-node?ref=master
  • OK going to destroy and build with that in my startup_node.sh
nix build --accept-flake-config github:input-output-hk/cardano-node?ref=master &&
echo we_got_clean_build > /tmp/outNix
nix run --accept-flake-config github:input-output-hk/cardano-node?ref=master run
  • This builds the node! and I think starts it, BUT it does not say running.
  • The mature solution would be to have the run executed by a daemon service like systemd.
  • At this point I fell into a multi hour investigation into adding a traditional /etc/systemd/system/*.service I could run.
  • Turns out nix wants you to define your service in configuration.nix
  • It seems like you add it to configuration.nix something like:
config.systemd.services.interosEsMdb = {
  description = "Interos MongoDB+ES log capture";
  after = ["network.target"];
  wantedBy = ["multi-user.target"];

  serviceConfig = {
    # change this to refer to your actual derivation
    ExecStart = "${interosEsMdb}/bin/syslog-exec.sh";
    EnvironmentFile = "${interosEsMdb}/lib/es-service.env";
    Restart = "always";
    RestartSec = 1;
  };
};
  • Lots of iteration later I ended up with this:
systemd.services.cardano-node-relay-daemon = {
  enable = true;
  description = "Cardano relay daemon";
  after = ["network.target"];
  wantedBy = ["multi-user.target"];

  serviceConfig = {
    ExecStart = "${pkgs.nix}/bin/nix run --accept-flake-config github:input-output-hk/cardano-node?ref=master run";
    Restart = "always";
    User = "root";
    WorkingDirectory="/cardano-node/";
    RestartSec = 1;
  };
};
  • I also needed to add a line to startup_node.sh to start the service
systemctl start cardano-node-relay-daemon.service
  • Nice cheat to find the aws ec2 external ip; replace running with a tag or other metadata you care about
aws ec2 describe-instances --filters 'Name=instance-state-name,Values=running' --query 'Reservations[*].Instances[*].[InstanceId,PublicIpAddress]' --output text
  • returns:
i-01a7e8d4e89049894     13.239.136.44
  • And on this host I see the daemon running:
> systemctl status cardano-node-relay-daemon.service 
● cardano-node-relay-daemon.service - Cardano relay daemon
     Loaded: loaded (/etc/systemd/system/cardano-node-relay-daemon.service; enabled; preset: enabled)
     Active: active (running) since Mon 2023-05-08 13:47:51 UTC; 4h 24min ago
   Main PID: 2101 (cardano-node)
         IP: 15.2G in, 157.6M out
         IO: 316.0K read, 18.3G written
      Tasks: 16 (limit: 9155)
     Memory: 6.1G
        CPU: 8h 50min 11.556s
     CGroup: /system.slice/cardano-node-relay-daemon.service
             └─2101 /nix/store/0ndig34c9qizj3g4z1s1scwk3pxcvfzn-cardano-node-exe-cardano-node-8.0.0/bin/cardano-node>

May 08 18:12:16 ip-172-31-19-21.ap-southeast-2.compute.internal nix[2101]: [ip-172-3:cardano.node.ChainDB:Notice:35]>
May 08 18:12:18 ip-172-31-19-21.ap-southeast-2.compute.internal nix[2101]: [ip-172-3:cardano.node.ChainDB:Notice:35]>
May 08 18:12:19 ip-172-31-19-21.ap-southeast-2.compute.internal nix[2101]: [ip-172-3:cardano.node.ChainDB:Notice:35]>
May 08 18:12:20 ip-172-31-19-21.ap-southeast-2.compute.internal nix[2101]: [ip-172-3:cardano.node.ChainDB:Notice:35]>
May 08 18:12:21 ip-172-31-19-21.ap-southeast-2.compute.internal nix[2101]: [ip-172-3:cardano.node.ChainDB:Notice:35]>
May 08 18:12:23 ip-172-31-19-21.ap-southeast-2.compute.internal nix[2101]: [ip-172-3:cardano.node.ChainDB:Notice:35]>
May 08 18:12:24 ip-172-31-19-21.ap-southeast-2.compute.internal nix[2101]: [ip-172-3:cardano.node.ChainDB:Notice:35]>
May 08 18:12:25 ip-172-31-19-21.ap-southeast-2.compute.internal nix[2101]: [ip-172-3:cardano.node.ChainDB:Notice:35]>
May 08 18:12:26 ip-172-31-19-21.ap-southeast-2.compute.internal nix[2101]: [ip-172-3:cardano.node.ChainDB:Notice:35]>
May 08 18:12:28 ip-172-31-19-21.ap-southeast-2.compute.internal nix[2101]: [ip-172-3:cardano.node.ChainDB:Notice:35]>
  • Robert suggested I focus on tailscale integration next as he already had the template on hokioi
  • Here are the updates I made to get the ec2 instance running tailscale and connecting to the yumi network
 { config, lib, pkgs, modulesPath, ... }:
+let
+   system.autoUpgrade.channel = "https://nixos.org/channels/nixos-unstable";
+   nixos-unstable = import <nixos-unstable> {};
 
-{
+in {
   imports = [ "${modulesPath}/virtualisation/amazon-image.nix" ];
 
   ec2.hvm = true;
@@ -26,10 +29,47 @@
     };
   };
 
+ services.tailscale.enable = true;
+
+ systemd.services.tailscale-autoconnect = {
+    description = "Automatic connection to Tailscale";
+
+    # make sure tailscale is running before trying to connect to tailscale
+    after = [ "network-pre.target" "tailscale.service" ];
+    wants = [ "network-pre.target" "tailscale.service" ];
+    wantedBy = [ "multi-user.target" ];
+
+    # set this service as a oneshot job
+    serviceConfig.Type = "oneshot";
+
+    # have the job run this shell script
+    script = with pkgs; ''
+      # wait for tailscaled to settle
+      sleep 2
+
+      # check if we are already authenticated to tailscale
+      status="$(${tailscale}/bin/tailscale status -json | ${jq}/bin/jq -r .BackendState)"
+      if [ $status = "Running" ]; then # if so, then do nothing
+        exit 0
+      fi
+
+      # otherwise authenticate with tailscale
+      ${tailscale}/bin/tailscale up --ssh -authkey tskey-auth-########
+    '';
+};
+
+  networking.firewall = {
+    checkReversePath = "loose";
+    enable = true;
+    trustedInterfaces = [ "tailscale0" ];
+    allowedUDPPorts = [ config.services.tailscale.port ];
+  };
+
+  networking.hostName = "aws-1";
+  networking.domain = "husky-ostrich.ts.net";

   environment.systemPackages = with pkgs; [
     git
     vim
     htop
+    tailscale
     lsof
   ];
 }
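  • Quick sanity check on the host once it is up (should show the tailnet peers and the 100.x address this machine registered):
tailscale status
tailscale ip -4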

  • Note we give the machine a networking.hostName; that registers the name we want for this machine in tailscale
  • Also VERY important: once it is connected to tailscale your ssh sessions over the 100. network will be authenticated through tailscale.
  • This is a very important benefit.
  • Also very important: the authkey used needs to be set to be ephemeral, pre-authenticate the hosts and assign the tags we want for the machines.
  • This makes management very simple, but needs to be carefully managed.
  • We will integrate SOPS/1Password/Key-store to hold keys we can then hydrate on the host with env vars in our session.
  • Trying to do some testing, lets start with what we can see on the node:
journalctl -u cardano-node-relay-daemon.service

May 10 18:08:04 aws-1 nix[2100]: Event: LedgerUpdate (HardForkUpdateInEra S (S (Z (WrapLedgerUpdate {unwrapLedgerUpdate = ShelleyUpdatedProtocolUpdates []}))))
May 10 18:08:04 aws-1 nix[2100]: [aws-1:cardano.node.ChainDB:Notice:35] [2023-05-10 18:08:04.96 UTC] Chain extended, new tip: 2198c40091993baed54b4638473327b0b77c5dccaa56768690f5b56>
May 10 18:08:06 aws-1 nix[2100]: [aws-1:cardano.node.ChainDB:Notice:35] [2023-05-10 18:08:06.21 UTC] Chain extended, new tip: 2e7ccf635d45201aaf52c5a2e7e10f7c5b90a2ca5ed10356210859e>
May 10 18:08:07 aws-1 nix[2100]: [aws-1:cardano.node.ChainDB:Notice:35] [2023-05-10 18:08:07.46 UTC] Chain extended, new tip: 18ba572b54363a6bcb43bccb283828ab559cd5ddf7d73c2b0c07c51>
May 10 18:08:08 aws-1 nix[2100]: [aws-1:cardano.node.ChainDB:Notice:35] [2023-05-10 18:08:08.71 UTC] Chain extended, new tip: 87cfa4a2f2258217adbde872e2ab53906f43d95e1f9fbbbc4dc362a>
May 10 18:08:09 aws-1 nix[2100]: [aws-1:cardano.node.ChainDB:Notice:35] [2023-05-10 18:08:09.53 UTC] Chain extended, new tip: 8add136563b0c36616f42ad52815e3c666be1b8b087e4268949395c>
May 10 18:08:09 aws-1 nix[2100]: Event: LedgerUpdate (HardForkUpdateInEra S (S (Z (WrapLedgerUpdate {unwrapLedgerUpdate = ShelleyUpdatedProtocolUpdates [ProtocolUpdate {protocolUpda>
May 10 18:08:09 aws-1 nix[2100]: [aws-1:cardano.node.ChainDB:Notice:35] [2023-05-10 18:08:09.54 UTC] Chain extended, new tip: e6878f21c35b5c9c233bf54207c28dbeb0743c6cf19d1468afc78c6>
May 10 18:08:10 aws-1 nix[2100]: [aws-1:cardano.node.ChainDB:Notice:35] [2023-05-10 18:08:10.79 UTC] Chain extended, new tip: 92fc5c7b7e6b8a84623787b2b3a52400d9388958258e7cdf57c9d6f>
May 10 18:08:12 aws-1 nix[2100]: [aws-1:cardano.node.ChainDB:Notice:35] [2023-05-10 18:08:12.04 UTC] Chain extended, new tip: f02680481aa08d0d53b4d1574d063fcba04d8b10b8e91ef07c52737>
May 10 18:08:13 aws-1 nix[2100]: [aws-1:cardano.node.ChainDB:Notice:35] [2023-05-10 18:08:13.29 UTC] Chain extended, new tip: 83fdc9ce9b078a25880964ab7c87bf5e66a73849fbe06918580f738>
  • That seems like positive confirmation that we are participating in the network traffic.
  • Trying to query the node using cardano-cli turns out to be a pain.
  • First I decided to install it like this:
git clone https://github.com/input-output-hk/cardano-node.git
cd cardano-node/cardano-cli 
cabal update 
cabal build
cabal install
/root/.cabal/bin/cardano-cli --version
#returns: cardano-cli 8.1.0 - linux-x86_64 - ghc-8.10
/root/.cabal/bin/cardano-cli query tip --mainnet
# returns Missing --socket-path SOCKET_PATH
  • I have been trying to find the socket path for this node for too long.
  • Next option is to specify --socket-path when I start it up, or keep trying to resolve this absurd roadblock.
  • I do feel a bit abstracted from what I have deployed; I am sure there has to be a default socket path, I even tried mlocate.
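  • A couple of ways to hunt for a unix socket on the box (assuming the node opened one at all):
# unix-domain listeners and the process that owns them
ss -xlp | grep -i cardano
# same idea via lsof: unix sockets (-U) AND (-a) command name cardano-node (-c)
lsof -U -a -c cardano-node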
  • I noticed two ports bound to 127.0.0.1, running curl 127.0.0.1:12788 I can see there is a web page
  • Lets forward it with socat so I can hit that address over tailscale.
nix-env -i socat
  socat TCP-LISTEN:5000,reuseaddr,fork TCP:127.0.0.1:12788
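  • An ssh port-forward over tailscale would do the same job without socat, e.g.:
ssh -L 5000:127.0.0.1:12788 root@<tailscale-ip>
# then browse http://localhost:5000 from the laptop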
  • Now I go to my laptop that is logged into tailscale and visit http://100.xx.xx.72:5000
  • YAS I have a very nice dashboard with residency (memory?), allocation rate and productivity. I am not 100% sure what these relate to but they seem healthy.
  • Mmmm I feel like I might want to leave the cardano-cli to someone who understands it better, like Jack or Robert.

#current

  • We can now deploy a cardano-relay in AWS using terraform init, terraform user_data (start_node.sh) and configuration.nix
  • The node gets started as a service we define in configuration.nix: “systemctl status cardano-node-relay-daemon.service”
  • The new machine registers itself in tailscale; you can use tailscale to authenticate ssh over the 100. network, and you can find the machine by ip or networking.hostName
  • I can see healthy logs of the work the node is doing and a dashboard with healthy metrics; I still need to query and interact with the server via the cli
  • Next step is cli testing of the node. I think I will ask Jack for help with this one

Update terraform structure of nix-ops-node to implement terragrunt.

  • First lets add some folders: accounts/sandbox/ap-southeast2
  • Now we move node-relay into that region ie accounts/sandbox/ap-southeast2/node-relay-1
  • The 5 files in that directory (main.tf, terragrunt.hcl, configuration.nix, start_node.sh, variables.tf) will allow you to spin up a machine
  • Here is our layout:
terraform
├── modules
│   └── node-relay (goal would be to set common features here)
│       ├── main.tf
│       └── variables.tf
└── accounts
    └── sandbox
        ├── terragrunt.hcl (sets up our .tfstate in s3; long term this would keep common configs like machine type or other sandbox components like security_groups?)
        └── ap-southeast-2
            ├── terragrunt.hcl (I did not keep this, but do we have common components that would live here?)
            ├── node-relay-1 (this works as-is in the repo)
            │   ├── main.tf (self contained and works without the module)
            │   ├── configuration.nix (initial machine state, including tailscale and setting up the iohk/cardano-node service)
            │   ├── start_node.sh (clones the cardano-node repo, builds the flake and then starts the service we set up in configuration.nix)
            │   ├── terragrunt.hcl (passes in variables; for this POC it just passes in machine type, but can be expanded)
            │   └── variables.tf (defines the variables used by the module)
            └── node-relay-2 (experiment to be more DRY; less duplication of code in main.tf by re-using what we have in modules)
                ├── main.tf (sources modules/node-relay and passes in the variables it needs)
                ├── configuration.nix (same as above)
                ├── start_node.sh (same as above)
                ├── terragrunt.hcl (sources modules/node-relay; note the source in main.tf should not be needed, this is still in testing; provides values to variables)
                └── variables.tf (defines the variables used by the module)

  • Before you can deploy anything in the repo you will need to replace the tailscale key in configuration.nix and the whitelisted ip address with one that will ssh in.
  • TODO: Decide if we want to allow ssh outside tailscale, perhaps not? The applied machine is accessible from tailscale; we can always manually add whitelisting if we can not get to it from tailscale
  • Apply your changes with:
terragrunt init &&
terragrunt apply
  • node-relay-1 works as expected; we still manually add the TS key, whitelist an ip and end up with the SSH key locally.
  • TODO: incorporate sops to handle keys and secrets
  • node-relay-2 is WORKING! (can you tell it was a pain?)
  • node-relay-2 is more DRY, main.tf is only call out to the module.
  • Next I am going to create copies of relay-1 in 2 other regions, provide each with a unique name and key, and see how we can interact between them.
  • Our 3 regions for this proof of concept (we can always add new regions later)
ap-southeast-2 (Asia Pacific Sydney)
eu-north-1 (Europe Stockholm) 
ap-south-1 (Asia Pacific Mumbai)
  • Here are the steps to create a new region:
  • Copy a known good directory - I am copying sandbox/ap-southeast-2, naming the copy ap-south-1
  • I am going to re-use only the structure of node-relay-1 because I only need one node in each region, and architecturally I am not sold on making nodes into a module; I want this to be easily replicated to other infrastructures. I am wondering if the directory structure is too complex. I am keeping it for ease of remote state management, but I am open to improvement suggestions.
  • main.tf:
- You will need to update the provider.region to your new region
- Make sure the nix_image.source has reference to a commit that has an image specified for the region you are setting up
- Make sure any ips you need whitelisted are in the aws_security_group.ssh_and_egress port 22 ingress cidr_blocks
- TODO: if it is a relay-node we should be able to reach it over tailscale, so perhaps we want to get rid of the port 22 whitelisting in aws_security_group
- For aws_instance.machine.subnet_id you will currently need to go look this up in aws-console-vpc-subnets
- Make sure you are using the right tskey-auth-xxx 
- Update the networking.hostName 
  • We will add a .envrc to this folder to update our AWS_REGION; for ap-south-1 I did:
# I source the original envrc that gets unloaded when I cd in here
source ../../../../.envrc
# I add the region we will be using
export AWS_REGION=ap-south-1
  • This works to deploy a new node to ap-south-1, lets do the same for eu-north-1
  • Made copy of ap-south1 to create eu-north1
  • Made the same updates to main.tf, configuration.nix and .envrc
  • I set my region in aws by running “direnv allow .” from the eu-north1/
  • I confirm I am pointing to the right region with
aws configure get region
  • Next I run terragrunt init/apply
  • And I see my new aws-eu-n-1-1 node in tailscale, and YUP it has a running node.
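  • From the laptop the quick check that all the regional relays registered is:
tailscale status | grep aws-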

Block producer

  • Tomorrow morning I will set up the folder structure using the relay, then update it to create a producer
  • Here are the steps bard.google.com comes up with:
  • Notice it is lying: there is no install.sh for the cardano node, but I do see cardano-node.cabal, so I will stick with the flake build and run.
The recommended steps for deploying a Cardano block producer according to the Cardano node repository on GitHub are as follows:

- Install the latest version of Cardano node.
- Create a new directory for your block producer node.
- Copy the config.yaml file from the Cardano node repository to your new directory.
- Edit the config.yaml file to specify your pool's details, such as its name, ID, and keys.
- Start the relay node.
- Start the block producer node.
- The following are some additional details about each step:

- To install the latest version of Cardano node, you can use the following command:
       curl -sSLO https://github.com/input-output-hk/cardano-node/master/install.sh
       bash install.sh
- To create a new directory for your block producer node, you can use the following command:
       mkdir my-pool
- To copy the config.yaml file from the Cardano node repository to your new directory, you can use the following command:
       cp https://github.com/input-output-hk/cardano-node/master/config.yaml my-pool
- To edit the config.yaml file, you can use a text editor, such as Nano or Vim. In the config.yaml file, you will need to specify your pool's details, such as its name, ID, and keys. The following is an example of a config.yaml file for a pool named "My Pool":
       relay-node:
         host: "localhost"
         port: 3001

       block-producer-node:
         host: "localhost"
         port: 3002

       pool-key: "/path/to/pool/key.skey"

       vrf-key: "/path/to/vrf/key.skey"
- To start the relay node, you can use the following command:
       cardano-node \
         --config config.yaml \
         --relay-node \
         --host localhost \
         --port 3001 \
         &> relay.log &
- To start the block producer node, you can use the following command:
      cardano-node \
        --config config.yaml \
        --block-producer-node \
        --host localhost \
        --port 3002 \
        --pool-key /path/to/pool/key.skey \
        --vrf-key /path/to/vrf/key.skey \
        &> block-producer.log &
- Once you have completed these steps, your Cardano block producer node will be deployed and running. You can check the status of your block producer node using the following command:
       cardano-cli query block-producer-status
- This will output the following information:
Name: My Pool
Pool ID: xyz1234567890abcdefghijklmnopqrstuv
Status: Running
Active Stake: 1000000000 lovelace
Last Epoch: 1234567890
Last Slot: 1234567890
  • I think I can update the files for the block producer in the cloned repo.
  • Ok, lets set up a new directory in ap-southeast2 for our block producer, making a copy of node-relay-aws-1
  • Lets update configuration.nix
# I update the following service name and description.
  systemd.services.cardano-node-block-producer-daemon = {
    enable = true;
    description = "Cardano block producer daemon";
# Need to figure out if I need to pass flags into the nix run on the ExecStart line
    serviceConfig = {
      ExecStart = "${pkgs.nix}/bin/nix run --accept-flake-config github:input-output-hk/cardano-node?ref=master run";
# New tailscale auth key
# Note I destroyed all existing keys and set up a new one for this. We have been very lax with keys; will harden when we move out of sandbox
      ${tailscale}/bin/tailscale up --ssh -authkey tskey-auth-xxxxx
# Update hostName
# TODO: perhaps go back to the relays and add "-r-" to indicate relay; adding "-bp-" to this one for now.
  networking.hostName = "aws-ap-se-bp-2-1";
  • For start_node.sh I am going to comment out nix build and systemctl start so I can manually play with them.
  • And updates to main.tf:
# I am going to completely remove the port 22 whitelisting from security_groups:
-     ingress {
-         from_port   = 22
-         to_port     = 22
-         protocol    = "tcp"
-         cidr_blocks = [ "xx.xx.xx.xx/32" ]
-     }
# TODO: Do we want to allow tls_ssh keys on block producers? Only leave ssm access? Leaving it alone for now
# Do we want to give block producers their own subnet? Interconnectivity is through tailscale so I can't see a downside to separation. Leaving it for current testing
# TODO: Decide on relay vs bp subnets
  • Leaving terragrunt.hcl alone, it only contains instance_type
  • Lets init our new directory:
terragrunt init
  • Lets double check we are in the right region
aws configure get region
  • Yup it returns:
ap-southeast-2
  • And lets deploy:
terragrunt apply
  • We suddenly hit an order-of-operations issue:
   on .terraform/modules/deploy_nixos/deploy_nixos/main.tf line 129, in locals:
│  129:   ssh_private_key      = local.ssh_private_key_file == "-" ? var.ssh_private_key : file(local.ssh_private_key_file)
  • To get around this I am going to touch and chmod the file so it exists.
  • TODO: figure out why this is required for this apply
touch ./id_rsa.pem
chmod 600 ./id_rsa.pem
  • Lets try again:
terragrunt apply
  • Oops, without the port 22 whitelisting deploy_nixos cannot get onto the machine to deploy
module.deploy_nixos.null_resource.deploy_nixos: Still creating... [1m30s elapsed]
╷
│ Error: file provisioner error
│
│   with module.deploy_nixos.null_resource.deploy_nixos,
│   on .terraform/modules/deploy_nixos/deploy_nixos/main.tf line 165, in resource "null_resource" "deploy_nixos":
│  165:   provisioner "file" {
  • For now I am going to add whitelisting for my ip back in
  • Ok it is up; unexpectedly it still started the service. I do have enable set to true, but I thought that setting did not allow autostart
  • TODO: Trace service startup so you can see what it is doing.
  • Ok, I stopped the service; lets go look at our LLM steps and see if we can tease out the configuration files we need and the process for setting them up.
  • I know Jack and I did this a while back, I can go look at that as backup, but I am trying to stay in the current repo if I can help it
  • mmm the llm is a bit of a liar, might be more trouble than it is worth
  • Looking at developers.cardano.org I see there are several files involved in a BP:
- Main Config: It contains general node settings such as logging and versioning. It also points to the Byron Genesis and the Shelley Genesis file.
- Byron Genesis: It contains the initial protocol parameters and instructs the cardano-node on how to bootstrap the Byron Era of the Cardano blockchain.
- Shelley Genesis: It contains the initial protocol parameters and instructs the cardano-node on how to bootstrap the Shelley Era of the Cardano blockchain.
- optional    Alonzo Genesis: It contains the initial protocol parameters and instructs the cardano-node on how to bootstrap the Alonzo Era of the Cardano blockchain.
- optional   Conway Genesis: It contains the initial protocol parameters and instructs the cardano-node on how to bootstrap the Conway Era of the Cardano blockchain.
- Topology: It contains the list of network peers (IP Address and Port of other nodes running the blockchain network) that your node will connect to.
  • In the repo I see:
ls -al configuration/cardano
  • Returns:
total 1184
drwxr-xr-x 3 root root    4096 May 18 15:36 .
drwxr-xr-x 6 root root    4096 May 18 14:29 ..
drwxr-xr-x 2 root root    4096 May 18 14:29 alonzo
-rw-r--r-- 1 root root    9459 May 18 14:29 mainnet-alonzo-genesis.json
-rw-r--r-- 1 root root 1056360 May 18 14:29 mainnet-byron-genesis.json
-rw-r--r-- 1 root root    2885 May 18 14:29 mainnet-config.json
-rw-r--r-- 1 root root    1657 May 18 14:29 mainnet-config-new-tracing.yaml
-rw-r--r-- 1 root root    8263 May 18 14:29 mainnet-config.yaml
-rw-r--r-- 1 root root      22 May 18 14:29 mainnet-conway-genesis.json
-rw-r--r-- 1 root root     284 May 18 14:29 mainnet-p2p-toplogy.json
-rw-r--r-- 1 root root    2486 May 18 14:29 mainnet-shelley-genesis.json
-rw-r--r-- 1 root root     128 May 18 14:29 mainnet-topology.json
  • mainnet-topology.json contains:
{
  "Producers": [
    {
      "addr": "x.x.x.x",
      "port": 3001,
      "valency": 1
    }
  ]
}
  • *.configuration.json
# This was from the tutorial; note this requires magic because it refers to testnet
{
  "Protocol": "Cardano",
  "GenesisFile": "testnet-shelley-genesis.json",
  "RequiresNetworkMagic": "RequiresMagic",
# This is the same section in the 2023/5 mainnet-config.json
  "Protocol": "Cardano",
  "RequiresNetworkMagic": "RequiresNoMagic",
  "ShelleyGenesisFile": "mainnet-shelley-genesis.json"
  • It also updates protocol parameters

This protocol version number gets used by block producing nodes as part of the system for agreeing on and synchronising protocol updates. You just need to be aware of the latest version supported by the network. You don't need to change anything here.

  • It configures tracing

Tracers tell your node what information you are interested in when logging, such as switches that you can turn ON or OFF according to the type and quantity of information that you are interested in. This provides fairly coarse grained control, but it is relatively efficient at filtering out unwanted trace output.

  • It allows fine-grained logging control; I see the current file uses this setting for EKG metrics.

It is also possible to have more fine-grained control over the filtering of trace output, and to match and route trace output to particular backends. This is less efficient than the coarse trace filters above but provides much more precise control. Options: mapBackends routes metrics matching specific names to particular backends. This overrides the defaultBackends listed above. Note that it is an override and not an extension, so anything matched here will not go to the default backend, only to the explicitly listed backends. mapSubtrace is more expressive; its documentation is still being worked on.

  • On my block producers I will need to update the topology to:
nano testnet-topology.json
  {
    "Producers": [
      {
        "addr": "<YOUR RELAY NODE TAILSCALE IP ADDRESS>",
        "port": <PORT>,
        "valency": 1
      }
    ]
  }
  • On my relay nodes I will need to make the following updates:
{
  "Producers": [
    {
      "addr": "<YOUR BLOCK-PRODUCING NODE IP ADDRESS>",
      "port": <PORT>,
      "valency": 1
    },
    {
      "addr": "<OTHER RELAY NODE IP ADDRESS>",
      "port": <PORT>,
      "valency": 1
    },
    {
      "addr": "<OTHER RELAY NODE IP ADDRESS>",
      "port": <PORT>,
      "valency": 1
    }
  ]
}

  • I know how to pass in the flags I need!
nix run --accept-flake-config github:input-output-hk/cardano-node?ref=master run -- --help
  • Returns:
Usage: cardano-node run [--topology FILEPATH]
                          [--database-path FILEPATH]
                          [--socket-path FILEPATH]
                          [ --tracer-socket-path-accept FILEPATH
                          | --tracer-socket-path-connect FILEPATH
                          ]
                          [--byron-delegation-certificate FILEPATH]
                          [--byron-signing-key FILEPATH]
                          [--shelley-kes-key FILEPATH]
                          [--shelley-vrf-key FILEPATH]
                          [--shelley-operational-certificate FILEPATH]
                          [--bulk-credentials-file FILEPATH]
                          [--host-addr IPV4]
                          [--host-ipv6-addr IPV6]
                          [--port PORT]
                          [--config NODE-CONFIGURATION]
                          [--snapshot-interval SNAPSHOTINTERVAL]
                          [--validate-db]
                          [ --mempool-capacity-override BYTES
                          | --no-mempool-capacity-override
                          ]
....
with lots more available options...

  • The following will run our node with /tmp/cardano-node.socket:
nix run --accept-flake-config github:input-output-hk/cardano-node?ref=master run -- --socket-path /tmp/cardano-node.socket
  • Confirmation:
lsof /tmp/cardano-node.socket
COMMAND    PID USER   FD   TYPE             DEVICE SIZE/OFF  NODE NAME
cardano-n 7031 root   29u  unix 0xffff8f40099bee80      0t0 38007 /tmp/cardano-node.socket type=STREAM (LISTEN)
  • Can I get a woot woot? Now on a relay we simply update topology.json, start the node with our flags, get KES keys, update the producer, and we should be very close.
  • Lets see if we can use cardano-cli from the flake against our running node:
nix run .#cardano-cli -- version
  • Lets query our node
  • First we set our node.socket env var with:
export CARDANO_NODE_SOCKET_PATH=/tmp/cardano-node.socket
  • Next lets see if we can find our current tip:
nix run .#cardano-cli -- query tip --mainnet
  • YAS!
{
    "block": 4267441,
    "epoch": 197,
    "era": "Byron",
    "hash": "568fb79a14b8e10b9811a7c8252a94c8ab7afa7a4f71c343adc6748ccb20b4b1",
    "slot": 4269593,
    "slotInEpoch": 14393,
    "slotsToEpochEnd": 7207,
    "syncProgress": "47.88"
}
  • Machine torn down, updates made to the flags of our service:
"${pkgs.nix}/bin/nix run --accept-flake-config github:input-output-hk/cardano-node?ref=master run -- --topology /cardano-node/configuration/cardano/testnet-topology.json --socket-path /tmp/cardano-node.socket --config /cardano-node/configuration/cardano/testnet-config.json"
  • Lets re-provision and confirm our node starts up
  • Grab the ip:
aws ec2 describe-instances --filters 'Name=instance-state-name,Values=running' --query 'Reservations[*].Instances[*].[InstanceId,PublicIpAddress]' --output text
  • Returns:
i-07518b843710xxxxxxxxxxxxxxx     13.xx.xx.77
  • Lets ssh:
ssh -i id_rsa.pem root@13.xx.xx.77
  • Next step is to update the topology for both relay and producer, generate some keys and implement the rest of the “outstanding steps from buildCardanoStakePoolUbuntu.org”
  • I spent an hour troubleshooting why terragrunt now fails if we do not pre-create the id_rsa.pem. I understand why it fails, but not how to fix it.
  • Adding the manual step for now.
touch id_rsa.pem
chmod 600 id_rsa.pem
  • OK, I pull in the current configs and update the config file names to match our flake; that allows the testnet relay to come up automatically
  • Section I added to start_node.sh:
/run/current-system/sw/bin/curl -o /cardano-node/configuration/cardano/testnet-topology.json https://book.world.dev.cardano.org/environments/preprod/topology.json &
/run/current-system/sw/bin/curl -o /cardano-node/configuration/cardano/testnet-byron-genesis.json https://book.world.dev.cardano.org/environments/preprod/byron-genesis.json &
/run/current-system/sw/bin/curl -o /cardano-node/configuration/cardano/testnet-shelley-genesis.json https://book.world.dev.cardano.org/environments/preprod/shelley-genesis.json &
/run/current-system/sw/bin/curl -o /cardano-node/configuration/cardano/testnet-alonzo-genesis.json https://book.world.dev.cardano.org/environments/preprod/alonzo-genesis.json &
/run/current-system/sw/bin/curl -o /cardano-node/configuration/cardano/testnet-conway-genesis.json https://book.world.dev.cardano.org/environments/preprod/conway-genesis.json  &
/run/current-system/sw/bin/curl -o /cardano-node/configuration/cardano/testnet-config.json https://book.world.dev.cardano.org/environments/preprod/config.json &
# Wait for the background downloads above to finish before patching the config
wait
# Fix paths set in official config.json
sed -i 's/conway-genesis.json/testnet-conway-genesis.json/g' /cardano-node/configuration/cardano/testnet-config.json
sed -i 's/alonzo-genesis.json/testnet-alonzo-genesis.json/g' /cardano-node/configuration/cardano/testnet-config.json
sed -i 's/byron-genesis.json/testnet-byron-genesis.json/g' /cardano-node/configuration/cardano/testnet-config.json
sed -i 's/shelley-genesis.json/testnet-shelley-genesis.json/g' /cardano-node/configuration/cardano/testnet-config.json
  • Confirmed it is working in ap-southeast-2, lets do the same in eu-north-1
  • The node is up and the topology file is confirmed pointing to preprod, lets connect
export CARDANO_NODE_SOCKET_PATH=/tmp/cardano-node.socket
nix run .#cardano-cli -- query tip --testnet-magic 1
  • Returns:
{
    "block": 223797,
    "epoch": 30,
    "era": "Babbage",
    "hash": "7510ea7eba8e63764dd5c14689a47fa15cea0875adf0fd311ed394c5f7746e91",
    "slot": 11618608,
    "slotInEpoch": 300208,
    "slotsToEpochEnd": 131792,
    "syncProgress": "43.01"
}

[root@aws-eu-n-1-nr-1:/cardano-node]
ssh -i id_rsa.pem root@xx.xx.xx.xx
  • Lets create a directory to store everything in:
mkdir -p $HOME/cardano-testnet/keys
cd $HOME/cardano-testnet/keys
  • Lets generate the payment pair:
cd /cardano-node/
  nix run .#cardano-cli -- address key-gen \
      --verification-key-file $HOME/cardano-testnet/keys/payment.vkey \
      --signing-key-file $HOME/cardano-testnet/keys/payment.skey
  • Create new stakepool address pair:
nix run .#cardano-cli -- stake-address key-gen \
 --verification-key-file $HOME/cardano-testnet/keys/stake.vkey \
 --signing-key-file $HOME/cardano-testnet/keys/stake.skey
  • Generate a wallet address for the payment key payment.vkey which will delegate to the stake address stake.vkey:
nix run .#cardano-cli --  address build \
  --payment-verification-key-file $HOME/cardano-testnet/keys/payment.vkey \
  --out-file $HOME/cardano-testnet/keys/payment.addr \
  --testnet-magic 1
  • With the output of the payment.addr I can see the address on chain and that it has no money in it.
nix run .#cardano-cli -- query utxo --address $(cat payment.addr)  --testnet-magic 1
warning: Git tree '/cardano-node' is dirty
                           TxHash                                 TxIx        Amount
--------------------------------------------------------------------------------------
  • Next, go register on the faucet so we can get some ada to test with: https://docs.cardano.org/cardano-testnet/tools/faucet/
  • I did initially request the funds for preview instead of preprod; it did not give an error, but I did not get funds
  • After requesting the funds on the right network, lets see:
nix run .#cardano-cli --  query utxo --address addr_test1vqvpj9hm86lnp7n3p5qjkm7df38av6k302rcx788f287uaqdrernh --testnet-magic 1
  • YAS, we have funds:
                           TxHash                                 TxIx        Amount
--------------------------------------------------------------------------------------
86f4a5ffc63d317a7eaf28ca86819511ea3eeb90d7696faae6efd177c6cfe687     0        10000000000 lovelace + TxOutDatumNone
  • Create some cold keys and counter
nix run .#cardano-cli -- node key-gen \
    --cold-verification-key-file $HOME/cardano-testnet/keys/cold.vkey \
    --cold-signing-key-file $HOME/cardano-testnet/keys/cold.skey \
    --operational-certificate-issue-counter $HOME/cardano-testnet/keys/cold.counter
  • Generate KES keys
cd /cardano-node/
nix run .#cardano-cli -- node key-gen-KES \
    --verification-key-file $HOME/cardano-testnet/keys/kes.vkey \
    --signing-key-file $HOME/cardano-testnet/keys/kes.skey
  • Make a VRF key pair.
nix run .#cardano-cli -- node key-gen-VRF \
    --verification-key-file $HOME/cardano-testnet/keys/vrf.vkey \
    --signing-key-file $HOME/cardano-testnet/keys/vrf.skey
  • Update permissions on skey to set it to readonly
chmod 400 $HOME/cardano-testnet/keys/vrf.skey

#Current:

slotsPerKESPeriod=$(cat /cardano-node/configuration/cardano/testnet-shelley-genesis.json | jq -r '.slotsPerKESPeriod')
echo slotsPerKESPeriod: ${slotsPerKESPeriod}

slotNo=$(nix run .#cardano-cli -- query tip --testnet-magic 1 | jq -r '.slot')
echo slotNo: ${slotNo}
  • Boo:
jq: command not found
  • For now we will install it in our session; in future we will add the package to our configuration.nix
nix-env -i jq
  • Lets try getting the slotNo again
  • Yas, this time we get:
 echo slotNo: ${slotNo}
slotNo: 29254618
  • Find kesPeriod by dividing the slot tip number by slotsPerKESPeriod.
kesPeriod=$((${slotNo} / ${slotsPerKESPeriod}))
echo kesPeriod: ${kesPeriod}
startKesPeriod=${kesPeriod}
echo startKesPeriod: ${startKesPeriod}
  • Returns:
echo kesPeriod: ${kesPeriod}
kesPeriod: 225

echo startKesPeriod: ${startKesPeriod}
startKesPeriod: 225
  • With this calculation we can generate an operational certificate for the pool. Update {startKesPeriod} in the script below with the value calculated above.
cd /cardano-node/
nix run .#cardano-cli -- node issue-op-cert \
      --kes-verification-key-file $HOME/cardano-testnet/keys/kes.vkey \
      --cold-signing-key-file $HOME/cardano-testnet/keys/cold.skey \
      --operational-certificate-issue-counter $HOME/cardano-testnet/keys/cold.counter \
      --kes-period ${startKesPeriod} \
      --out-file $HOME/cardano-testnet/keys/node.cert
  • Lets set the env vars, see if we can run our node with keys:
KES=$HOME/cardano-testnet/keys/kes.skey
VRF=$HOME/cardano-testnet/keys/vrf.skey
CERT=$HOME/cardano-testnet/keys/node.cert
  • First lets stop the service that is currently running our node
systemctl stop cardano-node-block-producer-daemon.service 
  • Let us manually try to run our node with the new values:
nix run --accept-flake-config github:input-output-hk/cardano-node?ref=master run -- --topology /cardano-node/configuration/cardano/testnet-topology.json --socket-path /tmp/cardano-node.socket --port 6001 --config /cardano-node/configuration/cardano/testnet-config.json --shelley-kes-key ${KES} --shelley-vrf-key ${VRF} --shelley-operational-certificate ${CERT}
  • It is coming up and looking healthy; I can port-forward and see ekg stats.
  • Struggling to confirm it is running as a Core node, but I do see “TraceNodeNotLeader” so that seems good enough for now
  • Next I will go register our stakepool https://developers.cardano.org/docs/operate-a-stake-pool/register-stake-address
  • Register Stake Address:
  • Query the UTXO of the address that pays for the transaction and deposit:
cd /cardano-node/ 
nix run .#cardano-cli -- query utxo \
    --address $(cat $HOME/cardano-testnet/keys/payment.addr) \
    --testnet-magic 1 > $HOME/cardano-testnet/keys/fullUtxo.out
  • Confirm
cat $HOME/cardano-testnet/keys/fullUtxo.out
  • I see the txHash and details
  • Find out current slot:
cd /cardano-node/ 
currentSlot=$(nix run .#cardano-cli -- query tip --testnet-magic 1 | jq -r '.slot')
echo Current Slot: $currentSlot
  • Create a stake address registration certificate
cd /cardano-node/ 
nix run .#cardano-cli -- stake-address registration-certificate \
    --stake-verification-key-file $HOME/cardano-testnet/keys/stake.vkey \
    --out-file $HOME/cardano-testnet/keys/stake.cert
  • Now, we build the transaction which will return the tx.raw transaction file and also the transaction fees:
cd /cardano-node
nix run .#cardano-cli -- transaction build \
      --tx-in 2c62f98035ee9c1a6da177d5aa5f69b82acb7134cae940dd35188a353f411ff8#0 \
      --tx-out $(cat $HOME/cardano-testnet/keys/payment.addr)+1000000 \
      --change-address $(cat $HOME/cardano-testnet/keys/payment.addr) \
      --testnet-magic 1 \
      --certificate-file $HOME/cardano-testnet/keys/stake.cert \
      --invalid-hereafter $(( ${currentSlot} + 1000)) \
      --witness-override 2 \
      --out-file $HOME/cardano-testnet/keys/tx.raw
  • Output:
Estimated transaction fee: Lovelace 172013
  • Go find the deposit amount in the protocol parameters
cd /cardano-node/
nix run .#cardano-cli -- query protocol-parameters \
      --testnet-magic 1  \
      --out-file $HOME/cardano-testnet/keys/protocol.json

stakeAddressDeposit=$(cat $HOME/cardano-testnet/keys/protocol.json | jq -r '.stakeAddressDeposit')
echo $stakeAddressDeposit
  • Returns:
2000000
  • Next, the complete transaction output is calculated by subtracting the deposit and transaction fees from the amount we have in our payment address:
txOut=$((10000000000-${stakeAddressDeposit}-172013))
echo ${txOut}
  • Now we have all the information in place to build the final transaction file:
cd /cardano-node/
nix run .#cardano-cli -- transaction build-raw \
      --tx-in 2c62f98035ee9c1a6da177d5aa5f69b82acb7134cae940dd35188a353f411ff8#0 \
      --tx-out $(cat $HOME/cardano-testnet/keys/payment.addr)+${txOut} \
      --invalid-hereafter $((${currentSlot} + 1000)) \
      --fee 172013 \
      --certificate-file $HOME/cardano-testnet/keys/stake.cert \
      --out-file $HOME/cardano-testnet/keys/tx.raw
  • Sign the transaction with both the payment and stake secret keys:
cd /cardano-node/
nix run .#cardano-cli -- transaction sign \
      --tx-body-file $HOME/cardano-testnet/keys/tx.raw \
      --signing-key-file $HOME/cardano-testnet/keys/payment.skey \
      --signing-key-file $HOME/cardano-testnet/keys/stake.skey \
      --testnet-magic 1 \
      --out-file $HOME/cardano-testnet/keys/tx.signed
  • Lets go submit our signed transaction
nix run .#cardano-cli -- transaction submit \
    --tx-file $HOME/cardano-testnet/keys/tx.signed \
    --testnet-magic 1 
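  • A sketch to sanity-check that the registration landed on chain (assuming we also build a bech32 stake.addr from stake.vkey, which the steps above never show explicitly):
cd /cardano-node/
nix run .#cardano-cli -- stake-address build \
    --stake-verification-key-file $HOME/cardano-testnet/keys/stake.vkey \
    --out-file $HOME/cardano-testnet/keys/stake.addr \
    --testnet-magic 1
# after the transaction settles, the address should show as registered
nix run .#cardano-cli -- query stake-address-info \
    --address $(cat $HOME/cardano-testnet/keys/stake.addr) \
    --testnet-magic 1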
  • Next Register a Stake Pool with Metadata
  • Create a json file with your metadata
{
    "name": "TestPool",
    "description": "The pool that tests all the pools",
    "ticker": "TEST",
    "homepage": "https://teststakepool.com"
}
  • Lets get the hash for our metadata file:
nix run .#cardano-cli -- stake-pool metadata-hash --pool-metadata-file $HOME/cardano-testnet/poolMetadata.json
  • Returns:
8292a9e45df8a72f975d6222690ce5bf3fe4a34ac72e272577dad47d075e7582
  • Lets Generate the stake pool registration certificate
  • NOTE: for the url below I took the permalink for a file I put in github and ran it through tinyurl because the permalink is too long for the cert
nix run .#cardano-cli -- stake-pool registration-certificate \
    --cold-verification-key-file $HOME/cardano-testnet/keys/cold.vkey \
    --vrf-verification-key-file $HOME/cardano-testnet/keys/vrf.vkey \
    --pool-pledge 10000 \
    --pool-cost 340000000 \
    --pool-margin 1 \
    --pool-reward-account-verification-key-file $HOME/cardano-testnet/keys/stake.vkey \
    --pool-owner-stake-verification-key-file $HOME/cardano-testnet/keys/stake.vkey \
    --testnet-magic 1 \
    --pool-relay-ipv4 16.16.199.77 \
    --pool-relay-port 41783 \
    --metadata-url https://tinyurl.com/yc5xxnke \
    --metadata-hash 8292a9e45df8a72f975d6222690ce5bf3fe4a34ac72e272577dad47d075e7582  \
    --out-file $HOME/cardano-testnet/keys/pool-registration.cert
  • Lets check which ports cardano-node is listening on:
sudo netstat -tulpen | grep cardano-node
tcp        0      0 0.0.0.0:41783           0.0.0.0:*               LISTEN      0          17293      2111/cardano-node   
tcp        0      0 127.0.0.1:12798         0.0.0.0:*               LISTEN      0          19353      2111/cardano-node   
tcp        0      0 127.0.0.1:12788         0.0.0.0:*               LISTEN      0          19352      2111/cardano-node   
tcp6       0      0 :::34509                :::*                    LISTEN      0          17294      2111/cardano-node   

Is that 41783 listener what my node is creating?

  • Lets confirm and see what is using port 41783
sudo lsof -i -P -n | grep 41783
  • Returns:
cardano-n 2111   root   32u  IPv4  17293      0t0  TCP *:41783 (LISTEN)
  • Lets see what we made:
cat $HOME/cardano-testnet/keys/pool-registration.cert
  • EYEEE, it looks like a certificate:
{
    "type": "CertificateShelley",
    "description": "Stake Pool Registration Certificate",
    "cborHex": "8a03581c045132653833613fb65c1bc9669d3fe617ad65361b2bc8490d51beee5820f91aec34939da643299bacaad2bcba7a0a140d6450c593238f8f57cf0d1ff0c71927101961a8d81e820101581de1a3fcba1fc8f9a73c05d114f79c465c2b62571d17d42fafcccc86557881581ca3fcba1fc8f9a73c05d11.....
}
  • To honor your pledge, create a delegation certificate:
cd /cardano-node/
nix run .#cardano-cli -- stake-address delegation-certificate \
      --stake-verification-key-file $HOME/cardano-testnet/keys/stake.vkey \
      --cold-verification-key-file $HOME/cardano-testnet/keys/cold.vkey \
      --out-file $HOME/cardano-testnet/keys/delegation.cert
  • Draft the transaction to submit the registration certificate to the blockchain:
  • NOTE: You can find the TxHash#TxIx by running “cardano-cli query utxo --address payment.addr”
cd /cardano-node/
  nix run .#cardano-cli -- transaction build-raw \
      --tx-in 300ab13a2539d437d9144fd35bbfbd3c74cc4825f8773f2da54bbcf81e74a1c4#0 \
      --tx-out $(cat $HOME/cardano-testnet/keys/payment.addr)+0 \
      --invalid-hereafter 0 \
      --fee 0 \
      --out-file $HOME/cardano-testnet/keys/tx.draft \
      --certificate-file $HOME/cardano-testnet/keys/pool-registration.cert \
      --certificate-file $HOME/cardano-testnet/keys/delegation.cert
  • Calculate the fees:
cd /cardano-node/
  nix run .#cardano-cli -- transaction calculate-min-fee \
      --tx-body-file $HOME/cardano-testnet/keys/tx.draft \
      --tx-in-count 1 \
      --tx-out-count 1 \
      --witness-count 3 \
      --byron-witness-count 0 \
      --testnet-magic 1 \
      --protocol-params-file $HOME/cardano-testnet/keys/protocol.json
  • Returns:
194761 Lovelace
  • I look up the pool deposit, I know there is a better way:
grep PoolDeposit $HOME/cardano-testnet/keys/protocol.json
   "stakePoolDeposit": 500000000,

  • Lets look up our UTxO Balance
nix run .#cardano-cli --  query utxo --address addr_test1vr5m5e9ws0mfwrc8hupp0ty9srzjfvjwagxqk6qa7vdfa7cvkkp98 --testnet-magic 1

Returns the hash we used in our build-raw above

                           TxHash                                 TxIx        Amount
--------------------------------------------------------------------------------------
300ab13a2539d437d9144fd35bbfbd3c74cc4825f8773f2da54bbcf81e74a1c4     0        9997827987 lovelace + TxOutDatumNone
  • Calculate the change for tx-out:
expr <UTxO BALANCE> - <poolDeposit> - <TRANSACTION FEE>
  • So in our case
expr 9997827987 - 500000000 - 194761
  • Gives me:
9497633226
  • Build the transaction
cd /cardano-node/
  nix run .#cardano-cli -- transaction build-raw \
      --tx-in 300ab13a2539d437d9144fd35bbfbd3c74cc4825f8773f2da54bbcf81e74a1c4#0 \
      --tx-out $(cat $HOME/cardano-testnet/keys/payment.addr)+9497633226 \
      --invalid-hereafter $(( ${currentSlot} + 1000))  \
      --fee 194761 \
      --out-file $HOME/cardano-testnet/keys/tx.raw \
      --certificate-file $HOME/cardano-testnet/keys/pool-registration.cert \
      --certificate-file $HOME/cardano-testnet/keys/delegation.cert
  • Sign the transaction:
cd /cardano-node/
  nix run .#cardano-cli -- transaction sign \
      --tx-body-file $HOME/cardano-testnet/keys/tx.raw \
      --signing-key-file $HOME/cardano-testnet/keys/payment.skey \
      --signing-key-file $HOME/cardano-testnet/keys/stake.skey \
      --signing-key-file $HOME/cardano-testnet/keys/cold.skey \
      --testnet-magic 1 \
      --out-file $HOME/cardano-testnet/keys/tx.signed
  • Submit the transaction:
cd /cardano-node/
  nix run .#cardano-cli -- transaction submit \
      --tx-file $HOME/cardano-testnet/keys/tx.signed \
      --testnet-magic 1
  • Lets see if we can find our pool, first get the poolid
cd /cardano-node
  nix run .#cardano-cli --  stake-pool id --cold-verification-key-file $HOME/cardano-testnet/keys/cold.vkey --output-format "hex"
  • And lets look for the pool on the network ledger
cd /cardano-node/
  nix run .#cardano-cli -- query ledger-state --testnet-magic 1 | grep publicKey | grep 045132653833613fb65c1bc9669d3fe617ad65361b2bc8490d51beee
  • SOOO COOL. I would like to use pooltool or another explorer to get more details, but I can see the id in the ledger; need to find more metadata.
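  • One more check that avoids grep-ing the full ledger-state: take the bech32 pool id and look for it in the registered pool set (sketch, same keys as above):
cd /cardano-node/
poolId=$(nix run .#cardano-cli -- stake-pool id \
    --cold-verification-key-file $HOME/cardano-testnet/keys/cold.vkey \
    --output-format bech32)
nix run .#cardano-cli -- query stake-pools --testnet-magic 1 | grep ${poolId}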

Secrets:

  • To get this done I need to:
  • Fix startup script for bp to use aws secrets
  • Fix topology between nodes, go read relay topology section, the talk about bp
  • Get visibility, viewGL.sh?, ekg (what else does it show?), grafana (what already exists), prometheus exporter, what do I get?
  • Current process:
  • Lets set the env vars, see if we can run our node with keys:
KES=$HOME/cardano-testnet/keys/kes.skey
VRF=$HOME/cardano-testnet/keys/vrf.skey
CERT=$HOME/cardano-testnet/keys/node.cert
  • First lets stop the service that is currently running our node
systemctl stop cardano-node-block-producer-daemon.service 
  • Let us manually try to run our node with the new values:
nix run --accept-flake-config github:input-output-hk/cardano-node?ref=master run -- --topology /cardano-node/configuration/cardano/testnet-topology.json --socket-path /tmp/cardano-node.socket --port 6001 --config /cardano-node/configuration/cardano/testnet-config.json --shelley-kes-key ${KES} --shelley-vrf-key ${VRF} --shelley-operational-certificate ${CERT}
  • So we set env vars and run the daemon, passing those in. How do you do that for a service?
  • Lets ask an LLM for its opinion.
  • It suggests creating an environment file I then reference from the service
  • I like that idea as it allows us to iterate on making the creation of that environment file and the keys secure.
  • Lets see how we go, setting up a file that looks exactly like the KES, VRF, CERT settings above
  • Updating the service to something like this. (I could never get the environment file to work, so I decided to switch to a script that sets the vars)
[Unit]
Description=block-producer

[Service]
EnvironmentFile=/root/cardano/testnet/environment
ExecStart= nix run --accept-flake-config github:input-output-hk/cardano-node?ref=master run -- --topology /cardano-node/configuration/cardano/testnet-topology.json --socket-path /tmp/cardano-node.socket --port 6001 --config /cardano-node/configuration/cardano/testnet-config.json --shelley-kes-key ${KES} --shelley-vrf-key ${VRF} --shelley-operational-certificate ${CERT}

[Install]
WantedBy=multi-user.target
  • Lets see what that looks like in our configuration.nix
  • I tried various incarnations, declaring an s3 json file, setting that to /root/secret and a few others, but never got it to work right.
  • I think the right thing would be to call a key-store like aws secrets manager or similar for this.
  • To get around this for now I am going to add the env var setting and startup script to a sh I will call from ExecStart; that way terraform has a valid object and the vars can be set at runtime, I hope.
  • For now create it statically; add the following to start_node.sh
# Create control script for the node
cat << 'EOF' > /run/run_bp
source /root/cardano-testnet/environment
/run/current-system/sw/bin/nix run --accept-flake-config github:input-output-hk/cardano-node?ref=master run -- --topology /cardano-node/configuration/cardano/testnet-topology.json --socket-path /tmp/cardano-node.socket --port 6001 --config /cardano-node/configuration/cardano/testnet-config.json --shelley-kes-key ${KES} --shelley-vrf-key ${VRF} --shelley-operational-certificate ${CERT}
EOF
  • At this point we have to manually create the environment file:
export KES=$HOME/cardano-testnet/keys/kes.skey
export VRF=$HOME/cardano-testnet/keys/vrf.skey
export CERT=$HOME/cardano-testnet/keys/node.cert
  • We also create those files outside of the provisioning steps.
  • One way to move them off is to add them as aws secrets in terraform
  • Playing with this:
user_data = templatefile("${path.module}/start_node.sh", {
   KES = data.aws_secretsmanager_secret_version.testnet/kes.skey.secret_string
   VRF = data.aws_secretsmanager_secret_version.testnet/vrf.skey.secret_string
   CERT = data.aws_secretsmanager_secret_version.testnet/node_cert.secret_string
 })

  • Troubleshooting: lots of iteration trying to get tf to show me the variables with output.
  • Without sensitive = true it errors out trying to avoid exposing secrets; with it set to true the output is redacted in the apply
output "user_data_script" {
  sensitive = true
  value = aws_instance.machine.user_data
}
  • Finally learned on a snarky stackoverflow thread: after the apply, run:
terragrunt output user_data_script
  • It takes a minute, but shows me that my variables are not being substituted....
  • Solution: what I implemented after some iteration. At the end I will discuss some improvements.
  • I moved the command to run the node into a script of its own; I like a modular script I can tweak for env vars etc without impacting node_start.sh
  • I do end up with the secrets written to files I use in my startup.
  • I used a remote-exec provisioner added to aws_instance; I do not like this solution, it is crude.
  • It gives us something to iterate on; I need adult supervision for this implementation:
  • Here are the updates I ended up making to main.tf:
# This is a first pass at pulling a secret out, will need to play
# I do not like that the secret arn goes here, but I have not found a good way to keep the arn out of git.
data "aws_secretsmanager_secret_version" "kes_secret" {
  secret_id = "arn:aws:secretsmanager:ap-southeast-2:407250907589:secret:testnet/kes.skey-LR7hrJ"
}
#....etc for all 3

#This is inside resource "aws_instance"
  connection {
    type        = "ssh"
    host        = aws_instance.machine.public_ip
    user        = "root"
    private_key = file("${path.module}/id_rsa.pem")
  }
  provisioner "remote-exec" {
# I did lots of iteration with env vars I would then pass to user_data, but this was the only reliable way I found to access the secrets from the daemon start. I really want a cleaner way, but with these secured.
    inline = [
      "echo '${data.aws_secretsmanager_secret_version.cert_secret.secret_string}' > /run/node_cert",
      "echo '${data.aws_secretsmanager_secret_version.vrf_secret.secret_string}' > /run/vrf_secret",
      "echo '${data.aws_secretsmanager_secret_version.kes_secret.secret_string}' > /run/kes_secret",
    ]
  }
  
# This was a multi-hour research gem on why deploy_nixos stopped running after I added the remote-exec.
# Adding this depends_on seems to make deploy_nixos run every time I update the machine. Not sure if there is a better way;
# I know I can manually terraform apply the deploy_nixos module, leaving this in for now.
+    depends_on = [aws_instance.machine]

  • In node_start.sh I create a run_bp that I will call from the service definition in configuration.nix:
# Create control script for the node
cat << EOF > /run/run_bp
# source /root/cardano-testnet/environment
/run/current-system/sw/bin/nix run --accept-flake-config github:input-output-hk/cardano-node?ref=master run -- --topology /cardano-node/configuration/cardano/testnet-topology.json --socket-path /tmp/cardano-node.socket --port 6001 --config /cardano-node/configuration/cardano/testnet-config.json --shelley-kes-key /run/kes_secret --shelley-vrf-key /run/vrf_secret --shelley-operational-certificate /run/node_cert


EOF

chmod 700 /run/run_bp
chmod 600 /run/kes_secret
chmod 600 /run/vrf_secret
chmod 600 /run/node_cert
  • And then I update the ExecStart in the cardano serviceConfig section of configuration.nix to just call the run_bp we created.
    serviceConfig = {
-      ExecStart = "${pkgs.nix}/bin/nix run --accept-flake-config github:input-output-hk/cardano-node?ref=master run -- --topology /cardano-node/configuration/cardano/testnet-topology.json --socket-path /tmp/cardano-node.socket --port 6001 --config /cardano-node/configuration/cardano/testnet-config.json --shelley-kes-key ${KES} --shelley-vrf-key ${VRF} --shelley-operational-certificate ${CERT}";
 +     ExecStart = "${pkgs.bash}/bin/bash -c /run/run_bp";
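  • After the next deploy, confirming the unit picked up the new ExecStart is just:
systemctl cat cardano-node-block-producer-daemon.service | grep ExecStart
journalctl -fu cardano-node-block-producer-daemon.service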
 
  • TODO:
  • IMPROVEMENTS:
  • The aws secrets need to be env vars, or at least the current file-based method needs to be vetted for better design and security

Additional information

Next steps:

Current

  • I still need some validation on what I did, but sandbox has a working testnet bp, relay node and stake-pool
  • I have a working producer that will start up based on secrets I set in aws.
  • Next I am going to split the work into a private yumi repo where I will clean up cruft, harden steps and create a README for the final deploy.
  • I still also need to fix topology and create some monitoring.
  • Keep in mind ARM cost savings once we are running.

Archive of old steps, this is all done

  • Add final parts to make prod ready and move on to block-producer
  • The node gets started as a service we define in configuration.nix: “systemctl status cardano-node-relay-daemon.service”
  • The new machine registers itself in tailscale; you can use tailscale to authenticate ssh over the 100. network, and you can find the machine by ip or networking.hostName
  • I can see healthy logs of the work the node is doing and a dashboard with healthy metrics; I still need to query and interact with the server via the cli
  • Document and test what we have
  • The terragrunt research culminated in a restructure that allows us to spin up 3 relay-nodes, each checking in from a different region, each configured with tailscale
  • We are now in the process of re-applying what overlaps to a block-producer
  • Research shows we have the files we need to change in input-output-hk/cardano-node/configuration/cardano
  • I figured out how to pass in the flags we need (duh); now I can create and set the node socket and even query our node
  • Set up configs to auto provision 3 relays and a bp node using testnet in sandbox
  • Next step is to update the topology for both relay and producer, generate some keys and implement the rest of the “outstanding steps from buildCardanoStakePoolUbuntu.org”
  • Continue block producer tutorial here: https://developers.cardano.org/docs/operate-a-stake-pool/block-producer-keys#stakepool-operational-certificate-generation
  • Worked through that link and now I have a block producer with a registered stake pool.
  • Things to keep in mind:
- Still need to figure out how we set configurations for the node ie whitelist block producer etc
- iptables
- networks in aws, do we allow any internal communication through aws or will everything flow through tailscale?
- still need to think about key management, aegis/sops

Troubleshooting tips

  • Don’t forget to do tcpdump to see where it is trying to get artifacts from
  • Things to keep in mind:
- iptables
- network groups in aws
- still need to think about key management