IDEX Staking Node Ansible Role

Description of, and motivation behind, the development of an Ansible role to set up an IDEX staking node.

I decided to start participating in some crypto projects. I stumbled upon IDEX and found it interesting enough. I liked the idea that external participants are able to collaborate with the market itself in some capacity. For the time being, this is restricted to being a Tier 3 Staker, which basically keeps track of the trading history and provides it to the IDEX user client.

The first requirement for setting one up is to hold the minimum number of IDEX tokens (at the time of writing, 10k IDEX tokens) and to have held them for a period of 7 days.

The pains of following the instructions

My initial attempt to set up the host was to follow the instructions in the IDEX GitHub.

I had some issues installing @idexio/idexd-cli, which I was able to overcome after googling a bit. But once I got it working, the biggest issues started.

The parity node that is bootstrapped through the docker-compose file ended up hanging at random times, making the staking node hang. They recommend using Infura in order to avoid this, which I've found really helps.

The other thing I found is that, for some reason, the IDEX staking node would consume 100% CPU even though the logs basically said "waiting for new blocks".
I wanted to see if it was possible to limit the CPU so as not to be burning my resources, but couldn't find a way.

In any case, the most painful part of all this is that everything is really manual, and when I tried to automate it I found a lot more issues. The CLI is not designed to be used in an automated way.

Dig in, keep the important stuff and discard the rest

Up until that moment I hadn't looked into how the actual staker was set up. I decided to see how it really worked and whether it was possible to make it more automation friendly.

The first thing I found out was that the whole CLI could be removed without losing any of the core functionality. The most important thing the CLI does is generate the settings.json that is then mounted in the staking node docker container.

I basically looked into the CLI to see how it launched the containers and found that the key thing was this docker-compose.yml file:

...
services:
    parity:
        image: parity/parity:stable
        env_file: aurad_config.env
        volumes:
            - parity:/eth
        ...
    mysql:
        image: mysql:5.7
        env_file: aurad_config.env
        ...
    idexd:
        image: idexio/idexd:0.2.0
        depends_on:
          - "mysql"
        volumes:
          - type: bind
            source: ${HOME}/.idexd/downloads
            target: /usr/idexd/downloads
          - type: bind
            source: ${HOME}/.idexd/ipc
            target: /usr/idexd/ipc
        stop_signal: SIGINT
        stop_grace_period: 20s
        command: ["start", "pm2.config.js", "--no-daemon", "--only", "worker", "--kill-timeout", "5000"]
        ...
        ports:
            - "8080:8080"
            - "8443:8443"
        env_file: aurad_config.env
        environment:
          - RPC_HOST
          - RPC_PROTOCOL
          - RPC_PORT
          - STAKING_HOST
          - SSL_PRIVATE_KEY_PATH
          - SSL_CERT_PATH
        ...

I used this as my foundation to launch a docker container using the idexio/idexd image, passing the corresponding environment variables, mounting the appropriate volumes and using a separate instance of MySQL.

That worked well and would have been fairly straightforward to automate. I started doing so in Ansible, until I found that the staking node would hang and not resume.
This would happen when my box lost connectivity for some time; I was running my staking node on my home server and my router was not working properly at the time. The node would basically print a message saying that there had been a timeout or that a health-check was missed, and then stay in that state forever.

Because of that, I started looking into how the staking node server was being run.
The more I looked into it, the more I realized that I could remove layers.
Something that my experience in software engineering has taught me is that the more things you add to a project, the more places there are for an error to happen.

By looking at the Dockerfile and the command used in the docker-compose, I realized they were using pm2 to launch the staking node.
I don't have any experience with node.js, but based on the pm2 site, it is basically a process manager for node applications. In this project it seems to be used as a watchdog to restart the staking server in case of failure. I would like to know why it was used when you can trust docker to restart the container automatically in case of failure. In any case, this seemed to be the cause of my server getting stuck after losing connectivity for some time, so I decided to see how to remove it.

It didn't take much effort to understand how pm2 was set up to launch the staking node server. Looking at pm2.config.js, it is obvious that it just launches (using node, of course) the lib/index.js file relative to the container's WORKDIR (/usr/idexd/). I then simply replaced the entry-point of the container with nodejs lib/index.js. That solved my stability issues for good.
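As an illustration of that change, here is a minimal Python sketch that assembles a docker run command line with the pm2 entry-point replaced by nodejs lib/index.js. It is not taken from the role itself. The image tag, bind mounts, ports and variable names come from the docker-compose.yml above; the container name, the restart policy, the optional CPU cap and the RPC values are assumptions chosen for the example.

```python
import os


def build_docker_run_command(image, env, mounts, ports, cpus=None):
    """Return the argv list for a `docker run` that bypasses pm2."""
    cmd = [
        "docker", "run", "--detach",
        "--name", "idexd",              # hypothetical container name
        "--restart", "unless-stopped",  # assumption: docker acts as the watchdog
        "--workdir", "/usr/idexd",
        "--entrypoint", "nodejs",       # replaces the image's pm2 entry-point
    ]
    if cpus is not None:
        cmd += ["--cpus", str(cpus)]    # optional CPU cap
    for host_port, container_port in ports:
        cmd += ["--publish", f"{host_port}:{container_port}"]
    for source, target in mounts:
        cmd += ["--mount", f"type=bind,source={source},target={target}"]
    for name, value in env.items():
        cmd += ["--env", f"{name}={value}"]
    # Everything after the image name is passed as arguments to the entry-point.
    return cmd + [image, "lib/index.js"]


home = os.path.expanduser("~")
command = build_docker_run_command(
    image="idexio/idexd:0.2.0",
    # Placeholder values; the real ones depend on your Ethereum API provider.
    env={"RPC_HOST": "rpc.example.org", "RPC_PROTOCOL": "https", "RPC_PORT": "443"},
    mounts=[
        (f"{home}/.idexd/downloads", "/usr/idexd/downloads"),
        (f"{home}/.idexd/ipc", "/usr/idexd/ipc"),
    ],
    ports=[(8080, 8080), (8443, 8443)],
)
print(" ".join(command))
```

The list could be handed to subprocess.run(command, check=True). Building it as a list rather than a single string avoids shell quoting problems in the values.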

Now that I had the most basic and simple way of launching a staking node, I went for the setup automation.

Writing the Ansible role

I have recently started to manage my "personal infrastructure and services" using Ansible. I took that decision after I had to ditch a server and it took me forever to set everything up again. So I decided that this was a great opportunity to write a role for it.

I already knew what I had to do and what I needed:

  • Set up a set of folders to mount on the container
  • Copy a given configuration file to one of those folders
  • Launch a docker container mounting the appropriate folders, setting up the correct environment variables and publishing the right port.
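The first two steps of that list amount to idempotent file operations. The sketch below shows the idea in plain Python rather than as Ansible tasks. The downloads and ipc folder names come from the docker-compose.yml above; the config folder name and the function name are hypothetical, since the folder that actually receives the settings file is not shown here.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_staking_dirs(base_dir, settings_file,
                         mount_dirs=("downloads", "ipc"),
                         config_dir_name="config"):
    """Create the folders to bind-mount and copy the settings file into one of them."""
    base = Path(base_dir)
    for name in (*mount_dirs, config_dir_name):
        # exist_ok makes a second run a no-op, like an idempotent Ansible task.
        (base / name).mkdir(parents=True, exist_ok=True)
    # Hypothetical destination: the real folder depends on how the container mounts it.
    target = base / config_dir_name / "settings.json"
    shutil.copyfile(settings_file, target)
    target.chmod(0o600)  # the file holds an encrypted key together with its token
    return target


# Usage against a scratch directory, so nothing in $HOME is touched.
with tempfile.TemporaryDirectory() as scratch:
    source = Path(scratch) / "settings.json"
    source.write_text("{}")
    print(prepare_staking_dirs(Path(scratch) / "idexd", source))
```

The third step, launching the container, then only needs the resulting paths as bind-mount sources.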

Writing the role didn't take long, as it didn't have to do anything weird.
Most things are assumed to have been set up beforehand (MySQL and access to an Ethereum API), and the role just takes variables and passes them to the container. One extra detail I added was a variable that controls CPU allocation; I added it for fear of running into high CPU usage, but it seems that was also caused by using pm2.

I published the role on Galaxy: salessandri.idex_staking_node.

Even though I had published it, there was still something bothering me. Setting up the staking node still required generating a settings.json, and the only way of doing so was to download the idex-cli, whose installation was the most annoying part of the process. On top of that, it pollutes your home directory. I said to myself: "Challenge accepted, I'm gonna simplify this".

Simplifying the settings generation

I dug into the config command provided by the idex-cli.
The source code is located in the aurad-cli/src/commands/config.js file of the repo.

Even without being a JavaScript or node.js ninja, I was able to figure out what it does, which basically boils down to:

  1. Send a GET request to https://sc.idex.market/wallet/<cold wallet address>/challenge and get the contents of the message field of the JSON response.
  2. Ask the user to sign the message from step 1 using the cold wallet account.
  3. Locally verify that the signature was actually generated by the cold wallet.
  4. Create a new Ethereum account.
  5. Send a POST request to the same URL as step 1, passing the public address of the account generated in step 4 as well as the signature received from the user in step 2.
  6. Encrypt the newly created Ethereum account with a randomly generated 16-byte token.
  7. Generate the settings file containing:
    1. The cold wallet public address.
    2. The token used to encrypt the new account.
    3. The encrypted account.
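The steps above can be sketched in Python with requests and eth_account (the account library that web3 builds on). This is a sketch under stated assumptions, not the script shipped with the role: the POST field names, the settings key names and the use of a personal-sign (EIP-191) message for the challenge are guesses and should be checked against config.js before relying on them.

```python
import json
import secrets

CHALLENGE_URL = "https://sc.idex.market/wallet/{address}/challenge"


def build_settings(cold_wallet, token, encrypted_account):
    """Step 7: assemble the settings document (key names are assumptions)."""
    return {
        "coldWallet": cold_wallet,
        "token": token,
        "hotWallet": encrypted_account,
    }


def generate_settings(cold_wallet, ask_signature=input):
    # Third-party imports are local so the helpers stay usable without them.
    import requests
    from eth_account import Account
    from eth_account.messages import encode_defunct

    url = CHALLENGE_URL.format(address=cold_wallet)

    # Step 1: fetch the challenge message.
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    message = response.json()["message"]

    # Step 2: the user signs elsewhere and pastes the signature; no private key here.
    prompt = f"Sign this message with {cold_wallet}:\n{message}\nSignature: "
    signature = ask_signature(prompt).strip()

    # Step 3: verify locally (assumes a personal-sign style signature).
    signer = Account.recover_message(encode_defunct(text=message), signature=signature)
    if signer.lower() != cold_wallet.lower():
        raise ValueError("signature was not produced by the cold wallet")

    # Step 4: create the new account.
    hot_wallet = Account.create()

    # Step 5: register it (payload field names are assumptions).
    payload = {"hotWallet": hot_wallet.address, "signature": signature}
    requests.post(url, json=payload, timeout=30).raise_for_status()

    # Step 6: encrypt the new account with a random 16-byte token (hex-encoded).
    token = secrets.token_hex(16)
    encrypted = Account.encrypt(hot_wallet.key, token)

    return build_settings(cold_wallet, token, encrypted)


def write_settings(path, settings):
    """Write the settings document as JSON."""
    with open(path, "w") as handle:
        json.dump(settings, handle, indent=2)
```

Calling write_settings("settings.json", generate_settings(cold_wallet_address)) would run the whole flow. Passing a different ask_signature callable makes it possible to feed the signature from a file instead of a prompt.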

It didn't take long to write a Python script that can easily be run in a virtual environment, thanks to the requests and web3 libraries.

I added it to the Ansible role to make it easier for people to deploy it.

It still requires manual intervention as the signing needs to be done manually; I assumed no one would want to paste their private key in someone else's code (I wouldn't).

But at least it doesn't require installing the whole idex-cli, and it is isolated.

I have to admit I haven't actually tried the generated settings in a production environment, as I already have the settings from before and don't have any other wallets that could be used for staking.

There are a couple of questions that I would love to have answered by someone with the knowledge.

  1. Does IDEX keep track of the challenge given per cold-wallet? Or can a random challenge be generated and signed rather than using step 1?
  2. What happens if step 5 is executed multiple times with different hot wallet addresses? It might be relevant based on the answer for 1.
