Thoughts on Security Best Practices for Cosmos & IRISnet
From HashQuark and SlowMist
We have been working together on a solution for securing Cosmos and IRISnet validator nodes and deploying them smoothly. These are the main security issues we are trying to address:
DDoS attacks on the Validator node
Intrusion attacks on the Validator node
Software or hardware shutdown of the Validator node
Leakage of the Validator node's private key
The Recommended Architecture
Recommendations to Address DDoS Attacks
Deploy the Validator node on the internal network, and use the Sentry nodes to synchronize blocks and verify transactions. Deploy an identical Validator node as a hot standby, to be activated if the master Validator node fails. In addition, run the Sentry node over a VPN so that block generation is not interrupted if a Sentry node is attacked or compromised.
Set up a node log monitoring platform that collects, analyzes and visualizes the logs.
Track the key parameters of every node server, including CPU load, disk I/O, network I/O and the number of processes.
Track the following consensus metrics:
- Consensus height
- Consensus validators power
- Consensus rounds
- Consensus missing validators
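The four consensus metrics above can be read from a node's Prometheus output. The following is a minimal sketch, assuming the node exposes Tendermint's Prometheus metrics under the default "tendermint" namespace. The sample text, the chain_id label value and the alert thresholds are illustrative assumptions, not values taken from a real deployment.

```python
# Metric names assume Tendermint's default "tendermint" namespace.
WATCHED = (
    "tendermint_consensus_height",
    "tendermint_consensus_validators_power",
    "tendermint_consensus_rounds",
    "tendermint_consensus_missing_validators",
)

# Illustrative sample of Prometheus text-format output (made-up values).
SAMPLE = """# HELP tendermint_consensus_height Height of the chain.
# TYPE tendermint_consensus_height gauge
tendermint_consensus_height{chain_id="cosmoshub-2"} 12345
tendermint_consensus_missing_validators{chain_id="cosmoshub-2"} 3
tendermint_consensus_rounds{chain_id="cosmoshub-2"} 0
tendermint_consensus_validators_power{chain_id="cosmoshub-2"} 1000
"""


def parse_consensus_metrics(text):
    """Return {metric_name: float} for the watched metrics found in text."""
    found = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            # Drop the {label="..."} part; the value follows the closing brace.
            name = line[: line.index("{")]
            rest = line[line.rindex("}") + 1:].split()
        else:
            name, *rest = line.split()
        if name in WATCHED and rest:
            found[name] = float(rest[0])
    return found


def consensus_alerts(metrics, max_missing, max_rounds):
    """Return warnings when the hypothetical thresholds are exceeded."""
    warnings = []
    if metrics.get("tendermint_consensus_missing_validators", 0) > max_missing:
        warnings.append("too many validators missing")
    if metrics.get("tendermint_consensus_rounds", 0) > max_rounds:
        warnings.append("consensus needs too many rounds")
    return warnings


if __name__ == "__main__":
    metrics = parse_consensus_metrics(SAMPLE)
    print(metrics)
    print(consensus_alerts(metrics, max_missing=2, max_rounds=1))
```

In practice the text would come from the node's metrics endpoint and the warnings would be forwarded to whatever alerting channel the monitoring platform uses.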
Good Practices to Protect Nodes against Intrusions
Run a single service per host and start only the node-related processes on it. Avoid using the host for multiple purposes.
Prevent the private Sentry from being scanned or located across the network. Change the synchronization port 26656 (and likewise the RPC port) to one of the ports most commonly open on the network, such as 80, 443 or 22. This makes it much more costly for attackers to locate the private Sentry.
Disable all non-essential service ports, open only the required ones, and enforce strict security rules on AWS or Google Cloud. (This applies to both the cloud provider's firewall and the operating system firewall.)
Change the default SSH port 22. Configure SSH to allow key-based login only, disable password login, and make sure the SSH port is reachable only from our operations and maintenance IP addresses.
If the budget allows, we also recommend deploying a good HIDS (the open-source OSSEC is one reference) so that server intrusions can be responded to in a timely manner.
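The SSH rules above (non-default port, key-based login only, no password login) can be checked automatically. The following is a rough sketch that audits the text of an sshd_config file. It is a simplification under stated assumptions: it stops at the first Match block, does not handle the "keyword=value" form, and does not check the IP restriction, which belongs to the firewall rules.

```python
def audit_sshd_config(text):
    """Return a list of findings for the SSH hardening rules.

    Simplified reading of sshd_config: for most keywords the first value
    wins, while Port may be repeated and every listed port is used.
    """
    ports = []
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip()
        if key == "match":
            break  # conditional blocks are out of scope for this sketch
        if key == "port":
            ports.append(value)
        else:
            settings.setdefault(key, value)

    findings = []
    # With no Port line, sshd falls back to its default port 22.
    if not ports or "22" in ports:
        findings.append("SSH still listens on the default port 22")
    # PasswordAuthentication defaults to "yes" when it is not set.
    if settings.get("passwordauthentication", "yes").lower() != "no":
        findings.append("password login is not disabled")
    if settings.get("pubkeyauthentication", "yes").lower() != "yes":
        findings.append("public-key login is disabled")
    return findings


if __name__ == "__main__":
    hardened = "Port 2222\nPasswordAuthentication no\nPubkeyAuthentication yes\n"
    print(audit_sshd_config(hardened))  # no findings expected
    print(audit_sshd_config("Port 22\n"))
```

The port number 2222 is only an example value; any non-default port that fits the firewall rules would do.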
How to Avoid Unexpected Software/Hardware Shutdowns
An off-site server room is required for remote disaster recovery.
The server room needs a redundant standby power supply.
Key hardware that is prone to wear and failure needs standby redundancy.
Safeguard of Private Keys
The security of the private key is of the greatest importance, so we advise that the private key be held and used by a hardware device.
1. Protection of the Validator Key
The Validator Key is used for consensus signing, and any leakage may lead to double signing. We recommend KMS, which is currently in the Alpha phase. It is developed by the Tendermint team and has the following advantages:
- It offers highly efficient and trustworthy access to the Validator Key
- It protects against double signing
- It keeps the private key secure even if the Validator node has been hacked
2. Protection of the Account Key
The Account Key is also of great significance. It is used to:
- Vote on governance proposals
- Submit new governance proposals
- Perform all asset-related operations
We advise using the Ledger Nano S, which is officially recommended by Cosmos, to protect the Account Key.
Switching from the master Validator node to the backup Validator node still relies on human judgment, and a misjudgment may lead to double signing.
The Ledger Nano S does not support auto-bonding or automated proposal management.
When only a few Sentry nodes are connected, they are prone to hijacking, which can prevent the Validator node from taking part in consensus.
Instead of auto-bonding, delegation can be used, with the reward address assigned to the cold wallet.
Validator node:
max_num_outbound_peers = number of configured Sentry nodes
persistent_peers = Sentry node IP addresses
pex = false (About PEX mode)
Turn off the RPC port
Sentry node:
Turn off the RPC port
persistent_peers = Validator node IP address
A private Sentry is needed so that the Validator can still generate blocks even if heavy traffic on the public Sentry nodes causes a denial of service.
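The Validator-side peer settings above can be verified before the node is started. The following is a small sketch, assuming the values live in the [p2p] table of the node's config.toml and have already been loaded into a dictionary. The node IDs and addresses in the example are hypothetical placeholders; peers are written in the "nodeID@host:port" form.

```python
def check_validator_p2p(p2p, sentry_addresses):
    """Compare a Validator's [p2p] settings with the rules listed above.

    p2p: dict holding the [p2p] table of config.toml.
    sentry_addresses: expected peers in "nodeID@host:port" form.
    """
    problems = []
    peers = [p.strip() for p in p2p.get("persistent_peers", "").split(",") if p.strip()]
    if sorted(peers) != sorted(sentry_addresses):
        problems.append("persistent_peers should list exactly the Sentry nodes")
    if p2p.get("max_num_outbound_peers") != len(sentry_addresses):
        problems.append("max_num_outbound_peers should equal the number of Sentry nodes")
    # Peer exchange is treated as enabled unless it is explicitly set to false.
    if p2p.get("pex", True) is not False:
        problems.append("pex should be false on the Validator node")
    return problems


if __name__ == "__main__":
    # Hypothetical Sentry peers; real node IDs are long hexadecimal strings.
    sentries = ["id1@10.0.0.11:26656", "id2@10.0.0.12:26656"]
    p2p = {
        "persistent_peers": ",".join(sentries),
        "max_num_outbound_peers": 2,
        "pex": False,
    }
    print(check_validator_p2p(p2p, sentries))  # no problems expected
```

On Python 3.11 or later, the dictionary could be obtained by parsing config.toml with the standard tomllib module and taking its "p2p" entry.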
3. Run gaiad as a non-root user
Once compilation is complete, create an ordinary user account and use it to start the gaiad process. Running as a non-root user reduces the damage a compromised process can do.
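One way to enforce the non-root rule is a guard in the launcher script that refuses to continue when it runs as root. The following is a minimal sketch for POSIX systems; the function name and the error message are illustrative, not part of gaiad.

```python
import os
import sys


def ensure_not_root(euid=None):
    """Exit with an error if the effective user id is 0 (root).

    euid can be passed in for testing; by default it is read from the OS.
    """
    if euid is None:
        euid = os.geteuid()  # available on POSIX systems only
    if euid == 0:
        sys.exit("refusing to start the node as root; use a dedicated account")
    return euid


if __name__ == "__main__":
    ensure_not_root()
    print("running as an unprivileged user, safe to start the node")
```

A launcher would call ensure_not_root() first and only then start the gaiad process.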
4. Deploy scripts for block monitoring
These scripts send alerts and automatically switch to the standby server once the Validator node is found to have stopped generating blocks.
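The core of such a monitoring script is deciding when block production has stopped. The following is a minimal sketch of that decision only, assuming the script samples the node's block height at regular intervals. The class name and the 30-second threshold are illustrative assumptions. The alerting and switchover actions are left out on purpose, since a wrong switch to the standby can lead to double signing.

```python
class StallDetector:
    """Flags a node whose block height has not advanced for too long."""

    def __init__(self, max_stall_seconds):
        self.max_stall_seconds = max_stall_seconds
        self.last_height = None
        self.last_change = None

    def observe(self, height, now):
        """Record a height sample taken at time `now` (in seconds).

        Returns True when the height has been stuck longer than allowed.
        """
        if self.last_height is None or height > self.last_height:
            # First sample, or the chain advanced: reset the stall timer.
            self.last_height = height
            self.last_change = now
            return False
        return (now - self.last_change) > self.max_stall_seconds


if __name__ == "__main__":
    detector = StallDetector(max_stall_seconds=30)
    # (height, timestamp) pairs; the height stops advancing at 100.
    for height, now in [(100, 0), (100, 20), (100, 31), (101, 40)]:
        print(height, now, detector.observe(height, now))
```

A True result would trigger the alert, and the switchover would only follow once the master Validator node is confirmed to be down.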