Self-Managed Node Checklist
This guide lists the steps a self-managed member should take to investigate issues before contacting Contour support. Following them shortens resolution times.
Verifying a service is running and responsive
Troubleshooting is the process of finding and eliminating the cause of a problem. When you have a problem with your Contour application node, troubleshooting begins as soon as you ask yourself: what happened?
A basic troubleshooting strategy:
- Try to log in to the Contour application.
- Check that the Float service is up and running (if applicable).
ps -ef | grep float
- Check that the bridge service is up and running (if applicable).
ps -ef | grep bridge
- Check that the Corda application is up and running.
ps -ef | grep corda
- Check that the web application is up and running.
ps -ef | grep contour-service
- Check the Corda shell for stuck flows. Log in to the Corda shell using the command below, where the user name is the one defined in your node.conf.
ssh -p <port> <username>@<hostname>
For example: ssh -p 2222 corda@localhost
Then execute
run stateMachinesSnapshot
This lists the flows currently in progress on your node, including any stuck flows.
If the result is an empty list ([]), all flows have completed.
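The process checks above can be combined into one pass. The following is a minimal sketch, not part of the Contour tooling: the function name missing_services is hypothetical, and the patterns are simply the ones used in the grep commands above. It reads ps -ef output on standard input and prints every pattern that has no matching process.

```shell
#!/bin/sh
# Hypothetical helper: print each pattern that does not appear in the
# process listing supplied on standard input.
# Note: like the plain grep commands, "corda" also matches command lines
# that merely contain that word (for example a corda-float path).
missing_services() {
  ps_output=$(cat)
  for pattern in "$@"; do
    if ! printf '%s\n' "$ps_output" | grep -q -- "$pattern"; then
      echo "$pattern"
    fi
  done
}

# Usage (omit float and bridge if they are not deployed):
#   ps -ef | missing_services float bridge corda contour-service
```

Because the listing is captured before it is searched, the grep process itself never shows up as a false match.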
Network Connectivity
Contour requires that your node can connect directly to the other nodes involved in a transaction, as well as to Contour's BNO node. If there is a network connectivity issue, transactions may fail. Such issues are generally caused by your own network components (e.g. firewalls, Corda float, proxy) or by your counterparty's network blocking traffic.
telnet command
Check P2P connectivity with the telnet command:
telnet <Node P2P URL> <Node P2P Port Number>
Example output:
telnet bno.p2p.app.contournetwork.io 10011
Trying 20.197.118.105...
Connected to bno.p2p.app.contournetwork.io.
Escape character is '^]'.
You can look up a peer node's P2P address and port in the Corda Network Map node info for the Production Network or the Pre-Production Network.
curl command
Check network connectivity to the peer node and to the Corda CNF services:
* Corda Notaries
* Corda Network Map
* Corda Network Doorman
* Certificate Revocation List (CRL) endpoint
Details are available at Network Connection Page.
curl check with proxy setup
curl -v -x <Proxy:Port Number> -L <Node P2P URL:Node P2P Port Number>
curl check without proxy setup
curl -v -L <Node P2P URL:Node P2P Port Number>
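Example output. The "Connected to" line confirms that the TCP connection succeeded; the final "Empty reply from server" error is expected, because the P2P port does not serve HTTP.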
curl -v -L prod-notary-sub0-uks1-01.corda.network:10002
* Rebuilt URL to: prod-notary-sub0-uks1-01.corda.network:10002/
* Trying 51.143.163.127...
* TCP_NODELAY set
* Connected to prod-notary-sub0-uks1-01.corda.network (51.143.163.127) port 10002 (#0)
> GET / HTTP/1.1
> Host: prod-notary-sub0-uks1-01.corda.network:10002
> User-Agent: curl/7.58.0
> Accept: */*
>
* Empty reply from server
* Connection #0 to host prod-notary-sub0-uks1-01.corda.network left intact
curl: (52) Empty reply from server
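When this check is scripted, the curl exit code is easier to act on than the verbose output. The sketch below is an illustration only: the function names and verdict labels are hypothetical, and it covers the no-proxy case (with -x, a successful exit only shows that the proxy answered). The exit codes are curl's documented ones.

```shell
#!/bin/sh
# Hypothetical helper: map a curl exit code from a P2P port probe to a verdict.
classify_curl_exit() {
  case "$1" in
    0|52|56) echo "tcp-connected" ;;          # 52: empty reply, 56: receive failure after connecting
    5|6)     echo "name-resolution-failed" ;; # 5: proxy name, 6: host name
    7)       echo "connect-failed" ;;         # refused or blocked on the way
    28)      echo "timed-out" ;;
    *)       echo "unclassified" ;;
  esac
}

# Probe <host>:<port> without a proxy (10 second limit) and print the verdict.
probe_p2p() {
  curl -s -o /dev/null -m 10 -L "$1"
  classify_curl_exit "$?"
}

# Usage:
#   probe_p2p prod-notary-sub0-uks1-01.corda.network:10002
```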
Execute a Ping Test
Contour has developed a ping tool that lets node operators check the connectivity between two nodes. Executing this test is perfectly safe: it does not create any business transactions or modify any data. Execute this check against Contour's BNO node first, before executing it against another node.
To run the Ping Test, follow these steps:
- Log in to the Corda shell using the command below.
ssh -p <port> <username>@<hostname>
For example: ssh -p 2222 corda@localhost
- Then run the flow start command shown after the prompt in the example output below. It pings the Contour BNO.
If the network connections are configured properly, the Ping flow completes successfully with output like the following.
Mon Dec 14 03:34:07 UTC 2020>>> flow start network.contour.app.flow.SinglePingFlow$Initiator target: "OU=Contour BNO, O=Contour Pte. Ltd., L=Singapore, C=SG"
✓ Starting
✓ Start the ping.
✓ Receive the ping.
▶︎ Done
Flow completed with result: OU=Contour BNO, O=Contour Pte. Ltd., L=Singapore, C=SG.name has received ping at 2020-12-14T03:36:10.853Z
ping target OU=Contour BNO, O=Contour Pte. Ltd., L=Singapore, C=SG time elapsed = 97 ms
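If you record the ping result over time, the elapsed time on the last line of the output is the value worth keeping. A minimal sketch, assuming the output has been saved as plain text in the format shown above; the function name ping_elapsed_ms and the file name in the usage line are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract the elapsed milliseconds from saved
# SinglePingFlow output read on standard input.
ping_elapsed_ms() {
  sed -n 's/.*time elapsed = \([0-9][0-9]*\) ms.*/\1/p'
}

# Usage:
#   ping_elapsed_ms < ping-output.txt
```

The function prints nothing when the line is missing, which a caller can treat as a failed ping.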
Basic checks for self-managed node operators if the bridge is blocking the connection to the BNO:
- Can you run a telnet or nc test from the node to the BNO node?
- Can you confirm whether there is an HTTPS proxy behind the bridge that sends traffic out?
- Can you review, or send us, the logs from your firm's firewall? You may see a TCP connection to the BNO being established ("build TCP connection") and then immediately torn down ("Tear down").
- Can you confirm that you started the node components in the following order?
- Start the float first.
- Start the node next and wait for it to come up completely.
- Start the Contour API.
- Start the bridge last.
- Please confirm the patch versions of Corda and the Contour API.
- If you are using an external Artemis broker for the bridge, can you retrieve its logs?
If you see an error like the following:
WARN: AMQ212054: Destination address=examp is blocked. If the system is configured to block make sure you consume messages on this configuration.
There are multiple reasons that the broker might block messages from being sent to an address:
1. The <address-full-policy> is BLOCK and the address has reached the configured <max-size-bytes>.
2. The <address-full-policy> is BLOCK and the <global-max-size> for all addresses has been reached.
3. The <max-disk-usage> has been reached.
- Please check the disk utilization on your primary bridge component server / ActiveMQ.
- Can you check whether a thread dump was created on the primary bridge server? (It may have crashed.) Please also check the disk space usage on the primary bridge and the HA bridge.
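The disk utilization check can be scripted with df. This is a rough sketch under stated assumptions: the function names are hypothetical, and the path and the threshold of 90 in the usage line are illustrative values; compare against the <max-disk-usage> actually configured on your broker.

```shell
#!/bin/sh
# Hypothetical helper: print the used-capacity percentage (without the % sign)
# of the filesystem that holds the given path.
disk_usage_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed (exit 0) when usage of <path> is at or above <threshold> percent.
# Returns 2 if the usage could not be determined.
disk_over_threshold() {
  pct=$(disk_usage_pct "$1")
  [ -n "$pct" ] || return 2
  [ "$pct" -ge "$2" ]
}

# Usage (path and threshold are examples only):
#   disk_over_threshold /opt/corda-bridge 90 && echo "disk usage is high"
```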
Health Survey Tool
The Health Survey Tool is a command line utility that can be used to collect information about a node, which can be used by the R3 support team as an aid to diagnose support issues. It works by scanning a provided node base directory and archiving some of the important files. Furthermore, it does a deployment status check by connecting to the node and probing it and the firewall (if deployed externally) for information on configuration, service status, connection map and more.
Example: using the Health Survey Tool to check node health
java -jar corda-tools-health-survey-4.8.6.jar
Corda Health Survey Tool 4.8.6
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
✔ Reporting to file /opt/corda/report-20220323-055921.zip
✔ Node is not configured to use external bridge, firewall validation will be skipped
✔ Collected machine information
✔ Collected information about Corda installation
✔ Collected network parameters
✔ Collected node information file
✔ Collected additional node information files
✔ Collected CorDapp information
✔ Collected censored node configuration
✔ Collected driver information
• Identity Manager status endpoint https://doorman.uat.corda.network/3FCF6CEB-20BD-4B4F-9C72-1EFE7689D85B/status returned response
• Identity Manager status endpoint https://doorman.uat.corda.network/3FCF6CEB-20BD-4B4F-9C72-1EFE7689D85B/status returned response
• Identity Manager status endpoint https://doorman.uat.corda.network/3FCF6CEB-20BD-4B4F-9C72-1EFE7689D85B/status returned response
• Network Map status endpoint https://uat-sub1-netmap-01.uat.corda.network/SUB1CEP8-32UX-6ZXK-9C82-1FLR6268D75Z/status returned res
• Network Map status endpoint https://uat-sub1-netmap-01.uat.corda.network/SUB1CEP8-32UX-6ZXK-9C82-1FLR6268D75Z/status returned res
• Network Map status endpoint https://uat-sub1-netmap-01.uat.corda.network/SUB1CEP8-32UX-6ZXK-9C82-1FLR6268D75Z/status returned res
✔ Collected general network information
✔ Collected log files
✘ Configuration files contain plain text passwords
✔ Verified RPC communication
✔ Connected to Artemis Broker
✔ Initialised tool serialization context
✔ Node network settings are valid
✔ Echo test complete, message count is 1
✔ Runtime info collected
✔ Exported report to /opt/corda/report-20220323-055921.zip
Example: using the Health Survey Tool to check the notary connection
java -jar corda-tools-health-survey-4.8.6.jar -n
Corda Health Survey Tool 4.8.6
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
✔ Reporting to file /opt/corda/report-20220323-055957.zip
✔ Node is not configured to use external bridge, firewall validation will be skipped
✔ Getting notary list
✔ Getting network map
✔ Connected to Artemis Broker
✔ Verified connection to CN=uat-sub1-notary-01, O=R3 HoldCo LLC, L=New York, C=US
✔ Verified connection to CN=uat-sub1-notary-05, O=R3 HoldCo LLC, L=New York, C=US
✔ Verified connection to CN=uat-sub1-notary-03, O=R3 HoldCo LLC, L=New York, C=US
✔ Exported report to /opt/corda/report-20220323-055957.zip
A report has been generated and written to disk.
Path of report: /opt/corda/report-20220323-055957.zip
Size of report: 2.3 KiB
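In the first sample run above, the one failed check is marked with ✘. When the console output has been saved as plain text, the failed checks can be pulled out on their own. A minimal sketch, assuming the markers appear exactly as in the sample; the function name failed_checks and the file name in the usage line are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print only the checks marked as failed (✘) in saved
# Health Survey Tool output read on standard input, without the marker.
failed_checks() {
  grep -F '✘' | sed 's/^[[:space:]]*✘[[:space:]]*//'
}

# Usage:
#   failed_checks < health-survey-output.txt
```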
Note: If you plan to use the Health Survey Tool, comment out useSsl = true in node.conf before starting the Corda service. Remember to uncomment it and restart the service once the checks are complete.
Please refer to Corda Health Survey for more details.
- To download the jar file, send a support request via the Contour support portal.
Self-managed node daily health check
Recommendations for the daily health check of a self-managed node:
* Monitor the Corda node for stuck flows (steps are provided under "Verifying a service is running and responsive").
* Run the Ping Flow test against the BNO (steps are provided under the Network Connectivity section).
Self-managed node monitoring
Processes recommended for monitoring:
* Corda node
* Corda Firewall (Float, Bridge)
* Artemis, if applicable
* Contour API service
Application alert keywords
* Fatal - This is more appropriate at the moment
* been waiting for - Stuck Flow alert keyword, for example: statemachine.FlowMonitor. - Flow with id bbced692-f5d9-46e2-ba2a-eb4ea7d2d757 has been waiting for 250989 seconds to receive messages from parties [OU=Trade Finance, O=Global Trade Network, L=Singapore, C=SG].
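The two keywords can be checked against a log file with grep. The sketch below is one possible approach, not a Contour tool: the function name scan_alerts is hypothetical, and matching is case-insensitive on the assumption that the log level may be written as FATAL.

```shell
#!/bin/sh
# Hypothetical helper: print numbered log lines that contain either alert
# keyword. Matching is case-insensitive.
scan_alerts() {
  grep -n -i -e 'fatal' -e 'been waiting for' -- "$1"
}

# Usage (pass the path of your node log file):
#   scan_alerts <path to node log>
```

grep exits with status 1 when nothing matches, so the function can also be used directly in an if condition.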
Raising a Ticket
If the above steps have not identified the issue, please raise a support ticket. Always include:
- Screenshots that highlight the error
- Logs. Please include the following logs:
- Corda logs, from the path /opt/corda/logs/node-${hostName}.logs
- Contour Web API logs, from the path /opt/api-service/logs/node-${hostName}.logs
- Float logs, from the path /opt/corda-float/logs/corda-firewall-${hostName}.logs
- Bridge logs, from the path /opt/corda-bridge/logs/corda-firewall-${hostName}.logs
- Provide the list of stuck flows using the commands below.
ssh -p <port> <username>@<hostname>
For example: ssh -p 2222 corda@localhost
Then execute
run stateMachinesSnapshot
Save the output in a text file and send it to Contour support.
- If the issue is related to connectivity, please engage your infrastructure/network team to validate the infrastructure rules and configurations first, and attach the firewall logs (not the Corda Firewall logs) to the ticket as well.
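Gathering the log files into a single archive makes them easier to attach to the ticket. This is a minimal sketch, not an official collection script: the function name collect_logs and the archive name are hypothetical, and the file paths in the usage line must be replaced with the log locations on your own servers.

```shell
#!/bin/sh
# Hypothetical helper: bundle the given log files into a gzip-compressed tar
# archive, skipping any file that does not exist.
collect_logs() {
  archive=$1
  shift
  # Rotate through the arguments, keeping only the files that exist.
  for f in "$@"; do
    shift
    if [ -f "$f" ]; then
      set -- "$@" "$f"
    else
      echo "skipping missing file: $f" >&2
    fi
  done
  if [ "$#" -eq 0 ]; then
    echo "no log files found" >&2
    return 1
  fi
  tar -czf "$archive" "$@"
}

# Usage (list the Corda, Contour Web API, Float and Bridge log files):
#   collect_logs support-logs.tar.gz <corda log> <api log> <float log> <bridge log>
```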