Related to #1920, of which the quick solution is to remove the gossip_store file.
This is a follow up to see if I could find the cause, I could reproduce #1920 (hanging ping command) on my node connected on to another node on my LAN, by when putting back the corrupt gossip_file. I tried to do some gdb debugging, but having a hard time to find the relevant code-path and which daemon actually handles ping commands (openingd?) Also I am not so skilled with gdb.
Maybe someone can point me in the right direction?
So instead below is log-io output during the hanging ping command, using kill -10 pid and subsequently kill -6 pid for the trace.
Oddly, when I attached gdb to openingd, a few times the ping command did NOT hang on the first attempt after restart (because it waited for gdb's continue?)
./lightning-cli --lightning-dir=/home/simon-debian/.lightning_mainnet/ listpeers
{
"peers": [
{
"id": "0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f",
"connected": true,
"netaddr": [
"192.168.1.166:41322"
],
"global_features": "",
"local_features": "8a",
"channels": [
]
}
]
}
simon-debian@debian:~/git/SimonVrouwe/lightning/cli$ ./lightning-cli --lightning-dir=/home/simon-debian/.lightning_mainnet/ ping 0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f
^C
2018-09-11T12:56:49.448Z lightningd(12650):jcon fd 18:[IN] 7b20226d6574686f6422203a20226c6973747065657273222c2022696422203a20226c696768746e696e672d636c692d3132373432222c2022706172616d7322
2018-09-11T12:56:49.448Z lightningd(12650):jcon fd 18:[IN] 203a205b205d207d
2018-09-11T12:56:49.448Z lightningd(12650):jcon fd 18:[OUT] 7b20226a736f6e727063223a2022322e30222c2022726573756c7422203a200a7b0a2020227065657273223a205b0a202020207b0a202020202020226964223a2022303231356633613735386435646331313039313862343664376661313762373039323430386238613437623338373930626361643032666464373062323236643566222c200a20202020202022636f6e6e6563746564223a20747275652c200a202020202020226e657461646472223a205b0a2020202020202020223139322e3136382e312e3136363a3431333232220a2020202020205d2c200a20202020202022676c6f62616c5f6665617475726573223a2022222c200a202020202020226c6f63616c5f6665617475726573223a20223861222c200a202020202020226368616e6e656c73223a205b0a2020202020205d0a202020207d0a20205d0a7d2c2022696422203a20226c696768746e696e672d636c692d313237343222207d0a
2018-09-11T12:56:57.745Z lightningd(12650):jcon fd 18:[IN] 7b20226d6574686f6422203a202270696e67222c2022696422203a20226c696768746e696e672d636c692d3132373433222c2022706172616d7322203a205b20
2018-09-11T12:56:57.745Z lightningd(12650):jcon fd 18:[IN] 22303231356633613735386435646331313039313862343664376661313762373039323430386238613437623338373930626361643032666464373062323236
2018-09-11T12:56:57.745Z lightningd(12650):jcon fd 18:[IN] 643566225d207d
2018-09-11T12:56:57.746Z lightning_gossipd(12661): sending ping expecting response
2018-09-11T12:57:10.743Z lightningd(12650): ... feerate estimate for slow hit floor 253
2018-09-11T12:57:40.798Z lightningd(12650): ... feerate estimate for slow hit floor 253
2018-09-11T12:57:51.282Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: SIGUSR1: IO LOGGING ENABLED
2018-09-11T12:57:51.283Z lightningd(12650):lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1:[OUT] 01024943cb8a409fa6aa2c54de753cc72d6f850daaa5df071d3b7bcba63a6077a7511cef2d06b299e2573f973766333da939c71eb6d339f9a47ca4bf551154c6f2f16fe28c0ab6f1b372c1a6a246ae63f74f931e8365e15a089c68d6190000000000083ee80003b000005b94c332000000900000000000000001000003e800000064
2018-09-11T12:57:51.283Z lightningd(12650):lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1:[OUT] 01022b4d2fede5f0d810d72d59e500d3fd760e1c0d3fc9692d3e5c61f5344c92ff8901a36bdf68cfe6ef0f5873fe95065f8be8f9b3f56faadbe61f30933c3039d6406fe28c0ab6f1b372c1a6a246ae63f74f931e8365e15a089c68d6190000000000083f4200014300015b94c4130001009000000000000003e8000003e800000064
2018-09-11T12:57:51.283Z lightningd(12650):lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1:[OUT] 01026801c012ceb128ab8def32384797bb90d58c0d521f93d0cef1833f712c8c9f743d901d0d0633a3f0dcf61b95b84d5d06570dacf822ffab4771c0baf0b56249066fe28c0ab6f1b372c1a6a246ae63f74f931e8365e15a089c68d6190000000000083ee80003b000005b94c33b0001009000000000000000010000006400000001
2018-09-11T12:57:51.283Z lightningd(12650):lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1:[OUT] 0102e09c308928d488c67395357ba5fe55d9efeca377446fba3a6dcd2bb8485b32ef637fcb3fd761a322eeea98a1a42b4a5964a79e4dcd0eb2656c35338c1559d3906fe28c0ab6f1b372c1a6a246ae63f74f931e8365e15a089c68d6190000000000083f4300038000015b94c338000000900000000000000000000003e800000064
2018-09-11T12:57:51.283Z lightningd(12650):lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1:[OUT] 010160d960390c78de22410479f56ccfece617e9e982cd679b4344b4aa195aa68f5921c28bae2439895a749fb417363643592ee081b690a92fc771cee1e405e0599500005b94163002dc0d61ab682019d46a56e705b3fa31a9faa8bb7fe090afaf8622eecd228c779149daaa4d6f746f736869000000000000000000000000000000000000000000000000000000
2018-09-11T12:57:51.283Z lightningd(12650):lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1:[OUT] 0100305c5b70a598e5eb3ba862c9f249fb37837e56b51245a70a20a7d17888afa55c427aad0cd72ef0986414aaaaeb7eb490c36a221d2985073d2fffb93d660a3af35d11fd4a0e68d8a22541c55c694ff483bd2848981ff7a0dcf1441d3ffefdef7a5e1673723c539dc1cbdee5e33eb42e00122e39aaaa7c325669209f0c64784b0aa6af964731fcb673c8184039070badc663306c3ef8662446c93ff753ae4f9653469e77b9988463c6761cc7ef7a710b775eed5b14eca87aa077ae1f530ef87cf9e173ebf7492026998ea16e48c461cf72499fae3a5c742e54078ba145f86ce8065b3298bac7ca516002806057c8cbb324460c9cad3b605118128cba5cc528e98600006fe28c0ab6f1b372c1a6a246ae63f74f931e8365e15a089c68d6190000000000083d6800035f0001026cb1948c30e99cccdcff155c7fafcd40fe1d31c0e867c4a3fdeae2cfe9c0eb6503ee180e8ee07f1f9c9987d98b5d5decf6bad7d058bdd8be3ad97c8e0dd2cdc7ba0394f8248e2137b78c010d25c22337848f9e07082d3a213d92c5e0b67c438c6c5902c118c0a6be106a004d54b45c0b3605e379cfe13726e5798fcfe47c7f3f48dd25
2018-09-11T12:58:10.854Z lightningd(12650): ... feerate estimate for slow hit floor 253
lightning_openingd: Fatal signal 6 (version v0.6.1rc1-47-gb52fb14)
0x56441495f2eb crashdump
common/daemon.c:37
0x7f8764770fbf ???
???:0
0x7f87648242a4 ???
???:0
0x56441498d5cf write_all
ccan/ccan/read_write_all/read_write_all.c:11
0x56441495e127 sync_crypto_write
common/crypto_sync.c:44
0x564414965666 handle_gossip_msg
common/read_peer_msg.c:95
0x56441495c226 handle_gossip_in
openingd/openingd.c:1020
0x56441495c7e6 main
openingd/openingd.c:1158
0x7f876475db16 ???
???:0
0x564414959a79 ???
???:0
0xffffffffffffffff ???
???:0
2018-09-11T12:58:20.264Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: SIGUSR1: IO LOGGING DISABLED
2018-09-11T12:58:20.264Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: FATAL SIGNAL 6 (version v0.6.1rc1-47-gb52fb14)
2018-09-11T12:58:20.265Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: backtrace: common/daemon.c:42 (crashdump) 0x56441495f33f
2018-09-11T12:58:20.265Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: backtrace: (null):0 ((null)) 0x7f8764770fbf
2018-09-11T12:58:20.265Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: backtrace: (null):0 ((null)) 0x7f87648242a4
2018-09-11T12:58:20.265Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: backtrace: ccan/ccan/read_write_all/read_write_all.c:11 (write_all) 0x56441498d5cf
2018-09-11T12:58:20.265Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: backtrace: common/crypto_sync.c:44 (sync_crypto_write) 0x56441495e127
2018-09-11T12:58:20.265Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: backtrace: common/read_peer_msg.c:95 (handle_gossip_msg) 0x564414965666
2018-09-11T12:58:20.265Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: backtrace: openingd/openingd.c:1020 (handle_gossip_in) 0x56441495c226
2018-09-11T12:58:20.265Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: backtrace: openingd/openingd.c:1158 (main) 0x56441495c7e6
2018-09-11T12:58:20.265Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: backtrace: (null):0 ((null)) 0x7f876475db16
2018-09-11T12:58:20.265Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: backtrace: (null):0 ((null)) 0x564414959a79
2018-09-11T12:58:20.266Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: backtrace: (null):0 ((null)) 0xffffffffffffffff
2018-09-11T12:58:20.266Z lightningd(12650): lightning_openingd-0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: STATUS_FAIL_INTERNAL_ERROR: FATAL SIGNAL
2018-09-11T12:58:20.266Z lightningd(12650): 0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f chan #1: Owning subdaemon lightning_openingd died (61952)
2018-09-11T12:58:20.266Z lightning_connectd(12660): Forgetting peer 0215f3a758d5dc110918b46d7fa17b7092408b8a47b38790bcad02fdd70b226d5f
2018-09-11T12:58:25.869Z lightningd(12650):jcon fd 18: Abandoning command
2018-09-11T12:58:27.855Z lightningd(12650):jcon fd 18:[IN] 7b20226d6574686f6422203a202273746f70222c2022696422203a20226c696768746e696e672d636c692d3132373630222c2022706172616d7322203a205b20
2018-09-11T12:58:27.855Z lightningd(12650):jcon fd 18:[IN] 5d207d
2018-09-11T12:58:27.855Z lightningd(12650):jcon fd 18:[OUT] 7b20226a736f6e727063223a2022322e30222c2022726573756c7422203a200a225368757474696e6720646f776e222c2022696422203a20226c696768746e696e672d636c692d313237363022207d0a
2018-09-11T12:58:27.855Z lightningd(12650):jcon fd 18: JSON-RPC shutdown
2018-09-11T12:58:27.855Z lightning_connectd(12660): Shutting down
2018-09-11T12:58:28.856Z lightning_gossipd(12661): Shutting down
2018-09-11T12:58:29.856Z lightning_hsmd(12659): Shutting down
Related to #1920, of which the quick solution is to remove the
gossip_storefile.This is a follow up to see if I could find the cause, I could reproduce #1920 (hanging
pingcommand) on my node connected on to another node on my LAN, by when putting back the corruptgossip_file. I tried to do somegdbdebugging, but having a hard time to find the relevant code-path and which daemon actually handlespingcommands (openingd?) Also I am not so skilled with gdb.Maybe someone can point me in the right direction?
So instead below is
log-iooutput during the hangingpingcommand, usingkill -10 pidand subsequentlykill -6 pidfor the trace.Oddly, when I attached gdb to
openingd, a few times thepingcommand did NOT hang on the first attempt after restart (because it waited for gdb's continue?)