Comment by cipres
Re: "@flipperzero The hashnix.club misfin server is down, looks…"
@flipperzero Still can't connect to hashnix's misfin, port 1958 (ipv6 or ipv4). Port 1965 works fine as always.
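A quick way to repeat this check from another machine is to try a plain TCP connection to each port over both address families. The sketch below is only an illustration: `check_port` is a hypothetical helper, and it tests TCP reachability only (no TLS handshake, no misfin or gemini request).

```python
import socket


def check_port(host, port, timeout=5.0):
    """Try a TCP connection to host:port over each address family it resolves to.

    Returns a dict mapping "ipv4" / "ipv6" to "open" or "failed: <reason>".
    """
    results = {}
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return {"resolve": f"failed: {exc}"}
    for family, socktype, proto, _canonname, sockaddr in infos:
        label = "ipv6" if family == socket.AF_INET6 else "ipv4"
        if label in results:
            continue  # only test the first address of each family
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            results[label] = "open"
        except OSError as exc:  # refused, timed out, unreachable, ...
            results[label] = f"failed: {exc}"
    return results
```

Calling `check_port("hashnix.club", 1958)` and `check_port("hashnix.club", 1965)` would then show which address family, if any, accepts connections on each port.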
Apr 05 · 5 weeks ago
25 Later Comments ↓
cipres [OP] · Apr 10 at 11:31:
Up and running now
flipperzero · Apr 10 at 23:35:
@cipres can I send you a series of error logs to see if you can pinpoint anything that's making the server crash? I've been noticing dropped service too, but when I check, my daemon's status indicates that everything is seemingly operational, and it's been confusing me this past week.
cipres [OP] · Apr 11 at 11:15:
@flipperzero Yes, please send me the error logs (via misfin or mail), thank you. Are you seeing any "Blocking peer ..." messages? If you have a gitlab account you can also create an issue and paste the logs there:
→ https://gitlab.com/cipres/misfin/-/work_items
I'm thinking that maybe all of this could be related to the use of the LMDB database (for message stats), which was committed on March 10. I will disable it by default and push an update.
Thanks a lot @flipperzero
flipperzero · Apr 12 at 09:26:
@cipres I went ahead and installed your latest update, but in the interest of best practice, I have also sent some excerpts of the logs to your galacteek email as well as your hashnix misfin. Thank you again, cipres.
cipres [OP] · Apr 12 at 14:27:
@flipperzero I've got the logs, cheers, this is very helpful.
Are you using systemd? I'm trying to understand if the "Signal 15 (SIGTERM) received" logs happen because you ran systemctl restart on the misfin service; that would make sense.
flipperzero · Apr 12 at 16:50:
@cipres right right, those are instances in which I noticed that, even though the daemon indicated it was operating, the gemini frontend was not showing a connection, hence the restarts via systemctl.
roughnecks · Apr 12 at 17:23:
I believe it happened to me too. systemd unit for me as well.
cipres [OP] · Apr 12 at 18:49:
@flipperzero Got it, thank you. You have a pretty long experience running a public misfin server now !
I fixed one of the things that appears in the logs. I think i'm gonna add some stress tests to pinpoint what makes the server fail under strain, maybe it's just too many threads and it dies.
I'll keep adding emojis everywhere in the code until i figure this one out.
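A first step toward the stress test mentioned above could be a small script that fires many connection attempts in parallel and tallies the outcomes. This is only a sketch under stated assumptions: `try_connect` and `stress` are hypothetical names, the defaults are arbitrary, and it only exercises TCP connection handling, not TLS or the misfin protocol itself.

```python
import socket
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def try_connect(host, port, timeout=5.0):
    """Open one TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        return "error"


def stress(host, port, total=200, concurrency=50, timeout=5.0):
    """Make `total` connection attempts, at most `concurrency` at a time.

    Returns a Counter of outcomes, e.g. how many connected or timed out.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        outcomes = pool.map(
            lambda _: try_connect(host, port, timeout), range(total)
        )
        return Counter(outcomes)
```

A rising share of "timeout" or "refused" results as `total` and `concurrency` grow would hint at the point where the server stops keeping up. It should only be pointed at a server you run yourself.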
flipperzero · Apr 29 at 20:45:
@cipres glad to be right there along with ya in the honor of your presence, cipres!
Now, that said, it seems we're running into the issue again. Once more, there are no apparent status messages indicating any correlation to anything conflicting. It just seems to stay stuck on refresh, and then eventually no longer loads until the daemon is restarted. Let me know what I can search for to provide any more details.
cipres [OP] · Apr 30 at 08:21:
@flipperzero Ok. I gotta say, i'm not sure what's causing this. I don't think that the python 3.14 migration is causing this (going back to 3.9 isn't really an option anyway). Anything useful in the logs?
@flipperzero Just tried to access the hashnix server right now and it doesn't seem down (connection isn't refused) but it doesn't load any page, just stuck. Yeah that is really strange ...
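The difference between "refused" and "accepted but stuck" can be checked with a small probe that sends one Gemini request and waits for the response header with a timeout. This is a sketch, not part of misfin: `probe_gemini` is a hypothetical helper, it skips certificate verification on the assumption that the server uses a self-signed certificate, and the `use_tls=False` switch exists only for local experiments against a plain TCP socket.

```python
import socket
import ssl


def probe_gemini(host, port=1965, timeout=10.0, use_tls=True):
    """Send one Gemini request and classify what comes back.

    Returns "refused", "hung", "closed", "ok: <response header>",
    or "unreachable: ..." / "error: ..." for anything else.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        return f"unreachable: {exc}"
    try:
        if use_tls:
            # Assumption: self-signed certificate, so verification is skipped.
            ctx = ssl.create_default_context()
            ctx.check_hostname = False
            ctx.verify_mode = ssl.CERT_NONE
            conn = ctx.wrap_socket(conn, server_hostname=host)
        # A Gemini request is the absolute URL followed by CRLF.
        conn.sendall(f"gemini://{host}/\r\n".encode("utf-8"))
        data = conn.recv(1029)
    except socket.timeout:
        # Connection was accepted but the handshake or response never arrived.
        return "hung"
    except OSError as exc:
        return f"error: {exc}"
    finally:
        conn.close()
    if not data:
        return "closed"
    header = data.split(b"\r\n", 1)[0].decode("utf-8", errors="replace")
    return f"ok: {header}"
```

A result of "hung" corresponds to the behaviour described here: the port accepts the connection, but no page ever comes back.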
flipperzero · Apr 30 at 11:14:
@cipres nothing conclusive in the logs except for the following -
that's it :l
cipres [OP] · Apr 30 at 22:01:
@flipperzero Can you please edit the misfind.toml and add:
so that we get more information from the logs ? Can you share the content of your current misfind.toml ?
I feel like this is a resource exhaustion issue: the server using too many threads, sockets, or whatever, to the point where it stops functioning. Another possibility is that the rate limiting system starts to malfunction and just blocks every request. That coincides with what you're seeing on your server. Is the CPU load high when the server freezes?
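One way to test the resource exhaustion theory is to record the daemon's thread count, open file descriptors, and memory while it is healthy and again once it freezes. The sketch below assumes a Linux host with `/proc` mounted; `resource_snapshot` is a hypothetical helper, and reading another user's process may require running it as that user or as root.

```python
import os


def resource_snapshot(pid):
    """Return thread count, open file descriptors and resident memory (kB)
    for a running process, read from /proc (Linux only)."""
    snapshot = {"threads": None, "open_fds": None, "rss_kb": None}
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            key, _, value = line.partition(":")
            if key == "Threads":
                snapshot["threads"] = int(value)
            elif key == "VmRSS":
                # Value looks like "  12345 kB"
                snapshot["rss_kb"] = int(value.split()[0])
    # Each entry in /proc/<pid>/fd is one open descriptor (files, sockets, pipes).
    snapshot["open_fds"] = len(os.listdir(f"/proc/{pid}/fd"))
    return snapshot
```

Passing the daemon's PID and comparing snapshots over time would show whether threads or sockets keep growing until the freeze.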
flipperzero · Apr 30 at 23:50:
@cipres do I add this under the [service.main] section, or under the others entitled [service.main.ratelimit.anonymous] or [service.main.ratelimit.verified]? In fact, until you had made mention, I hadn't ever noticed this file or that there were these other fields. Would any info within those fields help you better? This is what they print.
Let me know at your soonest convenience, thanks much and always.
cipres [OP] · May 01 at 10:14:
@flipperzero Add them at the top of the config file (outside of the service section). Your rate limit config has the default values, that's fine then. Then you can restart the daemon and we should get more logs.
flipperzero · May 02 at 01:15:
@cipres I sent you an email just now, at your proton address; hopefully it shows as sent from my hashnix email host. Let me know if this helps, and if the message gets to you.
cipres [OP] · May 02 at 15:57:
@flipperzero I can't access the log file's URL you sent me (not found error).
flipperzero · May 02 at 16:14:
apologies, there was a typo, try it now
cipres [OP] · May 02 at 16:51:
@flipperzero Got the log, very helpful. I'll fix what's generating these traceback errors but that's not what would cause a crash. Did a crash occur in the time window covered by this log (for any of those 5 srv restarts) ?
flipperzero · May 02 at 16:56:
@cipres those restarts were done when crashes occurred, or at least shortly beforehand. From that point on, from the end of April to May 1st, there have been no restarts, and the misfin page still stays stuck loading while these messages continue to occur.
cipres [OP] · May 02 at 17:37:
@flipperzero Enough memory and disk space available on your server ? This is really puzzling.
cipres [OP] · May 02 at 17:49:
@flipperzero I was able to reproduce the issue, will push a fix hopefully today, otherwise tomorrow. Thanks again !
flipperzero · May 02 at 18:43:
@cipres 8 GB RAM, 8 cores that max out at 2.8 to 3 GHz
** edit - apologies, just saw that you found the fix. Can't wait for the update!
cipres [OP] · May 02 at 19:32:
@flipperzero New release is out. It changes the default worker threads configuration. I'm not 100% sure this will fix it. Do an update and restart the service, and then i'll do a stress test on the server.
flipperzero · May 02 at 19:44:
@cipres updated and running now
cipres [OP] · May 02 at 19:51:
@flipperzero Ok. The frontend seems to respond faster than before. Let's wait and see, there's a small chance that it's fixed now but i wouldn't bet too much on it.
Original Post
@flipperzero The hashnix.club misfin server is down, looks like it has been down for a couple of days.