Non-stop segfaults when we switch to a non-local #124
Comments
This looks like it could be a race condition where an incoming REGISTER arrives before everything is initialized. Are you able to recreate the crash outside of that condition?
Also, can you provide a log at debug level for both the drachtio server and sofia?
Traffic comes thick and fast: there are quite a few thousand clients. So if drachtio crashes at a busy time, you can be sure that it will get a packet as soon as it starts reading incoming traffic. It is curious that this doesn't happen with the local request handler. Every time it gets into this cycle of crashes we can change the request handler back to the local one and it comes up with no problem. I'll provide the extra logging etc., but bear with me: we need to do some roll-back first, since we can't complete the move to the new k8s cluster. Thanks,
For completeness, in another example it didn't crash immediately. It responded to 55 REGISTERs and then crashed on the next one.
After a restart, 54 more REGISTERs arrive and are responded to, then:
My colleague reports:
So from that report it can't be a race during initialisation? drachtio had been running a long time and the client's inbound connection was all set up. It still crashes.
Agreed, I'll wait for the logs ...
Any luck in gathering logs?
Hi Dave, because it is disruptive to the service we have to wait until our late evening to do the testing again. That is in, say, 6 hours' time.
@davehorton You won't be happy with this update. However: if we run drachtio-server against our remote request-handler with normal logging, we get that segfault very often:
If we push the sofia logging up to 9 and switch drachtio logging to debug:
Then we don't get any crashes. All I can guess is that the time taken to do the logging is enough to slow something down so that the race is avoided. So we can't give you debug-level logs of the crash, since there is no crash when logging is set to debug level. We restarted drachtio a bunch of times in case it was intermittent and couldn't get any crash to happen. How do you want us to proceed? Thanks,
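One way to probe the timing theory away from production is a stand-in request handler that replies after a configurable delay, so the same drachtio build can be pointed at a 0 ms and a 25 ms handler in turn. The following is only a sketch under stated assumptions: it assumes the request handler is queried with HTTP GET, the names `start_delayed_server` and `timed_get` are made up for illustration, and the reply body is a placeholder to be replaced with whatever JSON the real handler returns to curl.

```python
import json
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_handler(delay_s, body):
    """Build a handler class that sleeps delay_s seconds, then replies with body as JSON."""
    payload = json.dumps(body).encode("utf-8")

    class DelayedHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Artificial latency standing in for the network distance to the remote handler.
            time.sleep(delay_s)
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, fmt, *args):
            # Keep the console quiet so handler-side logging does not add its own delay.
            pass

    return DelayedHandler


def start_delayed_server(delay_s, body, host="127.0.0.1", port=0):
    """Start the stand-in handler on a background thread; port=0 picks a free port."""
    server = ThreadingHTTPServer((host, port), make_handler(delay_s, body))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def timed_get(url):
    """Fetch url without any system proxy; return (decoded JSON, elapsed seconds)."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    start = time.monotonic()
    with opener.open(url, timeout=5) as resp:
        data = json.loads(resp.read().decode("utf-8"))
    return data, time.monotonic() - start


if __name__ == "__main__":
    # Placeholder body (an assumption): paste the real handler's reply here.
    server = start_delayed_server(0.025, {"placeholder": "real handler reply goes here"})
    url = "http://127.0.0.1:%d/?method=REGISTER" % server.server_address[1]
    print(timed_get(url))
    server.shutdown()
```

If the crash follows the delay setting rather than the k8s instance, that would point at timing rather than at the content of the response.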
Could you try with drachtio at debug and sofia at log level 3?
Also, your logs are showing an error right after startup, before any calls:
I'd like to get to the bottom of this as well. Can you just show me a log at debug level (drachtio and sofia) after startup? For that log it should not be necessary to receive any calls.
This is from startup, for the first few seconds. As it starts, a bunch of packets arrive, and I cut the log after the first few replies go back. My drachtio config binds to "*" like so, which I guess is where the loopback address comes from:
Steve
We are using a request-handler with our drachtio.
We are trying out k8s and switched the request-handler to another instance that is about 25 milliseconds away.
In every other respect the service returns the same response. We tested this by constructing requests with curl, e.g.:
When we switch over, drachtio segfaults on the first packet it tries to process (or maybe the first REGISTER?)
and so on and so forth.
The core dump shows the crash like so:
Digging into the stack frames: