Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing of on_error exception hanging #94

Merged
merged 5 commits into from
Mar 19, 2019

Conversation

tofu-rocketry
Copy link
Member

@tofu-rocketry tofu-rocketry commented Mar 6, 2019

Resolves #92.

The key parts of this change are the swap from self._conn.connect(wait=True) to =False and the removal of the raising of an exception in on_error - these changes allow SSM to retry the connection after a timeout (rather than blocking) and to then exit gracefully once all brokers have been tried.

start_connection has been rearranged to check if connected just after connecting, rather than after creating a subscription, so that a subscription is not attempted without a connection. This is to mitigate the removal of the connection wait.

Additionally, the error message you get if you try to connect without being authorised has been improved (no more Java error dump!) and the version of stomp.py in use is logged.

Example output:

2019-03-06 14:03:37,150 - ssmsend - INFO - ========================================
2019-03-06 14:03:37,150 - ssmsend - INFO - Starting sending SSM version 2.3.0.
2019-03-06 14:03:37,150 - ssmsend - INFO - Retrieving broker details from ldap://lcg-bdii.cern.ch:2170 ...
2019-03-06 14:03:37,240 - ssmsend - INFO - Found 2 brokers.
2019-03-06 14:03:37,250 - ssmsend - INFO - No server certificate supplied.  Will not encrypt messages.
2019-03-06 14:03:37,319 - ssm.ssm2 - INFO - Using stomp.py version 3.1.6.
2019-03-06 14:03:37,319 - ssm.ssm2 - INFO - Established connection to mq.cro-ngi.hr, port 6162
2019-03-06 14:03:37,319 - ssm.ssm2 - INFO - Connecting using SSL...
2019-03-06 14:03:38,161 - ssm.ssm2 - ERROR - The following certificate is not authorised:  CN=bob, L=SWI, OU=PRA, O=eScience, C=UK
2019-03-06 14:03:40,160 - stomp.py - ERROR - Lost connection
2019-03-06 14:03:40,160 - ssm.ssm2 - INFO - Disconnected from broker.
2019-03-06 14:03:48,371 - ssm.ssm2 - WARNING - Failed to connect to mq.cro-ngi.hr:6162: Timed out while waiting for connection. Check the connection details.
2019-03-06 14:03:48,371 - ssm.ssm2 - INFO - Established connection to broker-prod1.argo.grnet.gr, port 6162
2019-03-06 14:03:48,371 - ssm.ssm2 - INFO - Connecting using SSL...
2019-03-06 14:03:48,871 - ssm.ssm2 - ERROR - The following certificate is not authorised:  CN=bob, L=SWI, OU=PRA, O=eScience, C=UK
2019-03-06 14:03:50,875 - stomp.py - ERROR - Lost connection
2019-03-06 14:03:50,875 - ssm.ssm2 - INFO - Disconnected from broker.
2019-03-06 14:03:59,045 - ssm.ssm2 - WARNING - Failed to connect to broker-prod1.argo.grnet.gr:6162: Timed out while waiting for connection. Check the connection details.
SSM failed to complete successfully.  See log file for details.
2019-03-06 14:03:59,045 - ssmsend - ERROR - SSM failed to complete successfully: Attempts to start the SSM failed.  The system will exit.
2019-03-06 14:03:59,045 - ssm.ssm2 - INFO - SSM connection ended.
2019-03-06 14:03:59,045 - ssmsend - INFO - SSM has shut down.
2019-03-06 14:03:59,045 - ssmsend - INFO - ========================================


@tofu-rocketry tofu-rocketry added this to the 2.4.0 milestone Mar 6, 2019
@tofu-rocketry tofu-rocketry self-assigned this Mar 6, 2019
@tofu-rocketry tofu-rocketry requested review from jrha, gregcorbett and a user March 6, 2019 13:41
@tofu-rocketry tofu-rocketry marked this pull request as ready for review March 6, 2019 14:12
Copy link
Member

@gregcorbett gregcorbett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

- Remove raising of exception in on_error as it's not the right place to
do that and just leaves a mess in the log.
- Make the error message clearer if the certificate is not authorised.
- Increase log level for these messages to ERROR to match severity.
Disable wait on STOMP connection so that errors during connection do
not freeze SSM.
Rearrange start_connection to check if connected just after connecting
so that subscription is not attempted without a connection.
Tweak stomp version message in log to format it nicely, and move to
handle_connect so that it's only printed for each run of SSM rather than
once for each broker that's attempted to connect to.
@tofu-rocketry tofu-rocketry merged commit 055aeb2 into apel:dev Mar 19, 2019
@tofu-rocketry tofu-rocketry deleted the 92-exception-hang branch March 19, 2019 14:59
Copy link
Contributor

@jrha jrha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants