Revert "Fixes Bug 713973, Bug 1042420, Bug 1044183, Bug 913581, Bug 9…
Browse files Browse the repository at this point in the history
…99923, deprecate & remove all the things"
twobraids committed Oct 23, 2015
1 parent 6bc7f67 commit 7db2349
Showing 117 changed files with 29,180 additions and 19 deletions.
132 changes: 132 additions & 0 deletions docs/development/commonconfig.rst
@@ -0,0 +1,132 @@
.. index:: commonconfig

.. _commonconfig-chapter:


Common Config
=============

To avoid repetition between the configurations of a half dozen
independently running applications, common settings are consolidated
in a common configuration file:
``.../scripts/config/commonconfig.py.dist``.

All Socorro applications have these constants available to them. For
Socorro applications that are driven from the command line, each of
these default values can be overridden by a command line switch of the
same name.

To set up this configuration file, copy the example,
``.../scripts/config/commonconfig.py.dist``, to
``.../scripts/config/commonconfig.py``.

Edit the file for your local situation::

import socorro.lib.ConfigurationManager as cm
import datetime
import stat

#---------------------------------------------------------------------------
# Relational Database Section

databaseHost = cm.Option()
databaseHost.doc = 'the hostname of the database servers'
databaseHost.default = 'localhost'

databasePort = cm.Option()
databasePort.doc = 'the port of the database on the host'
databasePort.default = 5432

databaseName = cm.Option()
databaseName.doc = 'the name of the database within the server'
databaseName.default = ''

databaseUserName = cm.Option()
databaseUserName.doc = 'the user name for the database servers'
databaseUserName.default = ''

databasePassword = cm.Option()
databasePassword.doc = 'the password for the database user'
databasePassword.default = ''

#---------------------------------------------------------------------------
# Crash storage system

jsonFileSuffix = cm.Option()
jsonFileSuffix.doc = 'the suffix used to identify a json file'
jsonFileSuffix.default = '.json'

dumpFileSuffix = cm.Option()
dumpFileSuffix.doc = 'the suffix used to identify a dump file'
dumpFileSuffix.default = '.dump'

#---------------------------------------------------------------------------
# HBase storage system

hbaseHost = cm.Option()
hbaseHost.doc = 'Hostname for hbase hadoop cluster. May be a VIP or load balancer'
hbaseHost.default = 'localhost'

hbasePort = cm.Option()
hbasePort.doc = 'hbase port number'
hbasePort.default = 9090

hbaseTimeout = cm.Option()
hbaseTimeout.doc = 'timeout in milliseconds for an HBase connection'
hbaseTimeout.default = 5000

#---------------------------------------------------------------------------
# misc

processorCheckInTime = cm.Option()
processorCheckInTime.doc = 'the time after which a processor is considered dead (hh:mm:ss)'
processorCheckInTime.default = "00:05:00"
processorCheckInTime.fromStringConverter = lambda x: str(cm.timeDeltaConverter(x))
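
# The options below configure the time windows used by the aggregation
# cron apps: start/delta/end for a single aggregation pass and for the
# overall/outer window. Each cron task is expected to override the
# defaults it needs (see defaultDeltaWindow and initialDeltaDate below).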

startWindow = cm.Option()
startWindow.doc = 'The start of the single aggregation window (YYYY-MM-DD [hh:mm:ss])'
startWindow.fromStringConverter = cm.dateTimeConverter

deltaWindow = cm.Option()
deltaWindow.doc = 'The length of the single aggregation window ([dd:]hh:mm:ss)'
deltaWindow.fromStringConverter = cm.timeDeltaConverter

defaultDeltaWindow = cm.Option()
defaultDeltaWindow.doc = 'The length of the single aggregation window ([dd:]hh:mm:ss)'
defaultDeltaWindow.fromStringConverter = cm.timeDeltaConverter

# override this default for your particular cron task
defaultDeltaWindow.default = '00:12:00'

endWindow = cm.Option()
endWindow.doc = 'The end of the single aggregation window (YYYY-MM-DD [hh:mm:ss])'
endWindow.fromStringConverter = cm.dateTimeConverter

startDate = cm.Option()
startDate.doc = 'The start of the overall/outer aggregation window (YYYY-MM-DD [hh:mm])'
startDate.fromStringConverter = cm.dateTimeConverter

deltaDate = cm.Option()
deltaDate.doc = 'The length of the overall/outer aggregation window ([dd:]hh:mm:ss)'
deltaDate.fromStringConverter = cm.timeDeltaConverter

initialDeltaDate = cm.Option()
initialDeltaDate.doc = 'The length of the overall/outer aggregation window ([dd:]hh:mm:ss)'
initialDeltaDate.fromStringConverter = cm.timeDeltaConverter

# override this default for your particular cron task
initialDeltaDate.default = '4:00:00:00'

minutesPerSlot = cm.Option()
minutesPerSlot.doc = 'how many minutes per leaf directory in the date storage branch'
minutesPerSlot.default = 1

endDate = cm.Option()
endDate.doc = 'The end of the overall/outer aggregation window (YYYY-MM-DD [hh:mm:ss])'
endDate.fromStringConverter = cm.dateTimeConverter

debug = cm.Option()
debug.doc = 'do debug output and routines'
debug.default = False
debug.singleCharacter = 'D'
debug.fromStringConverter = cm.booleanConverter
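
Individual applications pick up these constants by importing them from
the copied module rather than redefining them; the crash mover
configuration later in this documentation does exactly that. A minimal
sketch, assuming the standard layout of a Socorro checkout::

from config.commonconfig import databaseHost
from config.commonconfig import databasePort
from config.commonconfig import jsonFileSuffix
from config.commonconfig import dumpFileSuffix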
1 change: 1 addition & 0 deletions docs/development/contributing.rst
@@ -19,5 +19,6 @@ Contributing
fs
database
package
commonconfig
python-dependencies
addaservice
2 changes: 2 additions & 0 deletions docs/development/generalarchitecture.rst
@@ -58,6 +58,8 @@ Here are descriptions of every submodule in there:
+-------------------+---------------------------------------------------------------+
| database | PostgreSQL related code. |
+-------------------+---------------------------------------------------------------+
| deferredcleanup | Obsolete. |
+-------------------+---------------------------------------------------------------+
| external | Here are APIs related to external resources like databases. |
+-------------------+---------------------------------------------------------------+
| integrationtest | Obsolete. |
30 changes: 30 additions & 0 deletions docs/development/glossary/collector.rst
@@ -23,3 +23,33 @@ system.

After a crash is saved, there is an app called :ref:`crashmover-chapter` that
will transfer the crashes to HBase.

Collector Python Configuration
------------------------------

Like all the Socorro applications, the configuration is actually
executable Python code. Two configuration files are relevant for
collector:

* Copy ``.../scripts/config/commonconfig.py.dist`` to
  ``.../config/commonconfig.py``. This configuration file contains
  constants used by many of the Socorro applications.
* Copy ``.../scripts/config/collectorconfig.py.dist`` to
  ``.../config/collectorconfig.py``.

Common Configuration
--------------------

There are two constants in ``.../scripts/config/commonconfig.py`` of
interest to collector: ``jsonFileSuffix`` and ``dumpFileSuffix``. Other
constants in this file are ignored.
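
For reference, the two options are defined in ``commonconfig.py.dist``
like this (the values shown are the shipped defaults)::

jsonFileSuffix = cm.Option()
jsonFileSuffix.doc = 'the suffix used to identify a json file'
jsonFileSuffix.default = '.json'

dumpFileSuffix = cm.Option()
dumpFileSuffix.doc = 'the suffix used to identify a dump file'
dumpFileSuffix.default = '.dump'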

To set up the common configuration, see :ref:`commonconfig-chapter`.

Collector Configuration
-----------------------

``collectorconfig.py`` has several options to adjust how files are stored:

`See sample config code on Github
<https://github.com/mozilla/socorro/blob/master/scripts/config/collectorconfig.py.dist>`_
79 changes: 78 additions & 1 deletion docs/development/glossary/crashmover.rst
@@ -7,4 +7,81 @@ Crash Mover

The :ref:`collector-chapter` dumps all the crashes that it receives into the
local file system. This application is responsible for transferring
those crashes into primary storage, Amazon S3.
those crashes into hbase.

**Configuration**::

import stat
import socorro.lib.ConfigurationManager as cm

#-------------------------------------------------------------------------------
# general

numberOfThreads = cm.Option()
numberOfThreads.doc = 'the number of threads to use'
numberOfThreads.default = 4

#-------------------------------------------------------------------------------
# source storage

sourceStorageClass = cm.Option()
sourceStorageClass.doc = 'the fully qualified name of the source storage class'
sourceStorageClass.default = 'socorro.storage.crashstorage.CrashStorageSystemForLocalFS'
sourceStorageClass.fromStringConverter = cm.classConverter
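
# the local file system source re-uses the collector's storage settings
# (where and how the collector wrote the crashes), so they are imported
# from the collector's config rather than repeated here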

from config.collectorconfig import localFS
from config.collectorconfig import localFSDumpDirCount
from config.collectorconfig import localFSDumpGID
from config.collectorconfig import localFSDumpPermissions
from config.collectorconfig import localFSDirPermissions
from config.collectorconfig import fallbackFS
from config.collectorconfig import fallbackDumpDirCount
from config.collectorconfig import fallbackDumpGID
from config.collectorconfig import fallbackDumpPermissions
from config.collectorconfig import fallbackDirPermissions

from config.commonconfig import jsonFileSuffix
from config.commonconfig import dumpFileSuffix

#-------------------------------------------------------------------------------
# destination storage

destinationStorageClass = cm.Option()
destinationStorageClass.doc = 'the fully qualified name of the destination storage class'
destinationStorageClass.default = 'socorro.storage.crashstorage.CrashStorageSystemForHBase'
destinationStorageClass.fromStringConverter = cm.classConverter
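
# the HBase destination re-uses the cluster connection settings already
# defined in the common configuration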

from config.commonconfig import hbaseHost
from config.commonconfig import hbasePort
from config.commonconfig import hbaseTimeout

#-------------------------------------------------------------------------------
# logging

syslogHost = cm.Option()
syslogHost.doc = 'syslog hostname'
syslogHost.default = 'localhost'

syslogPort = cm.Option()
syslogPort.doc = 'syslog port'
syslogPort.default = 514

syslogFacilityString = cm.Option()
syslogFacilityString.doc = 'syslog facility string ("user", "local0", etc)'
syslogFacilityString.default = 'user'

syslogLineFormatString = cm.Option()
syslogLineFormatString.doc = 'python logging system format for syslog entries'
syslogLineFormatString.default = 'Socorro Storage Mover (pid %(process)d): %(asctime)s %(levelname)s - %(threadName)s - %(message)s'

syslogErrorLoggingLevel = cm.Option()
syslogErrorLoggingLevel.doc = 'logging level for syslog (10 - DEBUG, 20 - INFO, 30 - WARNING, 40 - ERROR, 50 - CRITICAL)'
syslogErrorLoggingLevel.default = 10

stderrLineFormatString = cm.Option()
stderrLineFormatString.doc = 'python logging system format for logging to stderr'
stderrLineFormatString.default = '%(asctime)s %(levelname)s - %(threadName)s - %(message)s'

stderrErrorLoggingLevel = cm.Option()
stderrErrorLoggingLevel.doc = 'logging level for the logging to stderr (10 - DEBUG, 20 - INFO, 30 - WARNING, 40 - ERROR, 50 - CRITICAL)'
stderrErrorLoggingLevel.default = 10
116 changes: 116 additions & 0 deletions docs/development/glossary/deferredcleanup.rst
@@ -0,0 +1,116 @@
.. index:: deferredcleanup

.. _deferredcleanup-chapter:


Deferred Cleanup
================

When the :ref:`collector-chapter` throttles the flow of crash dumps, it saves
deferred crashes into :ref:`deferredjobstorage-chapter`. These JSON/dump pairs will
live in deferred storage for a configurable number of days. It is the
task of the deferred cleanup application to implement the policy to
delete old crash dumps.

The deferred cleanup application is a command line app meant to be run
as a cron job. It should be set to run once every twenty-four hours.

Configuration
-------------

deferredcleanup uses the common configuration to get the constant
``deferredStorageRoot``. To set up the common configuration, see
:ref:`commonconfig-chapter`.
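
Assuming ``deferredStorageRoot`` is defined in your copy of
``commonconfig.py``, the deferredcleanup configuration can pull it in
with the same import pattern the other Socorro applications use; a
minimal sketch::

from config.commonconfig import deferredStorageRoot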

deferredcleanup also has an executable configuration file of its own.
A sample file is found at
``.../scripts/config/deferredcleanupconfig.py.dist``. Copy this file to
``.../scripts/config/deferredcleanupconfig.py`` and edit it for
site-specific settings.

In each case where a site-specific value is desired, replace the value
of the ``.default`` member.

**maximumDeferredJobAge**

This constant specifies how many days deferred jobs are allowed to
stay in deferred storage. Job deletion is permanent::

maximumDeferredJobAge = cm.Option()
maximumDeferredJobAge.doc = 'the maximum number of days that deferred jobs stick around'
maximumDeferredJobAge.default = 2
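
For example, to keep deferred jobs around for a week instead of two
days, you would change the default in your copy of the config file (the
value 7 here is only an illustration)::

maximumDeferredJobAge.default = 7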

**dryRun**

Used during testing and development, this prevents deferredcleanup
from actually deleting things::

dryRun = cm.Option()
dryRun.doc = "don't really delete anything"
dryRun.default = False
dryRun.fromStringConverter = cm.booleanConverter

**logFilePathname**

Deferredcleanup can log its actions to a set of automatically rotating
log files. This is the name and location of the logs::

logFilePathname = cm.Option()
logFilePathname.doc = 'full pathname for the log file'
logFilePathname.default = './processor.log'

**logFileMaximumSize**

This is the maximum size in bytes allowed for a log file. Once this
size is reached, the logs rotate and a new log is started::

logFileMaximumSize = cm.Option()
logFileMaximumSize.doc = 'maximum size in bytes of the log file'
logFileMaximumSize.default = 1000000

**logFileMaximumBackupHistory**

The maximum number of log files to keep::

logFileMaximumBackupHistory = cm.Option()
logFileMaximumBackupHistory.doc = 'maximum number of log files to keep'
logFileMaximumBackupHistory.default = 50
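
Together with ``logFilePathname`` and ``logFileMaximumSize``, these
values map naturally onto Python's rotating file handler. The sketch
below is an assumption about how the rotation could be wired up, not
code taken from the Socorro source::

import logging.handlers

handler = logging.handlers.RotatingFileHandler(
    './processor.log',   # logFilePathname
    maxBytes=1000000,    # logFileMaximumSize
    backupCount=50,      # logFileMaximumBackupHistory
)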

**logFileLineFormatString**

A Python format string that controls the format of individual lines in
the logs::

logFileLineFormatString = cm.Option()
logFileLineFormatString.doc = 'python logging system format for log file entries'
logFileLineFormatString.default = '%(asctime)s %(levelname)s - %(message)s'
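
The format string is in the syntax of the standard Python ``logging``
module. A minimal standalone sketch that shows the format in action::

import logging

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s - %(message)s'))

logger = logging.getLogger('deferredcleanup-demo')
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info('this line shows the configured format')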

**logFileErrorLoggingLevel**

Logging is done in severity levels - the lower the number, the more
verbose the logs::

logFileErrorLoggingLevel = cm.Option()
logFileErrorLoggingLevel.doc = 'logging level for the log file (10 - DEBUG, 20 - INFO, 30 - WARNING, 40 - ERROR, 50 - CRITICAL)'
logFileErrorLoggingLevel.default = 20

**stderrLineFormatString**

In parallel with creating log files, deferredcleanup can log to stderr.
This is a Python format string that controls the format of individual
lines sent to stderr::

stderrLineFormatString = cm.Option()
stderrLineFormatString.doc = 'python logging system format for logging to stderr'
stderrLineFormatString.default = '%(asctime)s %(levelname)s - %(message)s'

**stderrErrorLoggingLevel**

Logging to stderr is done in severity levels independently from the
log file severity levels - the lower the number, the more verbose the
output to stderr::

stderrErrorLoggingLevel = cm.Option()
stderrErrorLoggingLevel.doc = 'logging level for the logging to stderr (10 - DEBUG, 20 - INFO, 30 - WARNING, 40 - ERROR, 50 - CRITICAL)'
stderrErrorLoggingLevel.default = 40