Commit

Updated comments
bdilday committed Mar 24, 2015
1 parent 8c89eb3 commit cb2211f
Showing 1 changed file with 31 additions and 19 deletions.
50 changes: 31 additions & 19 deletions scripts/retrosheet_sql_tools.py
@@ -13,43 +13,55 @@
  events
  - playoff_flag
  - year_id
- - time_since_1900 : an integer giving the number of seconds since Jan 1, 1900, UTC
+ - time_since_1900 : an integer giving the number of seconds
+ since Jan 1, 1900, UTC
  - tto : times through the order
  - sun_alt, sun_az : altitude and azimuth of the sun
  - woba_pts : woba_pts for the event
- - woba_pts_expected : placeholder for woba_pts expected from the matchup of batter vs pitcher.
+ - woba_pts_expected : placeholder for woba_pts expected
+ from the matchup of batter vs pitcher.
  the location of the sun computations require PyEphem
  http://rhodesmill.org/pyephem/
- the time computations require timezone information, to translate everything to a common timezone (UTC)
+ the time computations require timezone information,
+ to translate everything to a common timezone (UTC)
  pytz: http://pytz.sourceforge.net
  tzwhere: https://github.com/pegler/pytzwhere/tree/master/tzwhere
- This file can be imported to get acces to the methods, or run via:
+ This file can be imported to get access to the methods, or run via:
  python retrosheet_sql_tools.py
  with optional arguments
  -minyr minyr
  -maxyr maxyr
  -vbose vbose
  -n2print n2print
- NOTE: This script will write out an sql file VARD_(TIMESTAMP).sql, which can then be sourced by your SQL implementation (VARD stands for Value Added Retrosheet Database; the Value Added terminology is a nod to my astronomy days, as in "Value Added Galaxy Catalog", http://sdss.physics.nyu.edu/vagc/). It DOES NOT store any variable to the database. It does update the schema, however, by adding columns for the computed variables.
- ii. provides a method to read sql data into a numpy array, with automatic determination of variable type. The relevant method is sqlQueryToArray(query_string), which returns a numpy array of the result of the query
- an example of use is
- import retrosheet_sql_tools
- configFileLocation = 'config.ini'
- minyr = 2004
- maxyr = 2004
- rs = retrosheet_sql_tools.retrosheet_sql(cfgFile=configFileLocation)
- rs.updateSchema()
- rs.computeValueAdded(minyr=minyr, maxyr=maxyr)
+ NOTE: This script will write out an sql file VARD_(TIMESTAMP).sql,
+ which can then be sourced by your SQL implementation (VARD stands for
+ Value Added Retrosheet Database; the Value Added terminology is a
+ nod to my astronomy days, as in "Value Added Galaxy Catalog",
+ http://sdss.physics.nyu.edu/vagc/). It DOES NOT store any variables
+ to the database. It does update the schema, however, by adding columns
+ for the computed variables.
+ ii. provides a method to read sql data into a numpy array, with automatic
+ determination of variable type. The relevant method is
+ sqlQueryToArray(query_string),
+ which returns a numpy array of the result of the query
+ an example of use is
+ import retrosheet_sql_tools
+ configFileLocation = 'config.ini'
+ minyr = 2004
+ maxyr = 2004
+ rs = retrosheet_sql_tools.retrosheet_sql(cfgFile=configFileLocation)
+ rs.updateSchema()
+ rs.computeValueAdded(minyr=minyr, maxyr=maxyr)
- q = 'select tto, avg(woba_pts) as wo, avg(woba_pts_expected) as wx from retrosheet.events where year_id=2004 and woba_pts>=0 and pit_start_fl=\'T\' group by tto'
- data = rs.sqlQueryToArray(q)
- print data
+ q = 'select tto, avg(woba_pts) as wo, avg(woba_pts_expected) as wx from retrosheet.events where year_id=2004 and woba_pts>=0 and pit_start_fl=\'T\' group by tto'
+ data = rs.sqlQueryToArray(q)
+ print data
  '''
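
The docstring describes time_since_1900 as an integer number of seconds since Jan 1, 1900, UTC, computed after translating local times to UTC. A minimal sketch of that arithmetic follows. It is not the script's own code: the function name seconds_since_1900 and the fixed utc_offset_hours parameter are hypothetical, and a fixed offset stands in for the pytz/tzwhere timezone lookup so that the example needs only the standard library.

```python
from datetime import datetime, timedelta, timezone

# Reference instant named in the docstring: Jan 1, 1900, 00:00 UTC.
EPOCH_1900 = datetime(1900, 1, 1, tzinfo=timezone.utc)


def seconds_since_1900(local_dt, utc_offset_hours):
    """Return whole seconds from 1900-01-01 00:00 UTC to a local wall-clock time.

    local_dt         -- naive datetime giving the local wall-clock time
    utc_offset_hours -- hypothetical stand-in for a real timezone lookup
                        (the script relies on pytz and tzwhere for that)
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    aware = local_dt.replace(tzinfo=tz)  # attach the offset, no clock shift
    return int((aware - EPOCH_1900).total_seconds())


# Example: a 20:05 local first pitch in a zone four hours behind UTC.
first_pitch = seconds_since_1900(datetime(2004, 4, 4, 20, 5), -4)
```

Subtracting two timezone-aware datetimes compares them in UTC, so the offset is applied once and events from parks in different zones end up on one common scale.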
