Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v2 broke us.states.lookup('District of Columbia') #50

Closed
wwoods opened this issue Apr 21, 2020 · 13 comments
Closed

v2 broke us.states.lookup('District of Columbia') #50

wwoods opened this issue Apr 21, 2020 · 13 comments

Comments

@wwoods
Copy link

wwoods commented Apr 21, 2020

Noting this as an issue because us.states.DC still exists, so there's no reason us.states.lookup should have broken.

@wwoods
Copy link
Author

wwoods commented Apr 21, 2020

Looks like 6d21643 indicates that this change was intentional... Perhaps us.states.DC should not exist. FWIW, I found it handy for dealing with e.g. NYTimes datasets.

@jess-peck-zocdoc
Copy link

+1 to @wwoods. Not having DC anywhere in the mix in v2 (either in state.TERRITORIES or state.STATE or state.OBSOLETE) is a breaking change for me. In social scientist and data scientist applications, state-level data has DC listed in the state column of public datasets.

@allanjamesvestal
Copy link

allanjamesvestal commented Apr 29, 2020

I'll add another +1 — I just can't fathom what kind of intentional design change would remove support for us.states.lookup('DC') while leaving us.states.lookup('MP') intact. And yet, that's where we stand right now.

I guess I'll submit a question to @jcarbaugh, @mileswwatkins and the team: if D.C. isn't a "territory" in the nomenclature of this project, what is it?

As is, I'd imagine this represents a breaking change for a lot of people's use cases on this otherwise fantastic library. I'd hope its authors can fix this quickly within the 2.0.x series.

Edit: I do want to clarify that my complaint is separate from whether to include D.C. in the STATES list — I definitely understand the logic of making that change (and doing so in a v1 -> v2 upgrade path, as you discussed in #32).

My quibble is with the fact that D.C. now appears to be the one area not searched by us.states.lookup() — this function appears to combine the areas listed in STATES_AND_TERRITORIES and OBSOLETE, and based on my quick read-through, the one area left out of consideration is now D.C.

@jcarbaugh
Copy link
Member

Removing DC from lookup() was not intended, so I’ll get a bug fix up and out as soon as I can.

I was very hesitant to remove DC from STATES, but got multiple requests and I think even a PR to remove it, so I went ahead and did it. I’m already working on a branch for 3.0 that was going to restore some support for DC as a STATE, but am not inclined to include it back in 2.x since its presence/removal is such a breaking change.

Questions for those of you that have commented here:

  • Are you running Python 3 and would be able to upgrade to an upcoming us 3.0 or are you still on Python 2.7?
  • If there was some configuration like an environment variable that could be set to restore DC to STATES in the current 2.x version, would that meet your needs?

@allanjamesvestal
Copy link

Thank you for the quick response on that, @jcarbaugh. Again, I definitely understand the decision to remove D.C. from STATES going forward for most users — I've definitely written logic to filter out only states with statehood_year attributes before when I needed to screen out D.C.

For us, we're only running Python 3, so we could upgrade to a 3.0 release on just about all our stuff without much headache. I'm intrigued by the idea of an environment variable for D.C. statehood, but my suspicion is that if lookup() works as intended with your forthcoming bugfix, our use case won't require us to change the makeup of the STATES list at all. So I'll defer to others on that second question.

(I will say, though, I saw your response to Ben Welsh's related tweet, and as a former Kentuckian/current Virginian I'd love to see a COMMONWEALTHS key added at some point, if only for my own eccentricity.)

Thanks again for replying so quickly. We're especially interested in this since we have a similar library for election-specific metadata (us-elections) that relies on your project for much of its basic mappings.

@jcarbaugh
Copy link
Member

I just pushed v2.0.2 to PyPI. It adds DC back to lookup() and mapping(). Apologies for the oversight!

I'm still thinking through the right approach to bringing DC back to STATES in some way in v3.0.

@allanjamesvestal I'm happy to report that v3 will have a COMMONWEALTHS list haha https://github.com/unitedstates/python-us/blob/master/us/states.py#L1424

@jess-peck-zocdoc
Copy link

@jcarbaugh The most unique value contribution I found in this package was the ability to roll a custom dict so I can set keys in either direction for abbreviations (e.g.: us.states.mapping('name', 'abbr')). That functionality appears to be restored so I'm a happy camper. Many thanks for the rapid fix.

Yes this is python 3.0 and yes some sort of "include dc" variable that I can set at the top of a notebook seems fine for my applications.

@jcarbaugh
Copy link
Member

@jess-peck-zocdoc those mappings were the original reason I created this package!

I'm really not familiar with best practices when working with notebooks. Is it possible and viable to require that something like %env DC_STATEHOOD=1 is set before importing us if you want DC to appear in the STATES list? Or are packages actually imported before it would hit that environment variable declaration?

@jess-peck-zocdoc
Copy link

jess-peck-zocdoc commented Apr 29, 2020

I think I don't really want to set environment variables in a notebook if I can avoid it. I'm happy to set a python global though. I have plenty of packages that require an import followed by a declaration of some option or preference in a subsequent call to behave the way I want. In jupyter notebooks locally this might be fine, but running in arbitrary containers or clusters I might not get to change everything about the python env that I might wish (see: databricks).

@wwoods
Copy link
Author

wwoods commented Apr 29, 2020

Re: @jcarbaugh and @jess-peck-zocdoc -- It's probably better to have something like import us; us.configure(dc_statehood=True).

Python 3 on my end.

Thanks for the quick fix push, @jcarbaugh !

@allanjamesvestal
Copy link

allanjamesvestal commented Apr 29, 2020

Thanks for the fix, @jcarbaugh! I think that'll be a real boon to a lot of us.

I have some (probably hastily formed) thoughts about the question of what all is included in STATES from v3 on. Namely, I wonder if D.C. is the only special case worth including — if a number of people are using this library for address drop-downs and the like, does a big enough contingent exist that you'd also want to make Puerto Rico similarly configurable? (If you imagine X percent of users will miss D.C. in STATES, what derivative of X needs to miss Puerto Rico in order for it to be added too?)

At that point, I think there are two reasonable options. Either you have a way to specify zero, one or multiple non-states as being part of STATES, or you leave the task of reconciling individual developers' needs to those devs, since it's inherently defensible for STATES to mean actual stars on the flag, essentially.

If you did want to have a configuration option for D.C. (and others), I'd definitely plus-one the suggestion to make it a method rather than an environment variable, the way @wwoods suggested.

Any rate, I greatly do appreciate your quick response in shipping 2.0.2. I suspect the fix you just coded could close out this particular ticket, so please let me know if you'd like for this more liberal-arts discussion to happen in a different issue.

@jcarbaugh
Copy link
Member

Glad to hear that this fix meets your immediate needs! Appreciate your help here.

@allanjamesvestal @wwoods @jess-peck-zocdoc would you mind if I reached out to you all directly outside of this thread to discuss DC + other behavior for v3? I'd like to better understand your use cases and how the package is being used (e.g. Jupyter notebooks). I think I need to go either in an "assemble it yourself" way or via configuration, which I think would necessitate a significant API change.

@allanjamesvestal
Copy link

Happy to talk about this off-thread if I can be useful!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants