-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for pyfixest
#105
base: master
Are you sure you want to change the base?
Support for pyfixest
#105
Conversation
Just started to play around more with |
This would be a really useful feature. |
Yes, definitely this is a very welcome contribution. And the PR code looks good to me so far! |
Awesome, then I will take a second look at the PR tomorrow evening to prepare it for a proper review =) @toobaz, so far, pyfixest does not support a range of default values (e.g. F-statistics). Would it be ok to merge the PR as is and update the pyfixest classes in a second, later PR? Else I would prioritize to add these statistics in the next week. Best, Alex |
You mean this PR? Or the pyfixest project itself? In the first case: I would prefer a PR that produces, at least in the most basic/standard case, a complete output just as a pyfixest user would likely expect it. In the second case: just get as close as possible to the above given the current pyfixest upstream code base, and it will be fine. |
I was referring to pyfixest itself - so far, we haven't implemented a range of statistics, e.g. F-tests and adjusted R2. So I indeed think that the current PR produces all of the supported and expected output of pyfixest =) @aeturrell , as a pyfixest user, do you agree? =) I'd love to add some other information, e.g. columns for the employed fixed effects, but I think this might be better suited as a utils module within the pyfixest code base? I.e. smth along these lines: from stargazer.stargazer import Stargazer
import pyfixest as pf
fit = pf.feols("Y ~ csw(X1, X2) | csw0(f1, f2)", data=pf.get_data())
def deparse_fixef_for_stargazer(fixef_list: list):
"""
Deparse Feols._fixef to a dict of lists for easy use with Stargazer
to add fixed effects to the regression table.
Parameters
----------
fixef_list : list
List of fixed effects from Feols._fixef.
Returns
-------
dict
Dictionary of lists, where each list contains the fixed effects for a given variable.
Example
-------
# basic example
fixef_list = ['f1', 'f2', 'f1+f2', 'f1', 'f2', 'f1+f2']
deparse_fixef_for_stargazer(fixef_list)
# Output
{'f1': ['f1', '-', 'f1', 'f1', '-', 'f1'],
'f2': ['-', 'f2', 'f2', '-', 'f2', 'f2']
}
# for usage with Stargazer
from stargazer.stargazer import Stargazer
from stargazer.line_location import LineLocation
import pyfixest as pf
data = pf.get_data()
fit = pf.feols("Y ~ X1 + X2 | f1 + f2", data=data)
result_table = Stargazer(fit)
deparsed_fixef_lists = deparse_fixef_for_stargazer([x._fixef for x in fit.to_list()])
for key in deparsed_fixef_lists:
result_table.add_line(key, deparsed_fixef_lists[key])
"""
def identify_variables(lst):
variables = set()
for item in lst:
if item:
parts = item.split('+')
for part in parts:
variables.add(part)
return list(variables)
# Identify unique variables
unique_variables = identify_variables(fixef_list)
# Initialize a dictionary to hold the lists for each variable
variable_lists = {var: [] for var in unique_variables}
# Populate the lists based on the presence of each variable
for item in fixef_list:
for var in unique_variables:
if item and var in item:
variable_lists[var].append('x')
else:
variable_lists[var].append('-')
return variable_lists which could be used as deparsed_fixef_lists = deparse_fixef_for_stargazer([x._fixef for x in fit.to_list()])
result_table = Stargazer(fit.to_list())
for key in deparsed_fixef_lists:
#print(key, deparse_fixef_lists[key])
result_table.add_line(key, deparsed_fixef_lists[key]) and produce So from my side, I think this PR is ready =) Should I do anything else, e.g. implement some tests @toobaz? |
Yes and no: I am already planning, and testing in my own research code, a way to add "smart lines" that associate to each model a with a key and will use that key to retrieve from each included model the content of the related cell (this could be the presence/absence of fixed effects, as in your case, but also more general information, e.g. indication of sample used, variable definition...). This is my current code: def add_custom_lines(table, models, exclude=[]):
for k in models:
if not hasattr(models[k], 'flags'):
models[k].flags = {}
# We want to process each line only once, but in the order
# in which they appear (while making a set of the list would change the order):
seen = set()
all_lines = [f for k in models for f in models[k].flags]
for f in all_lines:
if f in seen or f in exclude:
continue
table.add_line(f, [models[k].flags.get(f, '') for k in models.keys()])
seen.add(f) Hence, Translators could even indicate which of these lines should be displayed by default. "When is this gonna happen?", I hear you ask. Well, I guess in the summer I should be able to get this merged. But should you prefer to meanwhile implement something temporary on the pyfixest side, I wouldn't blame you. My main doubt at the moment is whether it is acceptable in general to ask users to add and populate a |
Hey both, any update on this? Very keen to be able to point people to this magical integration in my teaching, and update Coding for Economists too. Love the idea of smart lines but even the basic functionality that's already in the PR would be a game-changer! |
None on my side... I'm still very interested in evaluating the PR, when it will be ready! |
Hi @toobaz, sorry for never really getting back to you about this! I had a thousand things going on plus was on vacation... Sorry! From my perspective, this PR is ready for a review from your side. I have decided to include a customized |
Hi @s3alfisc , thanks for the update! I wasn't reviewing the PR because it is a "draft", I now gave it more attention. The result looks great, but to be honest I am not enthusiastic about Stargazer having a different interface with I understand the proposal I wrote above is way too uncertain, in particular for what concerns the timing, for you to base your solution on it... but would an (at least temporary) option be that I add some hooks to execute the code you need, defining appropriate functions inside the translator module ( In particular, if I understand correctly, the problem you have with the current state of the Would it be OK if I allowed the translator to define a function called at the end of By the way: I see
... did you happen to test mixing |
Ouhh, yes, just noticed this now, sorry about that!
That's the main reason I added the Stargazer class to pyfixest =) people have been asking about the feature for a while (and I'd also love to use Stargazer with pyfixest myself), so I thought that it would be most convenient for me to simply define a custom class within pyfixest. The main advantage I see for this approach is that it allows me to built a highly opinionated regression table for pyfixest objects, while users interested in a more customized solution could build around the Stargazer class within But I am also happy to build on solution that you've proposed above, it's just a little bit more complicated for me 😄 Out of curiosity, what is the disadvantage that you see with a custom Stargazer class for pyfixest? Is it potentially not having visibility on potentially breaking downstream dependencies with changes to |
This of course fails... for statsmodels, I get an
error. Thanks for the pointer, will fix this asap =) |
It depends.
Whatever the case, I would have to test (well, when Not to mention that if a project comes out with similar necessities to So this said: I would love to support Vice-versa, if timing is the problem, it is a problem solved: I can implement the hooks immediately and do a release, then we can polish the interaction between the two code bases when Note that I have nothing against:
|
Ok, sold! So the next steps would be that
|
Thanks!
Sure. If you agree that having a hook called at the end of For instance, we could have that whenever:
In your case, I guess
Sure. But just to clarify: I'm not pretending I have the right to dictate how other projects should, or shouldn't provide shortcuts for |
Hi @toobaz - this is a draft PR to add basic support for the pyfixest project.
It's still a WIP - I would like to add information on the fixed effects employed in a given model & add support for a range of additional model statistics that pyfixest is not yet computing.
This is how it currently looks:
Would you be happy to accept such a PR? Is there anything else I should take a look at?
Pinging @Asquidy as he'd expressed interest in this feature =)
Merging this PR would close py-econometrics/pyfixest#283.
All the best, Alex