Specifying intercept models is not documented & a PITA #13

Closed
fabian-s opened this issue Apr 8, 2016 · 6 comments

Comments

@fabian-s
Contributor

fabian-s commented Apr 8, 2016

I think having to do

data$ones <- rep(1, nrow(data))
gamboostLSS(list( .... ~ ..., bla = response ~ bols(ones, intercept = FALSE)),
  data = data, families = SomethingCraycray())

instead of

 bla = response ~ 1

sucks major d*** in terms of usability. At least it should be documented somewhere that this is the way we want users to specify an intercept model / formula.....
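For reference, a minimal sketch of what the workaround looks like as a complete call. It assumes the built-in `cars` data, `GaussianLSS()` as the family, and a linear base learner for `mu`; none of these choices come from the snippet above, which leaves the family and the first formula unspecified.

```r
library(gamboostLSS)

data(cars)
# Constant column that stands in for the intercept (the name `ones` is arbitrary).
cars$ones <- rep(1, nrow(cars))

# mu depends on speed; sigma is restricted to an intercept-only predictor.
mod <- gamboostLSS(
  list(mu    = dist ~ bols(speed),
       sigma = dist ~ bols(ones, intercept = FALSE)),
  data = cars,
  families = GaussianLSS()
)

coef(mod)  # coefficients per distribution parameter
```

Without the named `sigma` entry, the `mu` formula would be recycled for every distribution parameter.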

@ja-thomas
Member

This is already a problem in mboost, not just in gamboostLSS:

library(gamboostLSS)


data(cars)
gamboost(dist ~ 1, data = cars, dfbase = 4)

gamboostLSS(dist ~ 1, data = cars)

We should document this (ideally in the help pages of both gamboost and gamboostLSS) and check whether there is an easy fix in mboost. Either way, the problem has to be fixed there.

@fabian-s
Contributor Author

not sure I agree.

that issue never comes up for (non-pathological specifications of) mboost models, because why would you ever want to boost an intercept model? However, intercept models do make sense for GAMLSS-type models because you may want to restrict the flexibility of additive predictors for higher order moments / nuisance parameters (to cut computation times, remain interpretable, etc).

@hofnerb
Member

hofnerb commented Apr 11, 2016

Well, the point probably is that it usually doesn't make sense in mboost, but it should be implemented there anyway, as all the interfaces for model fitting are provided by mboost. Perhaps one should try to interpret 1 generally as an intercept. Thus instead of

cars$int <- 1
gamboost(dist ~ bols(int, intercept = FALSE) + bols(..., intercept = FALSE), data = cars)

one could then write

gamboost(dist ~ bols(1, intercept = FALSE) + bols(..., intercept = FALSE), data = cars)
## or even better
gamboost(dist ~ 1 + bols(..., intercept = FALSE), data = cars)

1 should then always be defined as bols(rep(1, nrow(data)), intercept = FALSE).
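A runnable version of the explicit variant, as a sketch: the elided `...` is replaced here by a centered copy of `speed` (the name `speed_c` is made up for this example), on the assumption that covariates should be centered when a base learner is fit without its own intercept.

```r
library(mboost)

data(cars)
cars$int <- 1                                   # explicit intercept column
cars$speed_c <- cars$speed - mean(cars$speed)   # centered covariate (illustrative)

# Separate base learners for the intercept and the linear effect of speed.
mod <- gamboost(dist ~ bols(int, intercept = FALSE) +
                  bols(speed_c, intercept = FALSE),
                data = cars)

coef(mod)  # coefficients of the base learners selected so far
```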

@fabian-s
Contributor Author

Do we agree that there is no realistic use case for a pure intercept base learner in mboost, but that there is one for additive predictors in gamboostLSS?

If so, I think it's a user interface / formula parsing issue for gamboostLSS (i.e., a ~ 1 formula should just add the missing column ones to the data and treat ~ 1 as ~ bols(ones, intercept = FALSE)), not a missing feature in mboost.

If not, when/why would I ever want to specify a naked intercept in mboost and, if we make it easy to do so, how would we preempt user error & misunderstandings about the fact that every base learner updates its own intercept by default anyways?
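The formula rewrite proposed in this comment (treating `~ 1` as `~ bols(ones, intercept = FALSE)`) could look roughly like the following. This is only a sketch in base R: `expand_intercept` and its `ones_name` argument are hypothetical and not part of mboost or gamboostLSS.

```r
# Hypothetical helper; not part of mboost or gamboostLSS.
# Rewrites an intercept-only formula and adds the constant column it needs.
expand_intercept <- function(formula, data, ones_name = "ones") {
  rhs_pos <- length(formula)  # 3 for `y ~ 1`, 2 for a one-sided `~ 1`
  if (!identical(formula[[rhs_pos]], 1)) {
    return(list(formula = formula, data = data))  # nothing to rewrite
  }
  if (ones_name %in% names(data) && !all(data[[ones_name]] == 1)) {
    stop("column '", ones_name, "' already exists and is not constant 1")
  }
  data[[ones_name]] <- rep(1, nrow(data))
  # Build the call unevaluated, so bols() need not be loaded at this point.
  formula[[rhs_pos]] <- bquote(bols(.(as.name(ones_name)), intercept = FALSE))
  list(formula = formula, data = data)
}

res <- expand_intercept(dist ~ 1, cars)
res$formula
```

Formulas with anything other than a bare `1` on the right-hand side are returned unchanged, so the helper would only affect the intercept-only case.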

@fabian-s
Contributor Author

Just to be clear:

I don't think we need to / should enable formulas ~ 1 + bols(bla) + bbbs(blub).

I do think having a shorthand for "this parameter is not affected by any covariates" via nuisance_param = response ~ 1 would be useful, as the default of recycling the first formula for all parameters of the distribution means that models get insanely complicated very quickly and specifying simplifications is a huge pain ATM (and not documented anywhere!)

@fabian-s
Contributor Author

@hofnerb @ja-thomas just re-read your comments, you're right of course, I was being a Gscheithaferl 😏
Closing this and migrating it to mboost.
