Regression for a rate variable in R

I was tasked with developing a regression model of student participation in various programs. This is a nice, clean dataset in which the student counts are consistent with a Poisson distribution. I fit models in R (using both a Poisson GLM and a zero-inflated Poisson model). The resulting residuals seemed reasonable.

However, I was then instructed to change the student count to a "rate", calculated as students / school_population (each school has its own population). This is no longer a count variable but a proportion between 0 and 1. It is considered the "share" participating in a program.

This "rate" (students / population) is no longer Poisson, but it is certainly not normal either. So I am a little lost as to the appropriate distribution and the model to use for it.

A log-normal distribution seems to fit this rate parameter well; however, I have a lot of 0 values, so it doesn't actually fit.

Any suggestions on the best distribution for this new parameter, and how to model it in R?

Thanks!

1 answer

As suggested in the comments, you can keep the Poisson model and use an offset:

glm(response~predictor1+predictor2+predictor3+ ... + offset(log(population)),
     family=poisson,data=...)
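For illustration, here is a minimal sketch of the offset approach on simulated data. The data frame `schools` and its columns (`population`, `participants`, `x`) are made up for this example and are not from the question; the true rate used in the simulation is an assumption chosen so the fit can be checked.

```r
# Simulated data; every name here is hypothetical.
set.seed(42)
n <- 200
schools <- data.frame(
  population = sample(200:2000, n, replace = TRUE),
  x = rnorm(n)
)
# Assumed true participation rate per student: exp(-3 + 0.5 * x)
schools$participants <- rpois(n, schools$population * exp(-3 + 0.5 * schools$x))

# The offset fixes the coefficient of log(population) at 1, so the
# remaining linear predictor models log(participants / population).
fit_pois <- glm(participants ~ x + offset(log(population)),
                family = poisson, data = schools)
coef(fit_pois)
```

The estimated intercept and slope should land near the simulated values of -3 and 0.5, which is a quick way to convince yourself that the coefficients are on the log-rate scale.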

Or you can use a binomial GLM, either

glm(cbind(response,pop_size-response) ~ predictor1 + ... , family=binomial,
        data=...)

or

glm(response/pop_size ~ predictor1 + ... , family=binomial,
        weights=pop_size,
        data=...)
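As a quick check that these two binomial specifications describe the same model, the sketch below fits both to simulated data and compares the coefficients. The data frame `d`, the predictor `x`, and the simulated logistic relationship are assumptions for the example only.

```r
# Simulated data; names and parameter values are hypothetical.
set.seed(1)
n <- 150
d <- data.frame(
  pop_size = sample(100:1000, n, replace = TRUE),
  x = rnorm(n)
)
d$response <- rbinom(n, size = d$pop_size, prob = plogis(-2 + 0.4 * d$x))

# Form 1: two-column response of (successes, failures)
fit_counts <- glm(cbind(response, pop_size - response) ~ x,
                  family = binomial, data = d)

# Form 2: proportion as response, with the number of trials as weights
fit_prop <- glm(response / pop_size ~ x,
                family = binomial, weights = pop_size, data = d)

# Compare the coefficient estimates from the two forms
all.equal(coef(fit_counts), coef(fit_prop))
```

Which form to use is mostly a matter of convenience: the proportion form keeps the rate itself as the response, as long as the weights carry the denominators.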

If you change the link from logit to log, i.e. family=binomial(link="log"), the binomial model becomes more similar to the Poisson model.
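To see how family=binomial(link="log") relates to the Poisson-with-offset fit, here is a rough sketch on simulated data with small rates. All names (`d2`, `pop_size`, `response`, `x`) and the simulated parameters are assumptions for the example. Note that a log-link binomial fit can fail to converge when fitted probabilities get close to 1, so this comparison is only meant for low rates.

```r
# Simulated data with low participation rates; names are hypothetical.
set.seed(7)
n <- 200
d2 <- data.frame(
  pop_size = sample(200:2000, n, replace = TRUE),
  x = runif(n, -1, 1)
)
# Assumed true rate exp(-3 + 0.5 * x), always well below 1
d2$response <- rbinom(n, size = d2$pop_size, prob = exp(-3 + 0.5 * d2$x))

# Binomial GLM with a log link: models log(probability) directly
fit_logbin <- glm(cbind(response, pop_size - response) ~ x,
                  family = binomial(link = "log"), data = d2)

# Poisson GLM with an offset: models log(count / pop_size)
fit_offset <- glm(response ~ x + offset(log(pop_size)),
                  family = poisson, data = d2)

# Side-by-side coefficients, both on the log-rate scale
cbind(log_binomial = coef(fit_logbin), poisson_offset = coef(fit_offset))
```

With rates this small the two sets of coefficients should be close, since both models put the linear predictor on the log of the rate.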

Combining zero inflation with an offset is harder (the pscl package is the usual route to ZIP models).

The glmmADMB package is another option to look into.
