r - dplyr summarize with a function of a dataframe


I'm having trouble carrying out a routine using the dplyr package. In short, I have a function that takes a dataframe as input and returns a single (numeric) value; I'd like to be able to apply this function to several subsets of a dataframe. It feels like I should be able to use group_by() to specify the subsets of the dataframe, then pipe along to the summarize() function, but I'm not sure how to pass the (subsetted) dataframe along to the function I'd like to apply.

As a simplified example, let's say I'm using the iris dataset, and I've got a simple function that I'd like to apply to several subsets of the data:

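    data(iris)

    lm.func = function(.data) {
      # Fit a simple linear regression on the supplied data frame
      lm.fit = lm(Petal.Width ~ Petal.Length, data = .data)
      # Slope estimate: row 2, column 1 of the coefficient table
      out = summary(lm.fit)$coefficients[2, 1]
      return(out)
    }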

Now, I'd like to be able to apply this function to subsets of iris based on some other variable, such as Species. I'm able to manually filter the data and then pipe it along to my function, for example:

    iris %>% filter(Species == "setosa") %>% lm.func(.)

But I'd like to be able to apply lm.func to each subset of the data, based on Species. My first thought was to try something like the following:

    iris %>% group_by(Species) %>% summarize(coef.val = lm.func(.))

Even though I know this doesn't work, the idea is to try to pass each subset of iris to the lm.func function.

To clarify, I'd like to end up with a dataframe with two columns: the first holding each level of the grouping variable, and the second holding the output of lm.func when the data is restricted to the subset specified by the grouping variable.

Is it possible to use summarize() in this way?

You can try do(), which passes each group's subset to the expression as `.`:

    iris %>%
      group_by(Species) %>%
      do(data.frame(coef.val = lm.func(.)))
    #      Species  coef.val
    # 1     setosa 0.2012451
    # 2 versicolor 0.3310536
    # 3  virginica 0.1602970
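If do() is not to your taste, more recent dplyr releases also offer group_modify(), which may be worth a try. The following is only a sketch under that assumption (it will not run on older dplyr versions that lack group_modify()); it redefines lm.func with the capitalised iris column names so that it is self-contained, and the names by_group and base_check are made up for illustration. A base R split()/sapply() version is included as a cross-check that needs no packages.

```r
library(dplyr)

# Same helper as in the question: slope of Petal.Width on Petal.Length
lm.func <- function(.data) {
  lm.fit <- lm(Petal.Width ~ Petal.Length, data = .data)
  summary(lm.fit)$coefficients[2, 1]
}

# dplyr: group_modify() calls the function once per group; .x is that
# group's rows (without the grouping column) and the result must be a
# data frame, so the single value is wrapped in data.frame()
by_group <- iris %>%
  group_by(Species) %>%
  group_modify(~ data.frame(coef.val = lm.func(.x))) %>%
  ungroup()

# Base R cross-check: split() cuts iris into one data frame per species,
# sapply() applies lm.func to each piece and returns a named vector
base_check <- sapply(split(iris, iris$Species), lm.func)

print(by_group)
print(base_check)
```

Both approaches should give one slope per species, matching the do() output above. The base R version returns a named numeric vector rather than a two-column dataframe, so it would need a data.frame(Species = names(base_check), coef.val = base_check) step to get the exact shape asked for.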
