R - dplyr arrange() function: sort by missing values
I am attempting to work through Hadley Wickham's R for Data Science and have gotten tripped up on the following question: "How could you use arrange() to sort all missing values to the start? (Hint: use is.na())" I am using the flights dataset included in the nycflights13 package. Given that arrange() sorts all unknown values to the bottom of the dataframe, I am not sure how one would do the opposite across the missing values of all variables. I realize this question can be answered with base R code, but I am specifically interested in how this would be done using dplyr and a call to the arrange() and is.na() functions. Thanks.
We can wrap is.na() with desc to get the missing values at the start:
library(dplyr)
library(nycflights13)

# Rows with NA in each listed column are sorted to the top
flights %>%
  arrange(desc(is.na(dep_time)),
          desc(is.na(dep_delay)),
          desc(is.na(arr_time)),
          desc(is.na(arr_delay)),
          desc(is.na(tailnum)),
          desc(is.na(air_time)))
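To see why this works, here is a minimal sketch on a made-up tibble (the `toy` data and its values are hypothetical, not taken from flights). is.na() returns FALSE/TRUE, and FALSE sorts before TRUE, so desc() moves the TRUE (missing) rows to the top. Negating the logical should give the same order.

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
toy <- tibble(id = 1:5, dep_time = c(517L, NA, 533L, NA, 542L))

# is.na() gives FALSE/TRUE; desc() puts the TRUE (missing) rows first
na_first <- toy %>% arrange(desc(is.na(dep_time)))

# Alternative spelling: !is.na() is FALSE for missing rows, and FALSE sorts first
na_first_alt <- toy %>% arrange(!is.na(dep_time))
```

Rows that tie on the sort key keep their original relative order, so the non-missing rows are not reshuffled.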
The variables that contain NA values were found with:
names(flights)[colSums(is.na(flights)) > 0]
#[1] "dep_time"  "dep_delay" "arr_time"  "arr_delay" "tailnum"   "air_time"
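As a small illustration of how that column check works, the sketch below runs the same idea on a made-up data frame (`toy`, `na_counts`, and `na_cols` are hypothetical names chosen for this example). is.na() on a data frame yields a logical matrix, and colSums() counts the TRUE entries per column.

```r
# Hypothetical toy data frame, used only to illustrate the column check
toy <- data.frame(
  id = 1:3,
  dep_time = c(517L, NA, 533L),
  carrier = c("UA", "AA", "B6"),
  tailnum = c(NA, "N24211", NA)
)

# is.na(toy) is a logical matrix; colSums() counts the NAs in each column
na_counts <- colSums(is.na(toy))

# Keep only the names of columns with at least one NA
na_cols <- names(toy)[na_counts > 0]
```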
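Instead of passing each variable name one at a time, we can build the expressions as strings and use the standard-evaluation variant arrange_ (deprecated in later dplyr releases):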
# Build one "desc(is.na(col))" string per column that contains NAs
nm1 <- paste0("desc(is.na(", names(flights)[colSums(is.na(flights)) > 0], "))")
r1 <- flights %>% arrange_(.dots = nm1)
r1 %>% head()
#   year month   day dep_time sched_dep_time dep_delay arr_time sched_arr_time arr_delay carrier flight tailnum
#  <int> <int> <int>    <int>          <int>     <dbl>    <int>          <int>     <dbl>   <chr>  <int>   <chr>
#1  2013     1     2       NA           1545        NA       NA           1910        NA      AA    133    <NA>
#2  2013     1     2       NA           1601        NA       NA           1735        NA      UA    623    <NA>
#3  2013     1     3       NA            857        NA       NA           1209        NA      UA    714    <NA>
#4  2013     1     3       NA            645        NA       NA            952        NA      UA    719    <NA>
#5  2013     1     4       NA            845        NA       NA           1015        NA      9E   3405    <NA>
#6  2013     1     4       NA           1830        NA       NA           2044        NA      9E   3716    <NA>
#variables not shown: origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>, minute <dbl>,
#  time_hour <time>.
Update
With newer versions of the tidyverse (dplyr_0.7.3, rlang_0.1.2), we can also make use of arrange_at, arrange_all, and arrange_if:
# Names of the columns that contain at least one NA
nm1 <- names(flights)[colSums(is.na(flights)) > 0]
r2 <- flights %>% arrange_at(vars(nm1), funs(desc(is.na(.))))
Or use arrange_if:
# Predicate: TRUE for columns that contain at least one NA
f <- rlang::as_function(~ any(is.na(.)))
r3 <- flights %>% arrange_if(f, funs(desc(is.na(.))))
identical(r1, r2)
#[1] TRUE
identical(r1, r3)
#[1] TRUE
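In later dplyr releases funs() is deprecated and the scoped verbs such as arrange_at and arrange_if are superseded, so the same idea might be written with across(). The following is only a sketch, assuming dplyr 1.0.0 or later, and it uses a made-up tibble (`toy` is hypothetical) rather than flights; it is not part of the original answer.

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
toy <- tibble(
  id = 1:4,
  dep_time = c(517L, NA, 533L, NA),
  tailnum = c("N14228", "N24211", NA, NA)
)

# Select every column containing an NA, then sort by desc(is.na()) of each,
# in column order (dep_time first, then tailnum)
na_first <- toy %>%
  arrange(across(where(~ any(is.na(.x))), ~ desc(is.na(.x))))
```

Here the row missing both dep_time and tailnum comes first, followed by the row missing only dep_time, then the row missing only tailnum, then the complete row.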