r - When using ggplot2, can I set the color of histogram bars without potentially obscuring low values?


When calling geom_histogram() with the color and fill arguments, ggplot2 confusingly paints the whole x-axis range, making it impossible to visually distinguish between a low value and a zero value.

Running the following code:

ggplot(esubset, aes(x=exectime)) + geom_histogram(binwidth = 0.5) + theme_bw() + scale_x_continuous(breaks=seq(0,20), limits=c(0,20)) 

will result in

(image: a histogram without color attributes)

This is visually unappealing. To fix that, I'd instead use

ggplot(esubset, aes(x=exectime)) + geom_histogram(binwidth = 0.5, colour='black', fill='gray') + theme_bw() + scale_x_continuous(breaks=seq(0,20), limits=c(0,20)) 

which will result in

(image: a histogram with color attributes)

The problem is that I'll have no way of distinguishing whether exectime contains values past 10, since a few occurrences of 12, for example, would be hidden behind the horizontal line spanning the whole x-axis.

Use coord_cartesian instead of scale_x_continuous to set the limits. coord_cartesian sets the axis range without affecting how the data is plotted. Even with coord_cartesian, you can still use scale_x_continuous to set the breaks, but coord_cartesian will override any effect of scale_x_continuous on how the data is plotted.
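The difference can also be checked numerically. The following is only a minimal sketch, assuming ggplot2's layer_data() helper and a hypothetical toy data frame (toy, with one value far outside the axis range) that is not part of the original question. It compares how many observations end up in the histogram bins under each approach.

```r
library(ggplot2)

# Hypothetical toy data: three small values plus one far outside the axis range.
toy <- data.frame(x = c(1, 2, 3, 50))

# Limits set on the scale: out-of-range values are censored before binning.
p_scale <- ggplot(toy, aes(x)) +
  geom_histogram(binwidth = 1) +
  scale_x_continuous(limits = c(0, 10))

# Limits set on the coordinate system: all values are binned, then the view is zoomed.
p_coord <- ggplot(toy, aes(x)) +
  geom_histogram(binwidth = 1) +
  coord_cartesian(xlim = c(0, 10))

# layer_data() returns the per-bin data ggplot2 computed for the layer.
# suppressWarnings() hides the "removed rows" warning from the scale version.
n_scale <- sum(suppressWarnings(layer_data(p_scale))$count)
n_coord <- sum(layer_data(p_coord)$count)

cat("observations binned with scale limits:   ", n_scale, "\n")
cat("observations binned with coord_cartesian:", n_coord, "\n")
```

With the scale limits, the total should come out one lower, because the value at 50 is dropped before the bins are computed. coord_cartesian keeps it and only crops the view.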

In the fake data below, note that I've added data for a few small bars.

set.seed(4958)
dat = data.frame(value=c(rnorm(5000, 10, 1), rep(15:20,1:6)))

ggplot(dat, aes(value)) +
  geom_histogram(binwidth=0.5, color="black", fill="grey") +
  theme_bw() +
  scale_x_continuous(limits=c(5,25), breaks=5:25) +
  ggtitle("scale_x_continuous")

ggplot(dat, aes(value)) +
  geom_histogram(binwidth=0.5, color="black", fill="grey") +
  theme_bw() +
  coord_cartesian(xlim=c(5,25)) +
  scale_x_continuous(breaks=5:25) +
  ggtitle("coord_cartesian")


As you can see in the plots above, if there are bins with count=0 within the data range, ggplot will add a zero-line, even with coord_cartesian. This makes it difficult to see the bar at 15 with height=1. You can make the border thinner with the lwd argument ("linewidth") so that the smaller bars will be less obscured:

ggplot(dat, aes(value)) +
  geom_histogram(binwidth=0.5, color="black", fill="grey", lwd=0.3) +
  theme_bw() +
  coord_cartesian(xlim=c(5,25)) +
  scale_x_continuous(breaks=5:25) +
  ggtitle("coord_cartesian")
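If the plot alone still leaves doubt about whether a bin is empty or just very low, the bin counts can be read off directly. This is a rough sketch, assuming ggplot2's layer_data() helper. The cut-off of 10 for a "small" bar is an arbitrary, hypothetical threshold chosen for illustration.

```r
library(ggplot2)

# Same fake data as in the answer: a large normal sample plus a few rare values at 15 to 20.
set.seed(4958)
dat <- data.frame(value = c(rnorm(5000, 10, 1), rep(15:20, 1:6)))

p <- ggplot(dat, aes(value)) +
  geom_histogram(binwidth = 0.5) +
  coord_cartesian(xlim = c(5, 25))

# One row per bin, including bins with count 0.
bins <- layer_data(p)

# Keep only non-empty bins with a low count (threshold of 10 is an assumption).
small <- bins[bins$count > 0 & bins$count <= 10, c("xmin", "xmax", "count")]
print(small)
```

Any bin listed here holds at least one observation, so it is a genuinely low bar and not part of the zero-line.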

enter image description here

One other option is to pre-summarise the data and plot using geom_bar in order to get spaces between the bars and thereby avoid the need for border lines to mark the bar edges:

library(ggplot2)
library(dplyr)
library(tidyr)
library(zoo)

# Bin edges at 0.5 intervals, padded beyond the data range
bins = seq(floor(min(dat$value)) - 1.75, ceiling(max(dat$value)) + 1.25, 0.5)

dat.binned = dat %>%
  count(bin=cut(value, bins, right=FALSE)) %>%   # bin the data
  complete(bin, fill=list(n=0)) %>%              # restore empty bins and fill with zeros
  mutate(bin = rollmean(bins,2)[-length(bins)])  # convert bin from factor to numeric value = mean of bin range

ggplot(dat.binned, aes(bin, n)) +
  geom_bar(stat="identity", fill=hcl(240,100,30)) +
  theme_bw() +
  scale_x_continuous(breaks=0:21)


