Arranging ggplot2 graphs on a page

How to arrange graphs in ggplot2 without the help of the layout matrix

How do you arrange non-symmetric plots in ggplot2?
With the print command:

After installing ggplot2 with install.packages("ggplot2") (grid ships with base R and needs no installation), load the libraries:
library(grid)
library(ggplot2)

The data and code for the three graphs are taken from this website:

# create factors with value labels
mtcars$gear <- factor(mtcars$gear,levels=c(3,4,5), labels=c("3gears","4gears","5gears"))
mtcars$am <- factor(mtcars$am,levels=c(0,1), labels=c("Automatic","Manual"))
mtcars$cyl <- factor(mtcars$cyl,levels=c(4,6,8), labels=c("4cyl","6cyl","8cyl"))

# Kernel density plots for mpg
# grouped by number of gears (indicated by color)
a <- qplot(mpg, data=mtcars, geom="density", fill=gear, alpha=I(.5),
main="Distribution of Gas Mileage", xlab="Miles Per Gallon",
ylab="Density")

# Scatterplot of mpg vs. hp for each combination of gears and cylinders
# in each facet, transmission type is represented by shape and color
b <- qplot(hp, mpg, data=mtcars, shape=am, color=am,
facets=gear~cyl, size=I(3),
xlab="Horsepower", ylab="Miles per Gallon")

c <- qplot(gear, mpg, data=mtcars, geom=c("boxplot", "jitter"),
fill=gear, main="Mileage by Gear Number",
xlab="", ylab="Miles per Gallon")

a, b, and c are our graphs. Here we decide how to place the plots on the plotting surface:

grid.newpage() # Open a new page on grid device
pushViewport(viewport(layout = grid.layout(3, 1))) # 3 rows, 1 column; any grid works, just adjust the print() calls below to match
print(a, vp = viewport(layout.pos.row = 1, layout.pos.col = 1:1))
print(b, vp = viewport(layout.pos.row = 2, layout.pos.col = 1:1))
print(c, vp = viewport(layout.pos.row = 3, layout.pos.col = 1:1))

grid.layout() is the function that divides the plotting surface. In this example I have divided it into three rows and one column, hence layout.pos.row = 1, 2, 3 and the same layout.pos.col = 1:1 for all three plots.
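
A minimal sketch of the same idea, wrapping the viewport call in a small helper so that the row and column of each plot are easier to read. The helper name vplayout and the three placeholder plots are assumptions made for illustration; they are not part of the original code.

```r
library(grid)
library(ggplot2)

# Hypothetical helper: the viewport for one cell (or span of cells) of the layout
vplayout <- function(row, col) {
  viewport(layout.pos.row = row, layout.pos.col = col)
}

# Placeholder plots standing in for a, b and c
p1 <- ggplot(mtcars, aes(mpg)) + geom_density()
p2 <- ggplot(mtcars, aes(hp, mpg)) + geom_point()
p3 <- ggplot(mtcars, aes(factor(gear), mpg)) + geom_boxplot()

grid.newpage()
pushViewport(viewport(layout = grid.layout(3, 1)))  # 3 rows, 1 column
print(p1, vp = vplayout(1, 1))
print(p2, vp = vplayout(2, 1))
print(p3, vp = vplayout(3, 1))
popViewport()  # leave the layout viewport when done
```

Changing the grid then only requires editing grid.layout() and the row/column numbers passed to the helper.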

What if I need something asymmetrical? For instance, two small plots in one column and one plot taking up more space. The reasoning is very similar to that of the layout matrix: divide the space into four cells with grid.layout(2, 2) and then draw the third graph across both rows with layout.pos.row = 1:2.

grid.newpage() # Open a new page on grid device
pushViewport(viewport(layout = grid.layout(2, 2))) # 2 rows, 2 columns; any grid works, just adjust the print() calls below to match
print(a, vp = viewport(layout.pos.row = 1, layout.pos.col = 1:1))
print(b, vp = viewport(layout.pos.row = 2, layout.pos.col = 1:1))
print(c, vp = viewport(layout.pos.row = 1:2, layout.pos.col = 2:2))
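
The cells do not have to be equal in size. As a sketch, assuming you want the left column twice as wide as the right one, grid.layout() also accepts widths (and heights) arguments. The placeholder plots below stand in for a, b and c and are not part of the original code.

```r
library(grid)
library(ggplot2)

# Placeholder plots standing in for a, b and c
p1 <- ggplot(mtcars, aes(mpg)) + geom_density()
p2 <- ggplot(mtcars, aes(hp, mpg)) + geom_point()
p3 <- ggplot(mtcars, aes(factor(gear), mpg)) + geom_boxplot()

# 2 x 2 grid whose first column gets twice the width of the second
lay <- grid.layout(nrow = 2, ncol = 2, widths = unit(c(2, 1), "null"))

grid.newpage()
pushViewport(viewport(layout = lay))
print(p1, vp = viewport(layout.pos.row = 1, layout.pos.col = 1))
print(p2, vp = viewport(layout.pos.row = 2, layout.pos.col = 1))
print(p3, vp = viewport(layout.pos.row = 1:2, layout.pos.col = 2))  # spans both rows
popViewport()
```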


A view of Spanish fertility by age groups (with the help of log scales)

I have been working a lot with the demography library in R. It is a great teaching tool for demography, modeling, life tables, graphic visualization of demographic data, and many other things (see demography).
There are a lot of examples available using data from the Human Fertility and Mortality Database.
Here I am using data that I obtained from Spanish Statistics: a time series of fertility rates by five-year age groups (available for download from here).
It is very nice to plot fertility rates by age group, as one can appreciate the changes in fertility that occurred over time (in terms of quantum) and how much each age group contributes to fertility.


library(demography)
plot(spain,plot.type="time",xlab="Year",lwd=2)
legend("topright",legend=c("15-19","20-24","25-29",
"30-34","35-39","40-44","45-49"),
col=c("red","yellow","lightgreen","green","lightblue",
"blue","violet"),bty="n",lty=1,cex=0.8,lwd=2)

The very same plot can be obtained with the ggplot2 library, given an appropriate theme (see ggplot themes):

ggplot(ddfert, aes(Year, Female, group= Age,col= Age))+
geom_line()+
scale_color_manual(values= c("red", "yellow", "lightgreen", "green","lightblue", "blue", "violet"))+
scale_x_continuous(breaks = c(1975, 1985, 1995, 2005, 2015))+
scale_y_continuous("Fertility Rate")

I often find it interesting to plot on a log scale, so that small values do not get compressed at the bottom of the graph. In this case it is sufficient to add transform=TRUE to the demography code:
plot(spain, plot.type="time", xlab="Year", lwd=2, transform=TRUE)...
and to wrap the variable in log() in ggplot:
ggplot(ddfert, aes(Year, log(Female), group=Age, col=Age))+...
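
An alternative worth knowing: instead of transforming the variable inside aes(), ggplot2 can transform the axis itself with scale_y_log10(), which keeps the tick labels in the original units. Note that this uses base 10 rather than the natural log, so the curves have the same shape but the axis values differ. The data frame below is made up for illustration; it is not ddfert.

```r
library(ggplot2)

# Hypothetical toy data standing in for ddfert
toy <- data.frame(
  Year   = rep(c(1975, 1995, 2015), times = 2),
  Age    = rep(c("15-19", "30-34"), each = 3),
  Female = c(0.020, 0.008, 0.006, 0.090, 0.080, 0.100)
)

# Variable transformed in aes(): the axis shows log values
p_aes <- ggplot(toy, aes(Year, log(Female), group = Age, col = Age)) +
  geom_line()

# Axis transformed by the scale: the axis shows the original rates
p_scale <- ggplot(toy, aes(Year, Female, group = Age, col = Age)) +
  geom_line() +
  scale_y_log10("Fertility Rate")
```

Which of the two to use is mostly a matter of how you want the axis to read.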


The gap between desired and observed fertility in Europe. Part 2: Childlessness levels.

Involuntary childlessness has gained momentum in mainstream media, which attribute a large part (if not the totality) of the blame to the postponement of childbearing: women wait too long to have children, they don’t hear their biological clock ticking and bam! no children. Ever.

Delaying childbearing to later ages undoubtedly has repercussions on the biological ability to have children, but it is hardly a simple component of the total effect. What the mainstream discussion often misses is that the great majority of children are conceived in unions, hence having children is a couple’s decision. Indeed, being single is an important, if not pivotal, deterrent to motherhood, which is usually delayed until union formation.

This is why it is important to consider factors such as union dissolution risk to appreciate the variation in involuntary childlessness. To better understand the effect of postponement we tried to measure it by calculating the effect of time spent on contraception while in a union by women who want to have children, a ‘conscious’ way to postpone childbearing.

This is a preview of average population childlessness obtained through simulation using 3 variables: celibacy (% of women ending up single and never entering a union), divorce (% of women previously in a union but currently without a partner), and waiting time (the average time spent on contraception at the beginning of a union by a woman who wishes to have children).

ggplot(dt, aes(Age, value, linetype=Variable, col=Variable))+
geom_line(size=1)+
scale_color_manual(values=c("black", "#666666", "grey", "black", "#666666", "grey"), guide=guide_legend(nrow=3, byrow=F, title="Childlessness"))+
xlab("")+
ylab("")+
scale_linetype_manual(values=c("solid", "solid", "solid", "twodash", "dotted", "dashed"), guide=guide_legend(nrow=3, byrow=F, title="Childlessness"))+
theme(plot.margin=unit(c(1,4,1,1), "cm"), legend.position="bottom", legend.direction="vertical")

1. ggplot(dt, aes(Age, value, linetype=Variable, col=Variable))

Setting linetype=Variable and col=Variable inside aes tells ggplot to draw one line per level of Variable, each with its own line type and color;

2. scale_color_manual sets the line colors to those listed in values. I was not satisfied with what I got from scale_color_grey, so I set my colors manually (_manual!);

3. Since I want the legend at the bottom AND in two columns (that is, 3 rows) AND I have two aesthetics mapped in aes, I need to add guide=guide_legend(nrow=3) to each scale_*_manual (that is to say, to both scale_color_manual AND scale_linetype_manual);

4. In guide=guide_legend, byrow=F means that I want the legend entries filled column by column rather than row by row;

5. In theme, legend.position="bottom" tells ggplot to put the legend below the graph and legend.direction="vertical" to lay it out vertically (which I then split into 3 rows).
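
The points above can be tried out on a self-contained example. This is only a sketch: the data frame toy is invented (six series at three ages), and the colors and line types simply reuse the values from the code above.

```r
library(ggplot2)

# Hypothetical toy data: six series observed at three ages
toy <- expand.grid(Age = c(20, 30, 40), Variable = paste("series", 1:6))
toy$value <- seq_len(nrow(toy))

p_legend <- ggplot(toy, aes(Age, value, linetype = Variable, col = Variable)) +
  geom_line() +
  scale_color_manual(
    values = c("black", "#666666", "grey", "black", "#666666", "grey"),
    guide = guide_legend(nrow = 3, byrow = FALSE, title = "Childlessness")
  ) +
  scale_linetype_manual(
    values = c("solid", "solid", "solid", "twodash", "dotted", "dashed"),
    guide = guide_legend(nrow = 3, byrow = FALSE, title = "Childlessness")
  ) +
  theme(legend.position = "bottom", legend.direction = "vertical")
```

Because both scales share the same title and the same guide settings, ggplot2 merges them into a single legend; giving them different titles would produce two separate legends.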

1887 crude mortality rate in Spain using classInt package

Crude Mortality Rate in Spain, 1887 Census (maps drawn with the jenks, quantile, bclust and fisher classification styles)

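library(classInt)      # classIntervals(), findColours()
library(RColorBrewer)  # brewer.pal()

nclassint <- 5 # number of colors to be used in the palette
cat <- classIntervals(dt$TBM, nclassint, style = "jenks")
colpal <- brewer.pal(nclassint, "Reds") # sequential palette
color <- findColours(cat, colpal) # one color per observation
bins <- cat$brks # class break points
lb <- length(bins)
cat # print the number of observations in each class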

style: jenks
[20.3,25.9] (25.9,30.5] (30.5,34.4] (34.4,38.4] (38.4,58.2]
68         114         130         115          35

Save the categories into a data.frame (dat)

      type first second third fourth fifth
1 quantile    91     93    92     91    95
2       sd    10    202   244      5     0
3    equal   100    246   113      2     1
4   kmeans    68    115   142    118    19
5    jenks    68    114   130    115    35
6   hclust   100    174   153     34     1
7   bclust    53    120   275     13     1
8   fisher    68    114   130    115    35

and melt it into a long format (required by ggplot):

library(reshape2) # melt()
library(ggplot2)

dat1 <- melt(dat,id.vars=c("type"),value.name="n.breaks")

ggplot(dat1,aes(x=variable,y=n.breaks,fill=type))+
geom_bar(stat="identity", position=position_dodge())
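
The code that fills dat is not shown above. One possible way to build such a table, sketched here with simulated values in place of dt$TBM and only three of the styles, is to loop over the styles and count the observations falling in each class. The helper name count_per_class and the simulated data are assumptions.

```r
library(classInt)

set.seed(1)
x <- rexp(200, rate = 1 / 30)  # hypothetical stand-in for dt$TBM
n_class <- 5
styles <- c("quantile", "equal", "jenks")

# Number of observations in each class for one classification style
count_per_class <- function(style) {
  ci <- classIntervals(x, n = n_class, style = style)
  tabulate(findCols(ci), nbins = n_class)  # findCols() gives the class index of each value
}

counts <- t(sapply(styles, count_per_class))  # one row per style
dat_sketch <- data.frame(type = styles, counts, row.names = NULL)
names(dat_sketch)[-1] <- c("first", "second", "third", "fourth", "fifth")
dat_sketch
```

Each row should sum to the number of observations, which is a quick check that no value was left unclassified.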

Quick way to add annotations to your ggplot graphs

Lately, some of the graphs I have been working on have “strange/erratic” values, so I thought to plot those values in a different color and rather than adding an extra legend line I have decided to add a note to explain the difference. Among the many options available, I have found a very quick and harmless way to add annotations to ggplot graphs. It uses the library “gridExtra”, which employs user-level functions that work with “grid” graphics and draw tables.

1. Load the ggplot2 library:
library(ggplot2)
2. Save your graph as “my_graph”:
my_graph <- qplot(wt, mpg, data = mtcars)
3. Load gridExtra (and grid, which provides textGrob and gpar) and add the text to the graph. Note that x, hjust and vjust give the position of the text in the outer margin. If you want to annotate INSIDE the graph, use annotate. In gridExtra 2.0.0 and later the argument is called bottom; older versions called it sub:
library(grid)
library(gridExtra)
g <- arrangeGrob(my_graph, bottom = textGrob("I pledge my life and honor to the Night's Watch, \nfor this night and all the nights to come.", x=0, hjust=-0.1, vjust=0.1,gp = gpar(fontface = "italic", fontsize = 10)))

4. Save the graph:
ggsave("my_graph_with_note.pdf", g, width=5,height=5)
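
For a note inside the plotting area, annotate() places text at data coordinates. A minimal sketch, in which the position and wording of the note are arbitrary choices:

```r
library(ggplot2)

inside_note <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  # x and y are in data units, so the text sits inside the panel
  annotate("text", x = 4.5, y = 30, label = "Note placed inside the plot",
           fontface = "italic", size = 3.5)
```

Unlike the margin note, this text moves with the axes, so it needs repositioning if the data range changes.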

Here is the graph I have been working on:

MORAN_I_MAC

Valar Morghulis: Some charts using GOT (tv-show) deaths

Drawing from one of the most important demographic laws, Valar Morghulis (all men must die), here is a simple summary of the deadly happenings in four seasons of GOT as reported by the Washington Post.

Let’s start with the total number of (portrayed) deaths by season:

# df1: data frame with columns Series and Total (its definition is missing from the source)
ggplot(df1,aes(x=factor(Series),y=Total))+
geom_bar(stat="identity",fill=c("yellow","orange","red","brown"))+
xlab("Season number")+
ylab("Total number of deaths")

Number of deaths by season

ggplot(df1,aes(x=Series,y=Total))+
geom_line(lwd=2)+
xlab("Season number")+
ylab("Total number of deaths")

Number of deaths by season

by location in Westeros:

# df2: data frame with columns Location and Deaths (its full definition is missing from the source)
Location=c("King's Landing","Beyond the Wall","Castle Black","The Twins","The Riverlands")
ggplot(df2,aes(x=factor(Location),y=Deaths))+
geom_bar(stat="identity",fill=c("lightblue","black","brown","darkseagreen","red"))+
ylab("Total number of deaths")+
xlab("")+
theme(axis.text=element_text(size=15))

Number of deaths by location

by method of death:
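# df3, df3.1 and df3.2: definitions are missing from the source; df3.2 is the long-format table with columns Method, variable and value
Method=c("Animal","Animal Death","Arrows","Axe","Blade","Bludgeon","Crushing","Falling","Fire","Hands","HH item","Mace","Magic","Other","Poison","Spear","Unknown")
ggplot(df3.2,aes(x=factor(Method),y=value,fill=variable))+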
geom_bar(stat="identity")+
ylab("")+
xlab("")+
theme(axis.text.x=element_text(size=15,angle=45))+
scale_fill_discrete(name ="Method of Death", labels=c("Season 1", "Season 2", "Season 3", "Season 4"))

Number of deaths by method
and lastly by House allegiance:
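# df4, df4.1 and df4.2: definitions are missing from the source; df4.2 is the long-format table with columns House, variable and value
ggplot(df4.2,aes(x=reorder(factor(House),value),y=value,fill=variable))+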
geom_bar(stat="identity")+
ylab("")+
xlab("")+
theme(axis.text.x=element_text(size=15,color="black"),
axis.text.y=element_text(size=15,color="black"))+
scale_fill_discrete(name ="House Allegiance", labels=c("Season 1", "Season 2", "Season 3", "Season 4"))+
coord_flip()

Number of deaths by house

Pyramid-like bar chart for climate change barriers

I was scrolling through the Independent and got hooked on a graph displaying the percentage of people’s concerns regarding climate change by country, and was extremely surprised by the results. UK and US lag far behind countries including China in wanting their governments to pursue a meaningful commitment to successfully address climate change.

library(ggplot2)
library(grid)
library(plyr)

dta<-
structure(list(country = structure(c(15L, 3L, 4L, 5L, 14L, 6L,
10L, 12L, 1L, 2L, 7L, 8L, 9L, 11L, 13L, 15L, 3L, 4L, 5L, 14L,
6L, 10L, 12L, 1L, 2L, 7L, 8L, 9L, 11L, 13L), .Label = c("Australia",
"China", "Denmark", "Finland", "France", "Germany", "Hong Kong",
"Indonesia", "Malaysia", "Norway", "Singapore", "Sweden", "Thailand",
"UK", "US"), class = "factor"), issue = c("Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage who think climate \nchange is 'not a serious problem' ",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change",
"Percentage that want their country's strategy not to agree \nto any international agreement that addresses climate change"
), perc = c(32L, 14L, 23L, 10L, 26L, 11L, 22L, 18L, 11L, 4L,
5L, 3L, 2L, 5L, 6L, -17L, -4L, -8L, -3L, -7L, -4L, -10L, -8L,
-3L, -1L, -1L, -1L, -1L, -1L, -1L)), .Names = c("country", "issue",
"perc"), row.names = c(NA, -30L), class = "data.frame")

# Labels of the two issues (rows 1-15 and rows 16-30 of dta)
serious <- dta$issue[1]
no_deal <- dta$issue[16]

p <- ggplot(dta, aes(reorder(country,perc),perc,fill=issue)) +
# subset = .() was removed in ggplot2 2.0.0; pass the filtered data to each layer instead
geom_bar(data = subset(dta, issue == serious), stat = "identity",colour="black",alpha=0.5) +
annotate("text",x = 16.5, y = -12,label=no_deal, fontface="bold")+
geom_bar(data = subset(dta, issue == no_deal),colour="black", stat = "identity",alpha=0.5) +
annotate("text",x = 16.5, y = 15,label=serious, fontface="bold")+
scale_fill_manual(values = c("#F7320B", "#2BC931"))+
# perc.a and perc.b are not columns of dta as defined above; abs(perc) is assumed here
geom_text(data = subset(dta, issue == serious),
aes(label=abs(perc)), hjust=-.35)+
geom_text(data = subset(dta, issue == no_deal),colour="black",aes(label=abs(perc)), hjust=2)+
coord_flip() +
xlab("")+
ylab("")+
scale_x_discrete(expand=c(0.2,0.55))+
scale_y_continuous(limits=c(-22,32),
breaks = c(-17,-10,0,10,32),
# "%" belongs to paste0(), not to as.character()
labels = paste0(c(17,10,0,10,32), "%"))+
theme(axis.text.y  = element_text(size=13,hjust=1),
axis.text = element_text(colour = "black"),
plot.background = element_blank(),
panel.background = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
axis.ticks = element_blank(),
axis.text.x = element_blank(),
legend.background =element_rect("white"),
legend.position="none",
strip.background = element_rect(fill = "white", colour = "white"),
strip.text.x = element_text(size = 13))

ggsave("newplot.pdf",p,scale=2)
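
Stripped of its styling, the chart relies on one trick: one series is stored with negative values so that its bars extend to the left, and the axis labels show absolute values. A minimal sketch with made-up numbers (the data frame toy and the helper pct_labels are hypothetical):

```r
library(ggplot2)

toy <- data.frame(
  country = rep(c("A", "B", "C"), times = 2),
  issue   = rep(c("left", "right"), each = 3),
  perc    = c(-17, -4, -8, 32, 14, 23)  # left-hand series stored as negatives
)

# Show magnitudes on the axis, so that -17 is labelled "17%"
pct_labels <- function(x) paste0(abs(x), "%")

p_sketch <- ggplot(toy, aes(country, perc, fill = issue)) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = pct_labels) +
  coord_flip()
```

Passing a function to labels means the percentages stay correct whatever breaks ggplot2 picks, instead of having to keep a hand-written label vector in sync with the breaks.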