A view of Spanish fertility by age groups (with the help of log scales)

I have been working a lot with the demography library in R. It is a great teaching tool for demography, modeling, life tables, the visualization of demographic data, and many other things (see demography).
There are plenty of examples available using data from the Human Fertility and Human Mortality Databases.
Here I am using data that I obtained from Spanish Statistics: a time series of fertility rates by 5-year age group (available for download here).
It is very nice to plot fertility rates by age group, as one can appreciate the changes in fertility that occurred over time (in terms of quantum) and how much each age group contributes to fertility. In the case of Spain,.


library(demography)
plot(spain,plot.type="time",xlab="Year",lwd=2)
legend("topright",legend=c("15-19","20-24","25-29",
"30-34","35-39","40-44","45-49"),
col=c("red","yellow","lightgreen","green","lightblue",
"blue","violet"),bty="n",lty=1,cex=0.8,lwd=2)

Spanish fertility rates by 5-year age group (demography plot)

The very same plot can be obtained with the ggplot2 library, given an appropriate theme (see ggplot themes):

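library(ggplot2)
# ddfert: long data frame with columns Year, Age and Female (the fertility rate)
ggplot(ddfert, aes(Year, Female, group = Age, col = Age)) +
  geom_line() +
  scale_color_manual(values = c("red", "yellow", "lightgreen", "green", "lightblue", "blue", "violet")) +
  # set the tick positions with breaks; labels alone fails unless it matches the default breaks
  scale_x_continuous(breaks = c(1975, 1985, 1995, 2005, 2015)) +
  scale_y_continuous("Fertility Rate")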

Spanish fertility rates by 5-year age group (ggplot2 plot)

I often find it interesting to plot on a log scale, so that small values don’t get compressed at the bottom of the graph. In this case it is sufficient to add transform=TRUE to the demography code:
plot(spain, plot.type="time", xlab="Year", lwd=2, transform=TRUE)...
and to take the log of the rate in ggplot:
ggplot(ddfert, aes(Year, log(Female), group=Age, col=Age))+...

Spanish fertility rates by 5-year age group, log scale (ggplot2 plot)
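An alternative to wrapping the variable in log() is to leave the data alone and transform the axis with scale_y_log10(), so that the tick labels stay in the original rate units. Below is a minimal sketch of that idea. The data frame toy is a made-up stand-in shaped like ddfert (its numbers are placeholders, not the Spanish series), and note that scale_y_log10() uses base 10 while log() is the natural log, which changes the labels but not the shape of the curves.

```r
# Toy stand-in for ddfert: made-up rates, NOT the Spanish series
toy <- data.frame(
  Year   = rep(c(1975, 1995, 2015), times = 2),
  Age    = rep(c("15-19", "30-34"), each = 3),
  Female = c(0.020, 0.008, 0.007, 0.100, 0.080, 0.095)
)

# Taking logs by hand changes the values that end up on the axis
toy$logFemale <- log(toy$Female)

# Plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # scale_y_log10() transforms the axis only; labels stay in rate units
  p <- ggplot(toy, aes(Year, Female, group = Age, col = Age)) +
    geom_line() +
    scale_y_log10("Fertility rate (log scale)")
  print(p)
}
```

With log(Female) the axis shows negative numbers (rates below 1 have negative logs), whereas the scale version keeps readable rates on the ticks.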


The gap between desired and observed fertility in Europe. Part 2: Childlessness levels.

Involuntary childlessness has gained attention in mainstream media, which attribute a large part (if not all) of the blame to the postponement of childbearing: women wait too long to have children, they don’t hear their biological clock ticking and bam! no children. Ever.

Delaying childbearing to later ages undoubtedly has repercussions on the biological ability to have children, but it is hardly a simple component of the total effect. What the mainstream discussion often misses is that the great majority of children are conceived in unions, hence having children is a couple’s decision. Indeed, being single is an important if not pivotal deterrent to motherhood, which is usually delayed until union formation.

This is why it is important to consider factors such as union dissolution risk to appreciate the variation in involuntary childlessness. To better understand the effect of postponement we tried to measure it by calculating the effect of time spent on contraception while in a union by women who want to have children, a ‘conscious’ way to postpone childbearing.

This is a preview of average population childlessness obtained through simulation using 3 variables: celibacy (% of women who end up single and never enter a union), divorce (% of women previously in a union but currently without a partner), and waiting time (the average time spent on contraception at the beginning of a union by a woman who wishes to have children).

Average population childlessness by age (simulation)

library(ggplot2)
ggplot(dt, aes(Age, value, linetype = Variable, col = Variable)) +
  geom_line(size = 1) +
  scale_color_manual(values = c("black", "#666666", "grey", "black", "#666666", "grey"), guide = guide_legend(nrow = 3, byrow = FALSE, title = "Childlessness")) +
  xlab("") +
  ylab("") +
  scale_linetype_manual(values = c("solid", "solid", "solid", "twodash", "dotted", "dashed"), guide = guide_legend(nrow = 3, byrow = FALSE, title = "Childlessness")) +
  theme(plot.margin = unit(c(1, 4, 1, 1), "cm"), legend.position = "bottom", legend.direction = "vertical")

1. ggplot(dt, aes( Age, value, linetype= Variable, col=Variable))

linetype=Variable and col=Variable, set inside aes, tell ggplot to draw one line per value of Variable, each with its own line type and color;

2. scale_color_manual sets the line colors to those listed in values. I was not satisfied with what I got with scale_color_grey so I set my colors manually (_manual!);

3. since I want the legend at the bottom AND in two columns (or 3 rows) AND I have two features specified in the aes, I need to add a guide=guide_legend(nrow=3) to each scale_blablabla_manual (that is to say scale_color_manual AND scale_linetype_manual);

4. in guide=guide_legend, byrow=F means that I do not want the legend entries filled row by row, but rather column by column;

5. in theme, legend.position="bottom" tells ggplot to put the legend below the graph, and legend.direction="vertical" to lay it out vertically (which I divide into 3 rows).
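The legend logic in points 3 to 5 can be tried out on its own. Here is a small self-contained sketch, assuming six series as in the plot above. The data frame dt_toy and its values are invented for illustration and are not the simulation results. Because both scales receive the same title and the same guide settings, ggplot2 merges color and line type into a single legend.

```r
# Toy data: six made-up series, NOT the simulation results
dt_toy <- expand.grid(Age = c(20, 30, 40), Variable = paste("Series", 1:6))
dt_toy$value <- seq_len(nrow(dt_toy)) / nrow(dt_toy)

# One color and one line type per series (same values as in the post)
greys  <- c("black", "#666666", "grey", "black", "#666666", "grey")
ltypes <- c("solid", "solid", "solid", "twodash", "dotted", "dashed")

# Plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Identical guide on both scales: 3 rows, filled column by column
  lg <- guide_legend(nrow = 3, byrow = FALSE, title = "Childlessness")
  p <- ggplot(dt_toy, aes(Age, value, linetype = Variable, col = Variable)) +
    geom_line() +
    scale_color_manual(values = greys, guide = lg) +
    scale_linetype_manual(values = ltypes, guide = lg) +
    theme(legend.position = "bottom", legend.direction = "vertical")
  print(p)
}
```

If the two scales were given different titles, two separate legends would appear instead of one, which is why the title is repeated in both calls.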

Valar Morghulis: Some charts using GOT (tv-show) deaths

Drawing from one of the most important demographic laws, Valar Morghulis (all men must die), here is a simple summary of the deadly happenings in four seasons of GOT as reported by the Washington Post.

Let’s start with the total number of (portrayed) deaths by season:

# df1: data frame with columns Series (season number) and Total (number of deaths)
ggplot(df1,aes(x=factor(Series),y=Total))+
geom_bar(stat="identity",fill=c("yellow","orange","red","brown"))+
xlab("Season number")+
ylab("Total number of deaths")

Number of deaths by season

ggplot(df1,aes(x=Series,y=Total))+
geom_line(lwd=2)+
xlab("Season number")+
ylab("Total number of deaths")

Number of deaths by season

by location in Westeros:

# df2: data frame with columns Location (the vector below) and Deaths
Location <- c("King's Landing","Beyond the Wall","Castle Black","The Twins","The Riverlands")
ggplot(df2,aes(x=factor(Location),y=Deaths))+
geom_bar(stat="identity",fill=c("lightblue","black","brown","darkseagreen","red"))+
ylab("Total number of deaths")+
xlab("")+
theme(axis.text=element_text(size=15))

Number of deaths by location

by method of death:
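# df3: data frame built on the Method vector below
Method <- c("Animal","Animal Death","Arrows","Axe","Blade","Bludgeon","Crushing","Falling","Fire","Hands","HH item","Mace","Magic","Other","Poison","Spear","Unknown")
# df3.2: long-format data with columns Method, variable (season) and value (number of deaths)
ggplot(df3.2,aes(x=factor(Method),y=value,fill=variable))+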
geom_bar(stat="identity")+
ylab("")+
xlab("")+
theme(axis.text.x=element_text(size=15,angle=45))+
scale_fill_discrete(name ="Method of Death", labels=c("Season 1", "Season 2", "Season 3", "Season 4"))

Number of deaths by method
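The stacked bars need the data in long format, with one row per method and season. The column names variable and value suggest a reshape2::melt step, but that is my assumption. Here is a rough base R sketch of the same reshaping using stack(), which avoids extra packages. The table wide and its counts are invented placeholders, not the Washington Post figures.

```r
# Toy wide table: made-up counts, NOT the Washington Post figures
wide <- data.frame(
  Method = c("Arrows", "Blade", "Fire"),
  S1 = c(1, 5, 0),
  S2 = c(2, 4, 1),
  stringsAsFactors = FALSE
)

# stack() puts the season columns on top of each other:
# 'values' holds the counts, 'ind' the name of the source column
long <- data.frame(
  Method = rep(wide$Method, times = 2),
  stack(wide[, c("S1", "S2")])
)

# Rename to the column names used in the plotting code
names(long) <- c("Method", "value", "variable")
long
```

The resulting long has one row per method and season, which is the shape that aes(x=Method, y=value, fill=variable) expects.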

and lastly by House allegiance:

# df4.2: long-format data with columns House, variable (season) and value (number of deaths)
ggplot(df4.2,aes(x=reorder(factor(House),value),y=value,fill=variable))+
geom_bar(stat="identity")+
ylab("")+
xlab("")+
theme(axis.text.x=element_text(size=15,color="black"),
axis.text.y=element_text(size=15,color="black"))+
scale_fill_discrete(name ="House Allegiance", labels=c("Season 1", "Season 2", "Season 3", "Season 4"))+
coord_flip()

Number of deaths by house
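The bars in this last chart are sorted thanks to reorder(), which keeps the labels but rearranges the factor levels according to a second variable. Here is a tiny sketch with invented numbers (the houses "A", "B", "C" and their totals are placeholders, not the show's counts):

```r
# Toy totals: made-up numbers, NOT the show's death counts
house <- factor(c("A", "B", "C"))
total <- c(30, 5, 12)

# reorder() sorts the levels by the second argument (ascending)
sorted <- reorder(house, total)
levels(sorted)  # "B" "C" "A"

# With several rows per group, reorder() uses the mean by default;
# pass FUN = sum to sort by the group total instead
house_long <- factor(c("A", "A", "B", "B"))
value_long <- c(1, 9, 4, 4)
by_sum <- reorder(house_long, value_long, FUN = sum)
levels(by_sum)  # "B" "A"
```

When every group has the same number of rows, as with one row per season for each house, sorting by mean and sorting by sum give the same order.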