Tag Archives: ggplot2

butteRfly

Float like a butterfly, sting like a bee (Muhammad Ali)

The Butterfly Curve was discovered by Temple H. Fay when he was in Southern University, Mississippi, and rapidly gained the attention of students and mathematicians because of its beautiful simmetry. Small dots of this plot are generated according to parametric equations of the Butterfly Curve. Big dots are randomdly distributed over the canvas:

butterfly4

This is the code to create butterflies:

library(ggplot2)
npoints=500
npointsb=1200
opt=theme(legend.position="none",
panel.background = element_blank(),
panel.grid = element_blank(),
axis.ticks=element_blank(),
axis.title=element_blank(),
axis.text =element_blank())
t=seq(0,10*pi,length=npointsb)
butterfly=data.frame(x=sin(t)*(exp(1)^cos(t)-2*cos(4*t)-(sin(t/12))^5), y=cos(t)*(exp(1)^cos(t)-2*cos(4*t)-(sin(t/12))^5), s=runif(npointsb, min=.1, max=10), f=factor(sample(1:10,npointsb,TRUE)), a=runif(npointsb,min=.1, max=.4))
points=data.frame(x=runif(npoints,-4,4), y=runif(npoints,-3,5), s=runif(npoints,min=30, max=50), f=factor(sample(1:10,npoints,TRUE)), a=runif(npoints,min=.05, max=.15))
data=rbind(butterfly, points)
ggplot(data, aes(x, y, colour=f))+geom_point(alpha=data$a,size=data$s)+opt

Batman’s Choice

A hero can be anyone, even a man doing something as simple and reassuring as putting a coat on a young boy’s shoulders to let him know the world hadn’t ended (Batman in The Dark Knight Rises)

Joker has captured Batman and keeps him into a dark and cold dungeon of Gotham City. Showing his sadistic character, Joker proposes the following game to Batman:

This is a six shooter revolver with two bullets in the cylinder. Bullets are inside two consecutive chambers. I will spin the cylinder and will fire the gun aiming to my head. If I survive you will have to do the same but you decide if you want to spin the cylinder before firing or not. If you still keep you head over your shoulders after firing, you will be free.

Joker fires and nothing happens. He survives and passes the revolver to Batman. What should Batman do? Spinning or not? What would you do?

From my point of view, answer is quite anti-intutive because the best option is not spinning the cylinder again. Spinning case is clear: probability of losing the head is 2/6=33% but what about not spinning? Doing the next shoot directly eliminates two possibilities: the previous shot of Joker and the second bullet according to direction of cylinder rotation (remember two bullets are consecutive and Joker is unfortunately  still alive). It means there is only 1 chance to dead between 4, so probability of losing the head in this scenario is 1/4=25% which is significantly lower than the first one.

Here you can find the resulting graph of simulating the game up to 500 times:

batmans_choice2

Will it be the end of Batman? Not sure.

This is the code of this experiment:

library(ggplot2)
library(extrafont)
niter=500
results=data.frame()
for (i in 1:niter)
{
bullet1=sample(1:6,1)
Joker=sample((1:6)[-c(bullet1, bullet1%%6+1)],1)
#Option 1: Shooting
Batman1=Joker%%6+1
dead1=(Batman1 %in% c(bullet1, bullet1%%6+1))*1
#Option 2: Rolling and Shooting
Batman2=sample(1:6,1)
dead2=(Batman2 %in% c(bullet1, bullet1%%6+1))*1
results=rbind(results, c(i, dead1, dead2))
}
colnames(results)=c("iter", "dead1", "dead2")
results$csum1=cumsum(results$dead1)/as.numeric(rownames(results))
results$csum2=cumsum(results$dead2)/as.numeric(rownames(results))
theme_xkcd=theme(
panel.background = element_rect(fill="darkolivegreen1"),
panel.border = element_rect(colour="black", fill=NA),
axis.line = element_line(size = 0.5, colour = "black"),
axis.ticks = element_line(colour="black"),
panel.grid = element_line(colour="white", linetype = 2),
axis.text.y = element_text(colour="black"),
axis.text.x = element_text(colour="black"),
text = element_text(size=18, family="Humor Sans"),
plot.title = element_text(size = 50)
)
p=ggplot(data=results, aes(x=iter, y=csum1))+
geom_abline(intercept = 1/4, slope = 0, size = 0.4, linetype=2, colour = "black", alpha=0.8)+
geom_abline(intercept = 1/3, slope = 0, size = 0.4, linetype=2, colour = "black", alpha=0.8)+
geom_line(aes(y=csum2), colour="green4", size=1.5, fill=NA)+
geom_line(colour="green4", size=1.5, fill=NA)+
coord_cartesian(ylim=c(.0, 1), xlim=c(1, niter))+
scale_y_continuous(breaks = c(0,round(1/4, digits = 2),round(1/3, digits = 2),1))+
geom_text(data=results[niter*.75, ], family="Humor Sans", colour="green4", y=0.38, label="Rotating Cylinder and Shooting ...", size=4, adjust=1)+
geom_text(data=results[niter*.75, ], family="Humor Sans", colour="green4", y=0.20, label="Shooting without Rotating Cylinder ...", size=4, adjust=1)+
labs(x="Number Of Trials", y="Prob. of Losing The Head", title="Batman's Choice")+
theme_xkcd
ggsave("batmans_choice.jpg", plot=p, width=8, height=5)

How To Approximate Pi With A Short Pencil And A Big Paper

Experiment, be curious: though interfering friends may frown, get furious at each attempt to hold you down (Tony Bennett, Experiment)

Instructions:

  1. Take a pencil and measure it
  2. Take a piece of paper and draw parallel lines on it (you can use the pencil, of course); separation between lines should double the length of the pencil
  3. Toss the pencil over the paper 100 times (or more)
  4. Make note of how many times do the pencil cross some of the lines
  5. Calculate ratio between tosses and crosses: this is your approximation of Pi

Some time ago, I published a post about one of the most amazing places where PI was discovered. This is another example of the ubiquity of this mathematical constant. This experiment is based on Buffon’s needle problem, another amazing experiment of 18th century. Next plot represents ratio of tosses to crosses depending on the length of pencil. When the pencil is half the length of the separation between lines, the previous ratio is approximately PI:
Buffon
If you get very bored some afternoon you can replicate this experiment with your children. Use a short pencil. If not, you will need an extremely big piece of paper. Meanwhile here you have the code:

trials=100000
results=sapply(seq(.1, 2, by = .05), function(x)
{
  r=x #Length of pencil in relation to separation between lines
  Needles=t(sapply(1:trials, function(y) c(100*runif(1),2*pi*runif(1))))
  Needles=cbind(Needles,Needles[,1]+r*cos(Needles[,2]))
  Needles=data.frame(x1=Needles[,1], x2=Needles[,3], Cross=abs(trunc(Needles[,1])-trunc(Needles[,3])))
  c(r, trials/(trials-nrow(Needles[Needles$Cross==0,])))
})
results=data.frame(t(results))
colnames(results)=c(c("ratio", "inv.perc.crosses"))
require(ggplot2)
require(extrafont)
require(grid)
opts=theme(
  panel.background = element_rect(fill="darkolivegreen1"),
  panel.border = element_rect(colour="black", fill=NA),
  axis.line = element_line(size = 0.5, colour = "black"),
  axis.ticks = element_line(colour="black"),
  panel.grid.major = element_line(colour="white", linetype = 1),
  panel.grid.minor = element_blank(),
  axis.text.y = element_text(colour="black", size=20),
  axis.text.x = element_text(colour="black", size=20),
  text = element_text(size=25, family="xkcd"),
  legend.key = element_blank(),
  legend.position = c(.2,.75),
  legend.background = element_blank(),
  plot.title = element_text(size = 50))
c=ggplot(results, aes(ratio, inv.perc.crosses))
c +geom_abline(intercept = pi, slope = 0, size = 0.4, linetype=2, colour = "black", alpha=0.8)+
  geom_line(color="green4", size=1.5)+
  geom_point(color="gray92", size=8, pch=16)+
  geom_point(color="green4", size=6, pch=16)+
  geom_text(aes(0.55, 5), hjust=0, family="xkcd", label="PI is around here!", size=10)+
  ggtitle("Hot to approximate PI with \na short pencil and a big paper")+
  xlab("Length of pencil divided by separation between lines") +
  ylab("Number of tosses divided by number of crosses")+
  geom_segment(aes(x = .625, y = 4.6, xend = .52, yend = 3.45), size=1.5, arrow = arrow(length = unit(0.5, "cm")))+
  scale_y_continuous(breaks=c(pi, 4, 8, 12, 16), labels=c("3.141593","4","8","12","16"))+
  scale_x_continuous(breaks=seq(.1, 2, by = .1))+opts

The Gilbreath’s Conjecture

317 is a prime, not because we think so, or because our minds are shaped in one way rather than another, but because it is so, because mathematical reality is built that way (G.H. Hardy)

In 1958, the mathematician and magician Norman L. Gilbreath presented a disconcerting hypothesis conceived in the back of a napkin. Gilbreath wrote first prime numbers in a row. In the next rows, he wrote the difference  in absolute value of consecutive values of previous row. Beginning with first 20 primes in the first row, he obtained something like this:Gilbreath01

The conjecture is easy: except for the first one, all elements of the first column are 1. So far, no one has demonstrated this hypothesis. In fact, according to mathematician Richard Guy, it seems unlikely that we will see a demonstration of Gilbreath’s conjecture in the near future, though probably this conjecture is true.

In the previous chart, I coloured zeros in white, ones in violet and rest of numbers in gold. The conjecture says that except for the first element, the first column is entirely violet. Following you can see the coloured chart for first 20, 40, 60 and 80 prime numbers:

Gilbreath02

It is nice how zeros create triangular patterns similar to patterns created by cellular automata. This is the chart for first 200 primes:Gilbreath03

How much time will it take to demonstrate this simple conjecture? Who knows. Meanwhile, you can draw triangles with this code:

library(ggplot2)
create.gilbreath=function(n)
{
require(reshape)
require(schoolmath)
data(primlist)
gilbreath=t(matrix(primlist[2:(n+1)], n, n))
for (i in 2:n) {gilbreath[i,]=c(eval(parse(text=paste(c(paste("abs(diff(",collapse=""), "gilbreath[i-1,]", paste("))",collapse="")),collapse=""))),rep(NA,1))}
na.omit(melt(t(gilbreath)))
}
opt=theme(legend.position="none",
panel.background = element_blank(),
panel.grid = element_blank(),
axis.ticks=element_blank(),
axis.title=element_blank(),
axis.text =element_blank())
gilbreath=create.gilbreath(20)
gilbreath$value1=cut(gilbreath$value, breaks=c(-Inf,1:2,Inf), right = FALSE)
ggplot(gilbreath, aes(x=X1, y=X2)) +
geom_tile(aes(fill = value1), colour="grey") +
scale_fill_manual(values = c("white", "darkviolet", "gold"))+
geom_text(label=gilbreath$value, size=8)+
scale_y_reverse()+opt

The Three Little Pigs

Jesse, you asked me if I was in the meth business or the money business. Neither. I’m in the empire business (Walter White in Breaking Bad)

The game of pig has simple rules but complex strategies. It was described for the first time in 1945  by a magician called John Scarne. Playing the pig game is easy: each turn, a player repeatedly rolls a die until either a 1 is rolled or the player decides to hold:

  • If the player rolls a 1, they score nothing and it becomes the next player’s turn
  • If the player rolls any other number, it is added to their turn total and the player’s turn continues
  • If a player chooses to hold, their turn total is added to their score, and it becomes the next player’s turn

The first player who reach at least 100 points is the winner. For example: you obtain a 3 and then decide to roll again, obtaining a 1. Your score is zero in this turn. Next player gets the sequence 3-4-6 and decides to hold, obtaining a score of 13 points in this turn.

Despite its simplicity, the pig game has a very complex and against-intuition optimal strategy. It was calculated in 2004 by Todd W. Neller and Clifton Presser from Gettysburg College of Pennsilvania with the help of computers.

To illustrate the game, I simulated three players (pigs) playing the pig game with three different strategies:

  • The Coward pig, who only rolls the die a small number of times in every turn
  • The Risky pig, who rolls the die a more times than the coward one
  • The Ambitious pig, who tries to obtain in every turn more points than two others

I simulated several scenarios.

  • Some favorable scenarios for Coward pig:

pigs2Coward pigs4Coward

In first scenario, the Coward pig rolls the die between 1 and 5 times each round and wins if the Risky pig asumes an excessive level of risk (rolling each time between 10 and 15 times). Trying to obtain more than the Coward is a bad option for the Ambitious pig. Simulating this scenario 100 times gives victory to Coward a 51% of times (25% to Risky and 24% to Ambitious).

Second scenario puts closer Coward and Risky pigs (first one rolls the die between 4 and 7 times  each round and second one between 6 and 9 times). Coward wins 54% of times (34% Risky and only 12% Ambitious).

Being coward seems to be a good strategy when you play against a reckless or when you are just a bit more conservative than a Risky one.

  • Some favorable scenarios for Risky pig:

pigs5RiskyPigs1Risky

Rolling the die between 4 and 6 times each round seems to be a good option, even more when you are playing against a extremely conservative player who rolls no more than 3 times each time. Simulating 100 times these previous scenarios gives victory to Risky pig a 58% of times in first the case in which Coward rolls allways 1 and Risky 6 times each round (0% for Coward and only 42% form Ambitious) and 66% of times in the second one (only 5% to Coward and 29% to Ambitious).

Being Risky is a good strategy when you play against a chicken.

  • Some favorable scenarios for Ambitious pig:

pigs3Ambitious pigs6ambitious

The Ambitious pig wins when two others turn into extremely coward and risky pigs as can be seen in the first scenario in which Ambitious wins 65% of the times (31% for Coward and 4% for Risky). Ambitious pig also wins when two others get closer and hit the die a small number of times (2 rolls the Coward and 4 rolls the Risky). In this scenario the Ambitious wins 58% of times (5% for Coward and 37% for Risky). By the way, these two scenarios sound very unreal.

Being ambitious seems to be dangerous but works well when you play against a crazy and a chicken or against very conservative players.

From my point of view, this is a good example to experiment with simulations, game strategies and xkcd style graphics.

The code:

require(ggplot2)
require(extrafont)
#Number of hits for Coward
CowardLower=2
CowardUpper=2
#Number of hits for Risky
RiskyLower=4
RiskyUpper=4
game=data.frame(ROUND=0, part.p1=0, part.p2=0, part.p3=0, Coward=0, Risky=0, Ambitious=0)
while(max(game$Coward)<100 & max(game$Risky)<100 & max(game$Ambitious)<100)
{
#Coward Little Pig
p1=sample(1:6,sample(CowardLower:CowardUpper,1), replace=TRUE)
s1=min(min(p1-1),1)*sum(p1)
#Risky Little Pig
p2=sample(1:6,sample(RiskyLower:RiskyUpper,1), replace=TRUE)
s2=min(min(p2-1),1)*sum(p2)
#Ambitious Little Pig
s3=0
repeat {
p3=sample(1:6,1)
s3=(p3+s3)*min(min(p3-1),1)
if (p3==1|s3>max(s1,s2)) break
}
game[nrow(game)+1,]=c(max(game$ROUND)+1,s1,s2,s3,max(game$Coward)+s1,max(game$Risky)+s2,max(game$Ambitious)+s3)
}
opts=theme(
panel.background = element_rect(fill="darkolivegreen1"),
panel.border = element_rect(colour="black", fill=NA),
axis.line = element_line(size = 0.5, colour = "black"),
axis.ticks = element_line(colour="black"),
panel.grid.major = element_line(colour="white", linetype = 1),
panel.grid.minor = element_blank(),
axis.text.y = element_text(colour="black"),
axis.text.x = element_text(colour="black"),
text = element_text(size=25, family="xkcd"),
legend.key = element_blank(),
legend.position = c(.2,.75),
legend.background = element_blank(),
plot.title = element_text(size = 50)
)
ggplot(game, mapping=aes(x=game$ROUND, y=game$Coward)) +
geom_line(color="red", size=1.5) +
geom_line(aes(x=game$ROUND, y=game$Risky), color="blue", size=1.5) +
geom_line(aes(x=game$ROUND, y=game$Ambitious), color="green4", size=1.5) +
geom_point(aes(x=game$ROUND, y=game$Coward, colour="c1"), size=5.5) +
geom_point(aes(x=game$ROUND, y=game$Risky, colour="c2"), size=5.5) +
geom_point(aes(x=game$ROUND, y=game$Ambitious, colour="c3"), size=5.5) +
ggtitle("THE THREE LITTLE PIGS") +
xlab("ROUND") + ylab("SCORING") +
geom_text(aes(max(game$ROUND), max(max(game$Coward, game$Risky, game$Ambitious)), hjust=1.2, family="xkcd", label="WINNER!"), size=10)+
geom_hline(yintercept=100, linetype=2, size=1)+
scale_y_continuous(breaks=seq(0, max(max(game$Coward, game$Risky, game$Ambitious))+10, 10))+
scale_x_continuous(breaks=seq(0, max(game$ROUND), 1))+
scale_colour_manual("",
labels = c(paste("Coward: ", CowardLower, "-", CowardUpper, " hits", sep = ""), paste("Risky: ", RiskyLower, "-", RiskyUpper, " hits", sep = ""), "Ambitious"),
breaks = c("c1", "c2", "c3"),
values = c("red", "blue", "green4"))+ opts

Shakespeare Is More Monkey-Friendly Than Cervantes

Ford, there is an infinite number of monkeys outside who want to talk to us about this script for Hamlet they have worked out (from Episode 2 of The Hitchhiker’s Guide to the Galaxy by Douglas Adams)

Some days ago I was talking with a friend about the infinite monkey theorem which is a funny interpretation of what thinking-in-infinite can produce. The same day, in my weekly English class, my teacher said that Anglo-saxon words do tend to be short, very often monosyllabic such as function words such as to, of, from etc and everyday words such as go, see run, eat, etc.

Both things made me think that a monkey could have easier to type a Shakespeare text rather than a Cervantes one. I cannot imagine a definitive way to demonstrate this but this experiment support my hypothesis. After simulating random words of 2, 3, 4 and 5 characters I look for them in English(1) and Spanish(2) dictionaries, which I previously downloaded from here. Result: I find more random words in the English one. These are the results of my experiment:monkey_typewriter

For example, around 38% of two-chars words match with English dictionary and only 9% with Spanish one. This is why I think that, in the infinite, I would be easier for a monkey to replicate a Shakespeare text than a Cervantes one.

Here you have the code:

library(ggplot2)
library(scales)
esp.dic=data.frame(LANG="ESP", WORD=readLines("ES.dic"))
eng.dic=data.frame(LANG="ENG", WORD=readLines("UK.dic"))
df.lang=do.call("rbind", list(esp.dic, eng.dic))
df.lang$WORD=tolower(iconv(df.lang$WORD, to="ASCII//TRANSLIT"))
df.lang=unique(df.lang)
results=data.frame(LANG=character(0), OCCURRENCES=numeric(0), SIZE=numeric(0), LENGTH=numeric(0))
for (i in 2:5)
{
df.monkey=data.frame(WORD=replicate(20000, paste(sample(c(letters), i, replace = TRUE), collapse='')))
results=rbind(results, data.frame(setNames(aggregate(WORD ~ ., data = merge(df.lang, df.monkey, by="WORD"), FUN=length), c("LANG","OCCURRENCES")), SIZE=20000, LENGTH=i))
}
opt=theme(panel.background = element_rect(fill="gray92"),
panel.grid.minor = element_blank(),
panel.grid.major.x = element_blank(),
panel.grid.major.y = element_line(color="white", size=1.5),
plot.title = element_text(size = 35),
axis.title = element_text(size = 20, color="gray35"),
axis.text = element_text(size=16),
axis.ticks = element_blank(),
axis.line = element_line(colour = "white"))
ggplot(data=results, aes(x=LENGTH, y=OCCURRENCES/SIZE, colour=LANG))+
geom_line(size = 2)+
scale_colour_discrete(guide = FALSE) +
geom_point(aes(fill=LANG),size=10, colour="gray92",pch=21)+
scale_x_continuous("word length", labels=c("two chars", "three chars", "four chars", "five chars"))+
scale_y_continuous("probability of existence", limits=c(0, 0.4), labels = percent)+
labs(title = "What if you put a monkey in front of a typewriter?")+
opt + scale_fill_discrete(name="Dictionary", breaks=c("ESP", "ENG"), labels=c("Spanish", "English"))

(1) The English dictionary was originally compiled from public domain sources
for the amSpell spell-checker by Erik Frambach e-mail: e.h.m.frambach@eco.rug.nl
(2) The Spanish dictionary has been elaborated by Juan L. Varona, Dpto. de Matematicas y Computacion, Universidad de La Rioja, Calle Luis de Ulloa s/n, 26004 SPAIN e-mail: jvarona@siur.unirioja.es

What If You Dig A Hole Through The Earth?

It suddenly struck me that that tiny pea, pretty and blue, was the Earth. I put up my thumb and shut one eye, and my thumb blotted out the planet Earth. I didn’t feel like a giant. I felt very, very small (Neil Armstrong)

Where would you come out if you dig a hole straight downward from where you live through the center of the Earth? Supposing you survive to the extremely high pressure and temperature of the nucleus, would you find water or land at the other side? It maybe sound a ridiculous question (I will not refute that) but in this post I will estimate how many people would die drowned and how many will find land in the antipode of where they live. At least, knowledge does not take up any space.

I found a database of the United Nations with very useful information for my experiment: longitude, latitude and population of all capital cities of the world in 2011(1). I assumed that capital cities are a good sample of where people live. Maybe is not the best one since some very big countries are represented by only a city but is a good way to obtain a quick estimation. On the other hand, capital cities represent approximately 7% of the world population so in this sense is a very good sample.

Process is simple: loading the xls file, calculating the antipode of each point and checking where it is. Google provides information about country where a coordinate belongs. For coordinates on the sea, no information is returned. Once you have this, is easy to calculate proportion of people that will find water. My estimation is around 77% of people will find water on the other side. Taking into account that all people leave from land and approximately 70% of the Earth’s surface is water, this figure seems to be small but since both poles are symmetrical and are uninhabited, the estimation makes sense. Here you have a map with the result of the experimet. Points are capital cities and size is related with population. In blue, capitals with antipode on the sea and in brown, capitals with antipode in land:
WorldMapR
By the way, I am one of the 23% of lucky people that would find land in the other side. I live in Madrid, Spain:
madrid
An if some rainy afternoon having little to do I dig a hole through the Earth, I will appear in a place called Weber, in New Zealand:
weber
This estimation can be very silly but the physics involved in the experiment are very interesting as you can see here. By the way, there is a film called Total Recall (2012) where the only way to travel between last two cities in the world is using an elevator through the Earth. Here you have the code:

library(xlsx)
library(ggmap)
library(mapdata)
library(ggplot2)
#The xls file is in http://esa.un.org/unpd/wup/CD-ROM/WUP2011-F13-Capital_Cities.xls
CapitalCities <- read.xlsx("WUP2011-F13-Capital_Cities.xls", sheetName="Capital_Cities", startRow=13,header=TRUE)
names(CapitalCities) = gsub("\\.", "", names(CapitalCities))
#Obtain symmetric coordinates for each capital
CapitalCities$LatitudeSym <- -CapitalCities$Latitude
CapitalCities$LongitudeSym <- -sign(CapitalCities$Longitude)*(180-abs(CapitalCities$Longitude))
CapitalCities$DigResult <- apply(CapitalCities, 1, function(x) {unlist(revgeocode(c(as.numeric(x[11]),as.numeric(x[10]))))})
CapitalCities$Drowned <- is.na(CapitalCities$DigResult)*1
#Percentage of population saved
sum(CapitalCities$Drowned*CapitalCities$Populationthousands)/sum(CapitalCities$Populationthousands)
world <- map_data("world")
opt <- theme(legend.position="none",
axis.ticks=element_blank(),
axis.title=element_blank(),
axis.text =element_blank(),
plot.title = element_text(size = 35),
panel.background = element_rect(fill="turquoise1"))
p <- ggplot()
p <- p + geom_polygon(data=world, aes(x=long, y=lat, group = group),colour="white", fill="lightgoldenrod2" )
p <- p + geom_point(data=CapitalCities, aes(x=Longitude, y=Latitude, color=Drowned, size = Populationthousands)) + scale_size(range = c(2, 20), name="Population (thousands)")
p <- p + labs(title = "What if you dig a hole through the Earth?")
p <- p + scale_colour_gradient(low = "brown", high = "blue")
p <- p + annotate("rect", xmin = -135, xmax = -105, ymin = -70, ymax = -45, fill = "white")
p <- p + annotate("text", label = "Drowned", x = -120, y = -60, size = 6, colour = "blue")
p <- p + annotate("text", label = "Saved", x = -120, y = -50, size = 6, colour = "brown")
p <- p + geom_point(aes(x = -120, y = -65), size=8, colour="blue")
p <- p + geom_point(aes(x = -120, y = -55), size=8, colour = "brown")
p + opt
# Get a map of Spain, centered and signed in Madrid
madrid <- geocode('Madrid, Spain')
map.madrid <- get_map( location = as.numeric(madrid), color = "color", maptype = "roadmap", scale = 2, zoom = 6)
ggmap(map.madrid) + geom_point(aes(x = lon, y = lat), data = madrid, colour = 'red', size = 4)
# Get a map of New Zealand, centered and signed in Weber (the antipode of Madrid)
weber <- geocode('Weber, New Zealand')
map.weber <- get_map( location = as.numeric(weber), color = "color", maptype = "roadmap", scale = 2, zoom = 6)
ggmap(map.weber) + geom_point(aes(x = lon, y = lat), data = weber, colour = 'red', size = 4)

(1) United Nations, Department of Economic and Social Affairs, Population Division (2012). World Urbanization Prospects: The 2011 Revision, CD-ROM Edition.

The Collatz Fractal

It seems to me that the poet has only to perceive that which others do not perceive, to look deeper than others look. And the mathematician must do the same thing (Sofia Kovalevskaya)

How beautiful is this fractal! In previous posts I colored plots using module of complex numbers generated after some iterations. In this occasion I have used the escape-time algorithm, a very well known coloring algorithm which is very easy to implement in R.
Collatz07
Those who want to know more about this fractal can go here. For coloring, I chose a simple scale from red to yellow resulting a fractal interpretation of my country’s flag. You can choose another scale or use a RColorBrewer palette as I did in this previous post. Choosing another x or y ranges you can zoom particular areas of the fractal.

Try yourself and send me your pictures!

library(ggplot2)
xrange <- seq(-8, 8, by = 0.01)
yrange <- seq(-3, 3, by = 0.01)
f  <- function (z) {1/4*(2+7*z-(2+5*z)*cos(pi*z))}
z <- outer(xrange, 1i*yrange,'+')
t <- mat.or.vec(nrow(z), ncol(z))
for (k in 1:10)
{
  z <- f(z)
  t <- t + (is.finite(z)+0)
}
## Supressing texts, titles, ticks, background and legend.
opt <- theme(legend.position="none",
             panel.background = element_blank(),
             axis.ticks=element_blank(),
             axis.title=element_blank(),
             axis.text =element_blank())
z <- data.frame(expand.grid(x=xrange, y=yrange), z=as.vector(t))
ggplot(z, aes(x=x, y=y, color=z)) + geom_tile() + scale_colour_gradient(low="red", high="yellow") + opt

Blurry Fractals

Beauty is the first test; there is no permanent place in the world for ugly mathematics (G. H. Hardy)

Newton basin fractals are the result of iterating Newton’s method to find roots of a polynomial over the complex plane. It maybe sound a bit complicated but is actually quite simple to understand. Those who would like to read some more about Newton basin fractals can visit this page.

This fractals are very easy to generate in R and produce very nice images. Making a small number of iterations, resulting images seems to be blurred when are represented with tile geometry in ggplot. Combined with palettes provided by RColorBrewer give rise to very interesting images. Here you have some examples:

Result for f(z)=z3-1 and palette equal to Set3:Blurry1-Set3Result for f(z)=z4+z-1 and palette equal to Paired:Blurry2-PairedResult for f(z)=z5+z3+z-1 and palette equal to Dark2:Blurry3-Dark2Here you have the code. If you generate nice pictures I will be very grateful if you send them to me:

library(ggplot2)
library(numDeriv)
library(RColorBrewer)
library(gridExtra)
## Polynom: choose only one or try yourself
f  <- function (z) {z^3-1}        #Blurry 1
#f  <- function (z) {z^4+z-1}     #Blurry 2
#f  <- function (z) {z^5+z^3+z-1} #Blurry 3
z <- outer(seq(-2, 2, by = 0.01),1i*seq(-2, 2, by = 0.01),'+')
for (k in 1:5) z <- z-f(z)/matrix(grad(f, z), nrow=nrow(z))
## Supressing texts, titles, ticks, background and legend.
opt <- theme(legend.position="none",
             panel.background = element_blank(),
             axis.ticks=element_blank(), 
             axis.title=element_blank(), 
             axis.text =element_blank())
z <- data.frame(expand.grid(x=seq(ncol(z)), y=seq(nrow(z))), z=as.vector(exp(-Mod(f(z)))))
# Create plots. Choose a palette with display.brewer.all()
p1 <- ggplot(z, aes(x=x, y=y, color=z)) + geom_tile() + scale_colour_gradientn(colours=brewer.pal(8, "Paired")) + opt
p2 <- ggplot(z, aes(x=x, y=y, color=z)) + geom_tile() + scale_colour_gradientn(colours=brewer.pal(7, "Paired")) + opt
p3 <- ggplot(z, aes(x=x, y=y, color=z)) + geom_tile() + scale_colour_gradientn(colours=brewer.pal(6, "Paired")) + opt
p4 <- ggplot(z, aes(x=x, y=y, color=z)) + geom_tile() + scale_colour_gradientn(colours=brewer.pal(5, "Paired")) + opt
# Arrange four plots in a 2x2 grid
grid.arrange(p1, p2, p3, p4, ncol=2)

Random Love

Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin (John von Newman)

Ms. Positive and Mr. Negative live in a one-dimensional world and are falling in love. But beginnings are not always easy. They have a big problem: none of them like the other’s neighborhood. Ms. Positive only wants to walk around Positive Integer Numbers Neighborhood and Mr. Negative around Negative Integers Numbers one. This is a prickly problem they need to deal with as soon as possible. But they have a good idea. They will start their walks from Zero, an impartial place between both neighborhoods and will let fate to guide their feet. They will toss a coin to decide every step: if result is head, they will advance 1 step toward positive numbers neighborhood; if tail, they will advance 1 step toward negative numbers one. For example, if the first 5 tosses are face, face, tail, tail and tail, the their first 5 steps will be +1, +2, +1, 0 and -1. It seems to be a fair agreement for both. Maybe is not the most pleasant way to take a walk but It is well known that lovers use to do silly things constantly, especially at the beginnings. They always walk for two hours, so they toss the coin 7.200 times every walk (these lovers are absolutely crazy as you can see). This was their first walk:

plot1

After this first walk, Mr Negative was really upset. Ms. Positive, watching his face fell, ask him: What’s the matter, honey? and Mr. Negative replied: What’s the matter? What’s the matter? The matter is that we spent almost all the time walking around your horrible neighborhood! What comes next is too hard to be reproduced here. Anyway, they agreed to give a chance to the method they designed. How can one imagine that a coin can produce such a strange walk! There must be an error! After 90 walks, the situation of our lovers was extremely delicate. A 57% of the walks were absolutely awful for one of them since more than 80% of the steps were around the same neighborhood. Another 32% were a bit uncomfortable for one of them since between 60% and 80% of the steps were around the same neighborhood. Only 11% of the walks were gratifying. How is it possible?, said Mr. Negative. How is it possible?, said Ms. Positive.

hist2

But here comes Ms. Positive, who always looks on the brigth side of life: Don’t worry, darling. In fact, we don’t have to be sad. We get angry the same amount of times! For me is enough. What about you?, said her. For me is perfect as well!, said Mr. Negative. In that moment, they realise they were made for each other and started another random walk with a big smile on their faces.

water2

This is the code:

library(ggplot2)
steps   <- 2*60*60 #Number of steps
results <- data.frame()
walks<-90 #Number of walks
for (i in 1:walks)
{
  state <- cumsum(sample(c(-1,1), steps, replace = TRUE))
  results <- rbind(results, c(sum(state<0), sum(state>0), sum(state==0), 
                              if (sum(state<0) >= sum(state>0)) 1 else 0))
}
colnames(results) <- c("neg.steps", "pos.steps", "zero.steps", "ind.neg")
results$max.steps <- apply(results, 1, max)/apply(results, 1, sum)
#Plot of one of these walks
mfar=max(abs(max(state)),abs(min(state)))
plot1 <- qplot(seq_along(state), 
      state, 
      geom="path")+ 
  xlab("Step") + 
  ylab("Location") +
  labs(title = "The First Walk Of Ms. Positive And Mr. Negative")+
  theme(plot.title = element_text(size = 35))+
  theme(axis.title.y = element_text(size = 20))+
  theme(axis.title.x = element_text(size = 20))+
  scale_x_continuous(limits=c(0, length(state)),breaks=c(1,steps/4,steps/2,3*steps/4,steps))+
  scale_y_continuous(limits=c(-mfar, mfar), breaks=c(-mfar,-mfar/2, 0, mfar/2,mfar))+
  geom_hline(yintercept=0)
ggsave(plot1, file="plot1.png", width = 12, height = 10)
#Summary of all walks
hist1 <- ggplot(results, aes(x = max.steps))+
  geom_histogram(colour = "white",breaks=seq(.4,1,by=.2),fill=c("blue", "orange", "red"))+
  theme_bw()+
  labs(title = paste("What Happened After ", toString(walks), " Walks?",sep = ""))+
  scale_y_continuous(breaks=seq(0,(nrow(results[results$max.steps>.8,])+10),by=10))+
  theme(plot.title = element_text(size = 40))+  
  xlab("Maximum Steps In The Same Location (%)") + 
  ylab("Number of Walks")
ggsave(hist1, file="hist1.png", width = 10, height = 8)
#Data for waterfall chart
waterfall <- as.data.frame(cbind(
  c("Total Walks", "Satisfactory Walks", "Uncomfortable Walks", "Awful Walks for Mr. +", "Awful Walks for Ms. -"),
  c("a", "b", "c", "d", "d"),
  c(0, 
    nrow(results),
    nrow(results)-nrow(results[results$max.steps<.6,]),
    nrow(results)-nrow(results[results$max.steps<.6,])-nrow(results[results$max.steps>=.6 & results$max.steps<.8,]),
    nrow(results)-nrow(results[results$max.steps<.6,])-nrow(results[results$max.steps>=.6 & results$max.steps<.8,])-nrow(results[results$max.steps>=.8 & results$ind.neg==1,])
    ),
  c(nrow(results),
    nrow(results)-nrow(results[results$max.steps<.6,]),
    nrow(results)-nrow(results[results$max.steps<.6,])-nrow(results[results$max.steps>=.6 & results$max.steps<.8,]),
    nrow(results)-nrow(results[results$max.steps<.6,])-nrow(results[results$max.steps>=.6 & results$max.steps<.8,])-nrow(results[results$max.steps>=.8 & results$ind.neg==1,]),
    0
    ),
  c(nrow(results), 
    nrow(results[results$max.steps<.6,]), 
    nrow(results[results$max.steps>=.6 & results$max.steps<.8,]), 
    nrow(results[results$max.steps>=.8 & results$ind.neg==1,]), 
    nrow(results[results$max.steps>=.8 & results$ind.neg==0,]))
))
colnames(waterfall) <-c("desc", "type", "start", "end", "amount")
waterfall$id <- seq_along(waterfall$amount)
waterfall$desc <- factor(waterfall$desc, levels = waterfall$desc)
#Waterfall chart
water1 <- ggplot(waterfall, aes(desc, fill = type)) + 
  geom_rect(aes(x = desc, xmin = id-0.45, xmax = id+0.45, ymin = end, ymax = start))+ 
  xlab("Kind of Walk") + 
  ylab("Number of Walks") +
  labs(title = "The Ultimate Proof (After 90 Walks)")+
  theme(plot.title = element_text(size = 35))+
  theme(axis.title.y = element_text(size = 20))+
  theme(axis.title.x = element_text(size = 20))+
  theme(legend.position = "none")