# The Batman’s Ecosystem

If I weren’t crazy, I’d be insane! (Joker)

Today I present a dynamic D3.js plot in which I combine three things:

• The Batman curve
• A text mining analysis to obtain the most common words from Batman’s Wikipedia page
• A line plot using the morris.js library of the `rCharts` package, where the point labels are the words obtained in the previous step
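The word-counting idea behind the second item can be tried offline before touching Wikipedia. This is only a minimal sketch in base R: the sample text, the tiny stop-word list and the helper `top_words()` are all assumptions made up for illustration, and the real pipeline relies on the `tm` package instead.

```r
# Toy text standing in for the scraped page (invented for illustration)
text <- "The Dark Knight protects Gotham. Gotham needs the Dark Knight, and the city knows it."

# Hypothetical mini stop-word list; tm ships a much fuller one
stop_words <- c("the", "and", "it", "a", "of")

# Hypothetical helper: lower-case, strip punctuation and digits, count the remaining words
top_words <- function(text, stop_words, n = 3) {
  cleaned <- tolower(text)
  cleaned <- gsub("[[:punct:]]", " ", cleaned)
  cleaned <- gsub("[[:digit:]]", " ", cleaned)
  words <- strsplit(cleaned, "[[:space:]]+")[[1]]
  words <- words[nzchar(words) & !(words %in% stop_words)]
  head(sort(table(words), decreasing = TRUE), n)
}

freq <- top_words(text, stop_words)
print(freq)
```

In this toy text the three surviving words that appear twice are "dark", "gotham" and "knight", which is the kind of list that ends up labelling the points of the curve.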

This is my particular homage to one of the most amazing superheroes ever, together with Daredevil:

The code:

```r
require(ggplot2)
require(dplyr)
require(rCharts)
library(rvest)
library(tm)
# Upper (u) and lower (d) pieces of the Batman curve; each one returns 0 outside its own interval
f1u <- function(x) {ifelse((abs(x) >  3 & abs(x) <= 7), 3*sqrt(1-(x/7)^2), 0)}
f1d <- function(x) {ifelse((abs(x) >= 4 & abs(x) <= 7), -3*sqrt(1-(x/7)^2), 0)}
f2u <- function(x) {ifelse((abs(x) > 0.50 & abs(x) < 0.75), 3*abs(x)+0.75, 0)}
f2d <- function(x) {ifelse((abs(x) < 4), abs(x/2)-(3*sqrt(33)-7)*x^2/112-3 + sqrt(1-(abs(abs(x)-2)-1)^2), 0)}
f3u <- function(x) {ifelse((x > -0.5 & x < 0.5), 2.25, 0)}
f4u <- function(x) {ifelse((abs(x) >  1 & abs(x) <= 3), 6 * sqrt(10)/7 + (1.5 - 0.5 * abs(x)) * sqrt(abs(abs(x)-1)/(abs(x)-1)) - 6 * sqrt(10) * sqrt(4-(abs(x)-1)^2)/14, 0)}
f5u <- function(x) {ifelse((abs(x) >= 0.75 & abs(x) <= 1), 9-8*abs(x), 0)}
# Whole upper and lower outlines as sums of their pieces
fu <- function(x) f1u(x)+f2u(x)+f3u(x)+f4u(x)+f5u(x)
fd <- function(x) f1d(x)+f2d(x)
# Even rows take the upper outline and odd rows the lower one, so a single line traces the shape
batman <- function(r, x) {ifelse(r%%2==0, fu(x), fd(x))}
# ifelse() evaluates both branches, so sqrt() of negative values may emit harmless NaN warnings
data.frame(x=seq(from=-7, to=7, by=0.125)) %>%
  mutate(y=batman(row_number(), x)) -> df
# Scrape the Batman page and keep the most frequent words, one per point of the curve
# read_html() replaces the deprecated html()
read_html("https://en.wikipedia.org/wiki/Batman") %>%
  html_nodes("#bodyContent") %>%
  html_text() %>%
  VectorSource() %>%
  Corpus() %>%
  tm_map(content_transformer(tolower)) %>%
  tm_map(removePunctuation) %>%
  tm_map(removeNumbers) %>%
  tm_map(stripWhitespace) %>%
  tm_map(removeWords, c(stopwords(kind = "en"), "batman", "batmans")) %>%
  DocumentTermMatrix() %>%
  as.matrix() %>%
  colSums() %>%
  sort(decreasing=TRUE) %>%
  head(n=nrow(df)) %>%
  names() -> df$word
# Morris.js line chart; hovering over a point shows its word
m1 <- mPlot(x = "x", y = "y", data = df, type = "Line")
m1$set(pointSize = 5,
       lineColors = c('black', 'black'),
       width = 900,
       height = 500,
       hoverCallback = "#! function(index, options, content)
{ var row = options.data[index]
return '<b>' + row.word + '</b>'} !#",
       lineWidth = 2,
       grid=FALSE,
       axes=FALSE)
m1
m1$save('Batman.html', standalone = TRUE)
```
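The trick in `batman(r, x)` is that even rows take the upper outline and odd rows the lower one, so a single zig-zag line draws a closed shape. Here is a minimal sketch of the same idea on a simpler figure, assuming a unit circle in place of the Batman curve; the names `upper`, `lower` and `outline` are hypothetical and not part of the post's code.

```r
# Upper and lower halves of a unit circle, standing in for fu() and fd()
upper <- function(x)  sqrt(1 - x^2)
lower <- function(x) -sqrt(1 - x^2)

# Same idea as batman(r, x): even rows use the upper branch, odd rows the lower one
outline <- function(r, x) ifelse(r %% 2 == 0, upper(x), lower(x))

x <- seq(from = -1, to = 1, by = 0.125)
df <- data.frame(x = x, y = outline(seq_along(x), x))
print(head(df))
```

Plotting `df` as a connected line would jump between both halves at every step, which is why the finished chart looks like a filled, hatched silhouette and not a plain contour.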

# Visualising The Evolution Of Migration Flows With rCharts

Heaven we hope is just up the road (Atlas, Coldplay)

Continuing with the analysis of migration flows, I have made the following two visualizations. These charts are called bump charts and are well suited to representing rankings. This is what I have done:

• I obtain the top 20 countries in the world by % of migrants relative to their population
• To do this, I divide the total number of migrants between 1960 and 2009 by the mean population over the same period
• I do the same to obtain the top 20 countries in the world by % of immigrants
• In both cases, I only consider countries with a population greater than 2 million
• For these countries, I calculate the % of migrants in each decade (60’s, 70’s, 80’s, 90’s and 00’s), dividing the total number of migrants by the mean population of each decade
• I do the same for immigrants
• Instead of directly representing the % of migrants and immigrants, I represent the ranking of countries according to these indicators by decade
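The steps above boil down to a ratio followed by a ranking inside each decade. A minimal sketch in base R, assuming invented figures for three hypothetical countries (none of these numbers come from the Enigma data sets):

```r
# Invented figures for three hypothetical countries and two decades
flows <- data.frame(
  country    = c("A", "B", "C", "A", "B", "C"),
  decade     = c(1960, 1960, 1960, 1970, 1970, 1970),
  migrants   = c(50, 30, 10, 20, 60, 40),
  population = c(1000, 300, 500, 1000, 400, 500)
)

# Share of migrants relative to the mean population of the decade
flows$p_migrants <- flows$migrants / flows$population

# Rank within each decade; position 1 is the highest share
flows$position <- ave(-flows$p_migrants, flows$decade, FUN = rank)

print(flows[order(flows$decade, flows$position), ])
```

Country B leads in both decades here, while A and C swap places, which is exactly the kind of movement a bump chart is meant to show.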

This is the bump chart of migrants:

And this is the one of immigrants:

Some comments:

• There is a permanent exodus from Puerto Rico: in every decade except the 70’s it ranks first among countries with the most migrants relative to their population
• Ireland is also living through a diaspora, although it lost some positions in the 00’s
• Albania, Georgia and Bosnia and Herzegovina are gaining positions. Is Eastern Europe gradually becoming uncomfortable?
• Jamaica is also moving up in this sad competition
• On the other hand, Hong Kong and Israel are persistent leaders as receivers
• Saudi Arabia has shown impressive growth in receiving immigrants since the 70’s
• The United States does not appear in the immigrants ranking
• Singapore is gaining positions: in the 00’s it ranks third among receiving countries
• Also in the 00’s, Switzerland is the first European country in the ranking, holding fifth position

I like using rCharts as well as using Enigma data sets, as I have done previously. This is the code:

```r
library(data.table)
library(rCharts)
library(dplyr)
setwd("YOUR WORKING DIRECTORY HERE")
populflows = read.csv(file="enigma-org.worldbank.migration-remittances.migrants.migration-flow-c57405e33412118c8757b1052e8a1490.csv", stringsAsFactors=FALSE)
population = fread("enigma-org.worldbank.hnp.data-eaa31d1a34fadb52da9d809cc3bec954.csv")
# Mean population per country and decade
population %>%
  filter(indicator_name=="Population, total") %>%
  as.data.frame %>%
  mutate(decade=(year-year%%10)) %>%
  group_by(country_name, country_code, decade) %>%
  summarise(population=mean(value)) %>%
  plyr::rename(., c("country_name"="country")) -> population2
# Immigrants: people arriving, grouped by destination country and decade
populflows %>% filter(!is.na(total_migrants)) %>%
  group_by(migration_year, destination_country) %>%
  summarise(inmigrants = sum(total_migrants)) %>%
  plyr::rename(., c("destination_country"="country", "migration_year"="decade")) -> inmigrants
# Migrants: people leaving, grouped by country of origin and decade
populflows %>% filter(!is.na(total_migrants)) %>%
  group_by(migration_year, country_of_origin) %>%
  summarise(migrants = sum(total_migrants)) %>%
  plyr::rename(., c("country_of_origin"="country", "migration_year"="decade")) -> migrants
# Join of data sets
migrants %>%
merge(inmigrants, by = c("country", "decade")) %>%
merge(population2, by = c("country", "decade")) %>%
mutate(p_migrants=migrants/population, p_inmigrants=inmigrants/population) -> populflows2
# Global Indicators
populflows2 %>%
group_by(country) %>%
summarise(migrants=sum(migrants), inmigrants=sum(inmigrants), population=mean(population)) %>%
mutate(p_migrants=migrants/population, p_inmigrants=inmigrants/population)  %>%
filter(population > 2000000)  %>%
mutate(rank_migrants = dense_rank(desc(p_migrants)), rank_inmigrants = dense_rank(desc(p_inmigrants))) -> global
# Migrants dataset
global %>%
filter(rank_migrants<=20) %>%
select(country) %>%
merge(populflows2, by = "country") %>%
arrange(decade, p_migrants) %>%
mutate(decade2=as.numeric(as.POSIXct(paste0(as.character(decade), "-01-01"), origin="1900-01-01"))) %>%
plyr::ddply("decade", transform, rank = dense_rank(p_migrants)) -> migrants_rank
# Immigrants dataset
global %>%
filter(rank_inmigrants<=20) %>%
select(country) %>%
merge(populflows2, by = "country") %>%
arrange(decade, p_inmigrants) %>%
mutate(decade2=as.numeric(as.POSIXct(paste0(as.character(decade), "-01-01"), origin="1900-01-01"))) %>%
plyr::ddply("decade", transform, rank = dense_rank(p_inmigrants)) -> inmigrants_rank
# Function for plotting
plotBumpChart <- function(df){
  bump_chart = Rickshaw$new()
  mycolors = ggthemes::tableau_color_pal("tableau20")(20)
  bump_chart$layer(rank ~ decade2, group = 'country_code', data = df, type = 'line', interpolation = 'none', colors = mycolors)
  bump_chart$set(slider = TRUE, highlight = TRUE, legend=TRUE)
  # Ranks are ascending, so 21-y labels the top line as position 1
  bump_chart$yAxis(tickFormat = "#!  function(y) { if (y == 0) { return '' } else { return String((21-y)) } } !#")
  bump_chart$hoverDetail(yFormatter = "#! function(y){return (21-y)} !#")
  return(bump_chart)
}
plotBumpChart(migrants_rank)
plotBumpChart(inmigrants_rank)
```
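One detail of the listing above deserves a note: `dense_rank()` ranks in ascending order, so the country with the highest share gets the highest value and is drawn at the top, and the JavaScript formatters turn that value into a position with `21 - y` (20 countries plus one). A minimal sketch of that inversion, assuming five hypothetical countries with invented shares:

```r
# Invented shares for five hypothetical countries in one decade
share <- c(A = 0.02, B = 0.15, C = 0.08, D = 0.01, E = 0.30)
n <- length(share)

# Ascending rank: the largest share gets the largest value and plots at the top
plot_rank <- rank(share)

# What the axis label shows: (n + 1) - rank, so the top line is labelled 1
shown_position <- (n + 1) - plot_rank
print(shown_position)
```

With five countries the constant is 6 where the post uses 21; the country with the largest share (E) is labelled 1 and the one with the smallest (D) is labelled 5.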

# A Visualization Of The 100 Greatest Love Songs ft. D3.js

What would you do? If my heart was torn in two (More Than Words, Extreme)

Playing with the `rCharts` package, I had the idea of representing the list of the 100 best love songs as a connected set of points that forms a heart. Each song can be seen by hovering the mouse cursor over its dot:

You can reproduce it with this simple code:

```r
library(dplyr)
library(rCharts)
library(rvest)
setwd("YOUR WORKING DIRECTORY HERE")
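# Even rows follow the upper arcs of the heart and odd rows the lower curve
heart <- function(r,x) {ifelse(abs(x)<2, ifelse(r%%2==0, sqrt(1-(abs(x)-1)^2), acos(1-abs(x))-pi), 0)}
data.frame(x=seq(from=-3, to=3, length.out=100)) %>%
  mutate(y=jitter(heart(row_number(), x), amount=.1)) -> df
# read_html() replaces the deprecated html()
love_songs <- read_html("http://www.cs.ubc.ca/~davet/music/list/Best13.html")
love_songs %>%
  html_nodes("table") %>%
  # NOTE: the table index inside .[] was lost in this listing; select the table
  # that holds the song list before calling html_table()
  .[] %>%
  html_table(header=TRUE, fill = TRUE) %>%
  cbind(df) -> df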
# Morris.js line chart; hovering over a point shows the artist and the title
m1 <- mPlot(x = "x", y = "y", data = df, type = "Line")
m1$set(pointSize = 5,
       lineColors = c('red', 'red'),
       width = 850,
       height = 600,
       lineWidth = 2,
       hoverCallback = "#! function(index, options, content){
var row = options.data[index]
return '<b>' + row.ARTIST + '</b>' + '<br/>' + row.TITLE} !#",
       grid=FALSE,
       axes=FALSE)
m1$save('Top_100_Greatest_Love_Songs.html', standalone = TRUE)
```
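To see the heart without scraping anything, the curve can be evaluated on its own. This is a minimal sketch that reuses the `heart()` definition from the listing above and drops the jitter; `suppressWarnings()` is my addition, because `ifelse()` evaluates both branches and `sqrt()` and `acos()` complain outside their domains even though those values are discarded.

```r
# Same piecewise definition as in the post, without the jitter
heart <- function(r, x) {
  ifelse(abs(x) < 2,
         ifelse(r %% 2 == 0, sqrt(1 - (abs(x) - 1)^2), acos(1 - abs(x)) - pi),
         0)
}

x <- seq(from = -3, to = 3, length.out = 100)
y <- suppressWarnings(heart(seq_along(x), x))

# Even rows sit on the two upper arcs, odd rows on the lower curve
print(range(y))
```

The upper branch is a pair of semicircles centred at -1 and 1, and the lower branch falls towards -pi near x = 0, which gives the heart its pointed bottom.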