Who is Alan Turing?

This government is committed to introducing posthumous pardons for people with certain historical sexual offence convictions who would be innocent of any crime now (British Government Spokesperson, September 2016)

Last September, the British government announced its intention to pursue what has become known as the Alan Turing law, offering exoneration to the tens of thousands of gay men convicted of historic charges.  The law was finally unveiled on 20 October 2016.

This plot shows the daily views of the Alan Turing’s wikipedia page during the last 365 days:

There are three huge peaks in May 27th, July 30th, and October 29th that can be easily detected using AnomalyDetection package:


After substituting these anomalies by a simple linear imputation, it is clear that the time series has suffered a significant impact since the last days of September:

To estimate the amount of incremental views since September 28th (this is the date I have chosen as starting point) I use CausalImpact package:


Last plot shows the accumulated effect. After 141 days, there have been around 1 million of incremental views to the Alan Turing’s wikipedia page (more than 7.000 per day) and it does not seem ephemeral.

Alan Turing has won another battle, this time posthumous. And thanks to it, there is a lot of people that have discovered his amazing legacy: long life to Alan Turing.

This is the code I wrote to do the experiment:

library(httr)
library(jsonlite)
library(stringr)
library(xts)
library(highcharter)
library(AnomalyDetection)
library(imputeTS)
library(CausalImpact)
library(dplyr)

# Views last 365 days
(Sys.Date()-365) %>% str_replace_all("[[:punct:]]", "") %>% substr(1,8) -> date_ini
Sys.time()       %>% str_replace_all("[[:punct:]]", "") %>% substr(1,8) -> date_fin
url="https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/all-agents/Alan%20Turing/daily"
paste(url, date_ini, date_fin, sep="/") %>% 
  GET  %>% 
  content("text") %>% 
  fromJSON %>% 
  .[[1]] -> wikistats

# To prepare dataset for highcharter
wikistats %>% 
  mutate(day=str_sub(timestamp, start = 1, end = 8)) %>% 
  mutate(day=as.POSIXct(day, format="%Y%m%d", tz="UTC")) -> wikistats

# Highcharts viz
rownames(wikistats)=wikistats$day
wikistats %>% select(views) %>% as.xts  %>% hchart

# Anomaly detection
wikistats %>% select(day, views) -> tsdf
tsdf %>%  
  AnomalyDetectionTs(max_anoms=0.01, direction='both', plot=TRUE)->res
res$plot

# Imputation of anomalies
tsdf[tsdf$day %in% as.POSIXct(res$anoms$timestamp, format="%Y-%m-%d", tz="UTC"),"views"]<-NA 
ts(tsdf$views, frequency = 365) %>% 
  na.interpolation() %>% 
  xts(order.by=wikistats$day) -> tscleaned
tscleaned %>% hchart

# Causal Impact from September 28th
x=sum(index(tscleaned)<"2016-09-28 UTC")
impact <- CausalImpact(data = tscleaned %>% as.numeric, 
                       pre.period = c(1,x),
                       post.period = c(x+1,length(tscleaned)), 
                       model.args = list(niter = 5000, nseasons = 7),
                       alpha = 0.05)
plot(impact)

5 thoughts on “Who is Alan Turing?

  1. Interesante! que sencillo se hace el experimento utilizando dichos paquetes.

    No se cuanta gente utiliza AnomalyDetection y CausalImpact, pero sería entretenido hacer `hchart` para estos _objetos_ 😀

    Saludos Antonio!

    1. ¡Hola Joshua! Sí, son muy fáciles de usar. Los dos paquetes creo que los han hecho gente de Twitter. Yo creo que van a tener éxito. Si te animas a hacer algo con highcharter yo seguro que lo uso. ¡Un abrazo!

Leave a Reply

Your email address will not be published. Required fields are marked *