# How Much Money Should Machines Earn?

Every inch of sky’s got a star
Every inch of skin’s got a scar
(Everything Now, Arcade Fire)

I think a very good way to start with R is to build an interactive visualization of some open data, because you will train many important skills of a data scientist: loading, cleaning, transforming and combining data, and producing a suitable visualization. Making it interactive will also give you an idea of the power of R, because you will realise that you can indirectly drive other programming languages such as JavaScript.

That’s precisely what I’ve done today. I combined two interesting datasets:

• The probability of computerisation of 702 detailed occupations, obtained by Carl Benedikt Frey and Michael A. Osborne from the University of Oxford, using a Gaussian process classifier and published in this paper in 2013.
• Statistics of jobs (employment, median annual wages and typical education needed for entry) from the US Bureau of Labor Statistics, available here.

Apart from using `dplyr` to manipulate data and `highcharter` to do the visualization, I used the `tabulizer` package to extract the table of probabilities of computerisation from the PDF: it makes this task extremely easy.
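As a rough illustration of the combining step, here is a minimal sketch of how two tables like these could be joined with `dplyr`. It assumes both sources share an occupation code column; the column names (`soc_code`, `probability`, `employment`, `median_wage`) and all values are made up for the example and are not taken from the real datasets.

```r
library(dplyr)

# Toy stand-in for the table extracted from the paper (invented values)
probabilities <- data.frame(
  soc_code    = c("00-0001 ", "00-0002", "00-0003"),
  probability = c(0.92, 0.10, 0.55),
  stringsAsFactors = FALSE
)

# Toy stand-in for the job statistics (invented values)
job_stats <- data.frame(
  soc_code    = c("00-0001", "00-0002", "00-0004"),
  employment  = c(1200, 800, 500),
  median_wage = c(28000, 61000, 45000),
  stringsAsFactors = FALSE
)

# Clean the key first: extracted text may carry stray whitespace
probabilities <- probabilities %>%
  mutate(soc_code = trimws(soc_code))

# Keep occupations present in both tables
combined <- inner_join(probabilities, job_stats, by = "soc_code")

# List occupations that have a probability but no job statistics
unmatched <- anti_join(probabilities, job_stats, by = "soc_code")
```

Looking at `unmatched` after the join is a cheap way to see which occupations were lost because their codes did not line up.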

This is the resulting plot:

If you want to examine it in depth, here you have a full size version.

These are some of my insights (the corresponding figures are obtained directly from the dataset):

• There is a moderate negative correlation between wages and probability of computerisation.
• Around 45% of US jobs are threatened by machines (they have a computerisation probability higher than 80%): half of them do not require formal education for entry.
• In fact, 78% of jobs which do not require formal education for entry are threatened by machines, while 0% of those which require a master’s degree are.
• Teachers are absolutely irreplaceable (0% are threatened by machines) but they earn 2.2% less than the average wage (unfortunately, I’m afraid this phenomenon occurs in many other countries as well).
• Don’t study to become a librarian or archivist: it seems a bad way to invest your time.
• Mathematicians will survive the machines.
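Figures of this kind can be derived with a few `dplyr` verbs. The sketch below runs on a tiny invented table, so its numbers have nothing to do with the real ones; the column names are hypothetical, and it assumes that shares are weighted by employment and that the correlation is a plain unweighted Pearson coefficient, which may differ from what the original script does.

```r
library(dplyr)

# Invented toy data, for illustration only
jobs <- data.frame(
  occupation  = c("A", "B", "C", "D"),
  probability = c(0.95, 0.90, 0.30, 0.05),
  employment  = c(1000, 3000, 2000, 4000),
  median_wage = c(25000, 30000, 60000, 80000),
  education   = c("None", "None", "Bachelor", "Master"),
  stringsAsFactors = FALSE
)

# An occupation counts as threatened above an 80% probability
jobs <- jobs %>% mutate(threatened = probability > 0.8)

# Overall share of employment under threat, and the wage/probability correlation
overall <- jobs %>%
  summarise(
    share_threatened = sum(employment[threatened]) / sum(employment),
    wage_prob_cor    = cor(median_wage, probability)
  )

# Same share, broken down by the education needed for entry
by_education <- jobs %>%
  group_by(education) %>%
  summarise(share_threatened = sum(employment[threatened]) / sum(employment))
```

With the toy values above, occupations A and B are the threatened ones, so `overall$share_threatened` is 4000 / 10000 = 0.4.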

The code of this experiment is available here.

# Visualizing the Daily Variability of Bitcoin with Quandl and Highcharts

Lay your dreams, little darling, in a flower bed; let that sunshine in your hair (Where the skies are blue, The Lumineers)

I discovered this nice visualization some days ago. The author is also the creator of Highcharter, an incredible R wrapper for the Highcharts JavaScript library and its modules. I am a big fan of his.

Inspired by his radial plot, I did a visualization of the daily evolution of the Bitcoin exchange rate (BTC vs. EUR) on Localbtc. Data is sourced from here and I used Quandl to obtain the data frame. Quandl is a marketplace for financial and economic data delivered in modern formats for today’s analysts. There is a package called `Quandl` that interacts directly with the Quandl API to download data in a number of formats usable in R. You only need to locate the data you want on the Quandl site. In my case the data are here.

After loading the data, I follow these steps:

• Filter the data to keep the last 12 complete months
• Create a new variable with the difference between the closing and opening price of Bitcoin (in Euros)
• Create a color variable to distinguish between positive and negative differences
• Create the graph using the Fivethirtyeight theme for Highcharts
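The first step is the only one with some date arithmetic in it. As a sketch in base R (so it runs without extra packages), the window of the last 12 complete months could be computed as follows; the function name `complete_year_window` is hypothetical and not part of any package.

```r
# Return the first and last day of the last 12 complete months
# ending on or before `last_date`.
complete_year_window <- function(last_date) {
  last_date <- as.Date(last_date)
  # First day of the month that contains the day after last_date.
  # If last_date is a month end, this is the first day of the next month.
  first_next <- as.Date(format(last_date + 1, "%Y-%m-01"))
  # The window ends the day before and starts exactly one year earlier
  date_to   <- first_next - 1
  date_from <- seq(first_next, by = "-1 year", length.out = 2)[2]
  list(from = date_from, to = date_to)
}

# Mid-month input: the current, incomplete month is dropped
w_mid <- complete_year_window("2017-08-15")

# January input: the window is the whole previous calendar year
w_jan <- complete_year_window("2018-01-10")
```

Here `w_mid` runs from 2016-08-01 to 2017-07-31 and `w_jan` from 2017-01-01 to 2017-12-31. Stepping back one year from the first day of a month avoids having to build a month number by hand, which is easy to get wrong around December.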

This is the result:

Apart from its visual appeal, I think it is a good way to get a quick overview of the evolution of a stock price. This is the code to do the experiment:

```r
library(Quandl)
library(dplyr)
library(highcharter)
library(lubridate)
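# Download the daily BTC/EUR series from Quandl
bitcoin <- Quandl("BCHARTS/LOCALBTCEUR")
# Sort by date and add the millisecond timestamp used by Highcharts
bitcoin %>%
  arrange(Date) %>%
  mutate(tmstmp = datetime_to_timestamp(Date)) -> bitcoin
last_date <- max(bitcoin$Date)
# date_to: last day of the most recent complete month
if (day(last_date + 1) == 1) {
  date_to <- last_date
} else {
  date_to <- ymd(paste(year(last_date), month(last_date), 1, sep = "-")) - 1
}
# date_from: one year before the day after date_to
# (building the month as month(date_to) + 1 would give 13 in December)
date_from <- date_to + 1 - years(1)
bitcoin %>% filter(Date >= date_from, Date <= date_to) -> bitcoin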
# Daily variation (close minus open) and a color for its sign
var_bitcoin <- bitcoin %>%
  mutate(Variation = Close - Open,
         color = ifelse(Variation >= 0, "green", "red"),
         y = Variation) %>%
  select(x = tmstmp,
         y,
         variation = Variation,
         name = Date,
         color,
         open = Open,
         close = Close) %>%
  list.parse3()  # named list_parse() in newer highcharter releases
# Tooltip table showing open, close and variation for each day
x <- c("Open", "Close", "Variation")
y <- sprintf("{point.%s}", tolower(x))
tltip <- tooltip_table(x, y)
# Polar column chart with one column per day
hc <- highchart() %>%
  hc_title(text = "Bitcoin Exchange Rate (BTC vs. EUR)") %>%
  hc_subtitle(text = "Daily Variation on Localbtc. Last 12 months") %>%
  hc_chart(
    type = "column",
    polar = TRUE) %>%
  hc_plotOptions(
    series = list(
      stacking = "normal",
      showInLegend = FALSE)) %>%
  hc_xAxis(
    gridLineWidth = 0.5,
    type = "datetime",
    tickInterval = 30 * 24 * 3600 * 1000,
    labels = list(format = "{value: %b}")) %>%
  hc_yAxis(showFirstLabel = FALSE) %>%
  hc_add_series(data = var_bitcoin) %>%