# Pencil Scribbles

With the bombs that the braggarts throw, the women of Cádiz make ringlets for their hair (Palma y corona, Carmen Linares)

This time I draw Franky again, using an algorithm that solves the Travelling Salesman Problem, as I did in my last post. On this occasion, instead of doing just one single-line drawing, I overlap many of them (250, concretely), each of them sampling 400 points from the original image (in my previous post I sampled 8,000 points). The last difference is that I don't convert the image to pure black and white with the `threshold` function: now I use the grayscale value of each pixel to weight the sampling.
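The grayscale-weighted sampling step can be sketched like this in base R; a random matrix stands in for the real image (an assumption: the post reads an actual photograph), and each of the 250 overlapped drawings would repeat this step with a fresh sample of 400 points:

```r
# Darker pixels (gray values near 0) are more likely to be picked
set.seed(123)
gray <- matrix(runif(100 * 100), nrow = 100)  # stand-in for the grayscale image
idx  <- sample(length(gray), size = 400, prob = 1 - gray)
pts  <- data.frame(col = (idx - 1) %/% nrow(gray) + 1,  # column of each sampled pixel
                   row = (idx - 1) %%  nrow(gray) + 1)  # row of each sampled pixel
nrow(pts)  # 400 sampled points, ready to feed the TSP solver
```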

Once again I use the `ggplot2` package, and its magical `geom_path`, to generate the image. The pencil effect is obtained by giving a very high transparency to the lines. This is the result:

I love it when someone else experiments with my experiments, as Mara Averick did:

You can do the same with this one, since you will find the code here. Please let me know about your own creations if you do. You can find me on Twitter or by email.

P.S.: Although it may seem otherwise, I'm not obsessed with Frankenstein 🙂

# The Travelling Salesman Portrait

I have noticed even people who claim everything is predestined, and that we can do nothing to change it, look before they cross the road (Stephen Hawking)

Imagine a salesman and a set of cities. The salesman has to visit each one of the cities, starting from a certain one and returning to the same city. The challenge is finding the route that minimizes the total length of the trip. This is the Travelling Salesman Problem (TSP): one of the most profoundly studied questions in computational mathematics. Since you can find a huge amount of articles about the TSP on the Internet, I will not give more details about it here.

In this experiment I apply a heuristic algorithm for the TSP to draw a portrait. The idea is pretty simple:

• Load an image
• Convert it to black and white
• Choose a sample of black points
• Solve the TSP to calculate a route among the points
• Plot the route

The result is a single-line drawing of the image that you loaded. To solve the TSP I used the arbitrary insertion heuristic (Rosenkrantz et al. 1977), which is quite efficient.
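To show the idea behind the heuristic (the post relies on the implementation available through R's `TSP` package, not on this code), here is a minimal base-R sketch of arbitrary insertion: take the cities in random order and insert each one at the position of the tour where it increases the total length the least.

```r
# Arbitrary insertion heuristic for the TSP: a standalone sketch
arbitrary_insertion <- function(pts) {
  n    <- nrow(pts)
  d    <- as.matrix(dist(pts))      # pairwise distances between cities
  tour <- sample(n, 2)              # start from an arbitrary pair of cities
  for (city in setdiff(sample(n), tour)) {
    nxt <- c(tour[-1], tour[1])     # successor of each city in the closed tour
    # length increase when inserting `city` after each position of the tour
    inc <- d[tour, city] + d[city, nxt] - d[cbind(tour, nxt)]
    tour <- append(tour, city, after = which.min(inc))
  }
  tour
}

set.seed(1)
pts  <- matrix(runif(40), ncol = 2)  # 20 random "black points"
tour <- arbitrary_insertion(pts)     # a permutation of 1..20: the route to plot
```

The returned ordering is what `geom_path` would then draw as a single line.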

To illustrate the idea, I have once again used this image of Frankenstein (I used it before in this other experiment). This is the result:

You can find the code here.

# Mandalas Colored

Hold my hand tight, for a star is escaping me through my fingers (Coda Flamenca, Extremoduro)

I have the privilege of being a teacher at ESTALMAT, a project run by the Spanish Royal Academy of Sciences that tries to detect, guide and stimulate, in a continuous way over two courses, the exceptional mathematical talent of 12-13 year old students. Some weeks ago I gave a class there about the importance of programming. I tried to convince them that learning R or Python is a good investment that always pays off: it will make them enjoy mathematics more, as well as see things with their own eyes. The main part of my class was a workshop about Voronoi tessellations in R. We started by drawing points on a circle and we finished by drawing mandalas like these ones. You can find the details of the workshop here (in Spanish). It was a wonderful experience to see the faces of the students while they generated their own mandalas.

In that case all the mandalas were empty, ready to be printed and coloured as my 7-year-old daughter does. In this experiment I colour them. These are the changes I have made to my previous code:

• Remove the external segments that intersect the boundary of the enclosing rectangle
• Convert the tessellation into a list of polygons with the `tile.list` function
• Use the `colourlovers` package to fill the polygons with beautiful colour palettes
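The three steps can be condensed into a sketch like the following, using random points instead of the mandala seeds and a hard-coded palette instead of one fetched with `colourlovers`:

```r
# Voronoi tessellation colored tile by tile with deldir
library(deldir)
set.seed(42)
x <- runif(50); y <- runif(50)
tess  <- deldir(x, y, rw = c(0, 1, 0, 1))  # rw clips tiles to the enclosing rectangle
tiles <- tile.list(tess)                   # one closed polygon per seed point
palette <- c("#E8DDCB", "#CDB380", "#036564", "#033649", "#031634")
plot(0:1, 0:1, type = "n", axes = FALSE, xlab = "", ylab = "", asp = 1)
for (tile in tiles)
  polygon(tile$x, tile$y, col = sample(palette, 1), border = "white")
```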

This is an example of the result:

By changing three simple parameters (`iter`, `points` and `radius`) you can obtain completely different images (by clicking on any image you can see its full-size version):

You can find details of these parameters in my previous post. I cannot resist placing more examples:

You can find the code here. Enjoy.

# Mandalas

Mathematics is a place where you can do things which you can’t do in the real world (Marcus Du Sautoy, mathematician)

From time to time I have a look at some of my previous posts: it's like seeing them through someone else's eyes. One of my first posts was this one, where I drew fractals using the Multiple Reduction Copy Machine (MRCM) algorithm. At that time I was not clever enough to write an efficient code able to generate deep fractals. Now I am pretty sure I could do it using `ggplot`, and I started to do it when I came across the idea of mixing this kind of fractal pattern with Voronoi tessellations, which I have explored in some of my previous posts, like this one. Mixing both techniques, the mandalas appeared.

I will not explain in depth the mathematics behind these patterns. I will just give a brief explanation:

• I start by obtaining `n` equidistant points on a unit circle centered at `(0,0)`
• I repeat the process with all these points, obtaining again `n` points around each of them; the radius is scaled by a factor
• I discard the previous (parent) `n` points

I repeat these steps iteratively. If I start with `n` points and iterate `k` times, at the end I obtain n^k points. After that, I calculate the Voronoi tessellation of them, which I represent with `ggplot`.
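The whole generation process fits in a few lines of base R if the points are represented as complex numbers (a sketch, not the post's actual code; the parameter names are mine):

```r
# n children per point, k iterations, radius shrunk by `factor` at each level
n <- 6; k <- 3; factor <- 0.4
unit   <- exp(2i * pi * (1:n) / n)  # n equidistant points on the unit circle
points <- unit
for (step in 1:(k - 1)) {
  # replace every parent by n children around it (parents are discarded)
  points <- as.vector(outer(points, factor^step * unit, `+`))
}
length(points)  # n^k seeds for the Voronoi tessellation
```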

This is an example:

Some others:

You can find the code here. Enjoy it.

# Fatal Journeys: Visualizing the Horror

In war, truth is the first casualty (Aeschylus)

I am not a pessimistic person. On the contrary, I always try to look on the bright side of life. I also believe that living conditions are better now than years ago, as these plots show. But reducing the complexity of our world to just six graphs is riskily simplistic. Our world is quite far from being a fair place, and one example is the immigration drama.

Last year there were 934 incidents around the world involving people looking for a better life, in which more than 5,300 people lost their lives or went missing, 60% of them in the Mediterranean. Around 8 out of every 100 were children.

The Missing Migrants Project tracks migrants, including refugees and asylum-seekers, who have died or gone missing along mixed migration routes worldwide. You can find a huge amount of figures, plots and information about this scourge on their website. You can also download there a historical dataset with information on all these fatal journeys, including location, number of dead or missing people and information source, from 2015 until today.

In this experiment I read the dataset and do some plots using `highcharter`; you can find a link to the R code at the end of the post.

This is the evolution of the number of dead or missing migrants in the process of migration towards an international destination, from January 2015 to December 2017:

The Mediterranean is the zone with the most incidents. To see it more clearly, this plot compares the Mediterranean with the rest of the world, grouping the previous zones:

Is there any pattern in the time series of Mediterranean incidents? To find out, I have done a LOESS decomposition of the time series:

Good news: the trend has been decreasing over the last 12 months. Regarding the seasonal component, incidents increase in April and May. Why? I don't know.
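The decomposition itself is a single call to `stl()` in base R; this sketch applies it to a simulated monthly series covering the same period (the figures below are invented, only the mechanics are real):

```r
# Simulated monthly counts, January 2015 to December 2017
set.seed(1)
deaths <- ts(500 + 100 * sin(2 * pi * (1:36) / 12) - 5 * (1:36) + rnorm(36, sd = 50),
             start = c(2015, 1), frequency = 12)
dec <- stl(deaths, s.window = "periodic")  # LOESS decomposition
colnames(dec$time.series)                  # seasonal, trend and remainder components
```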

This is a map of the locations of all incidents in 2017. By clicking on the markers you will find information about each incident:

Every one of us should try to make our world a better place. I don't really know how to do it, but I will try to make some experiments during this year to show that we have tons of work in front of us. Meanwhile, I hope this experiment is useful in giving visibility to this humanitarian disaster. If someone wants to use the code, the complete project is available on GitHub.

# Tiny Art in Less Than 280 Characters

Now that Twitter allows 280 characters, the code of some drawings I have made can fit in a tweet. In this post I have compiled a few of them.

The first one is a cardioid inspired by string art (more info here):

```r
library(ggplot2)
n=300
t1=1:n
t0=seq(3,2*n+1,2)%%n
t2=t0+(t0==0)*n
df=data.frame(x=cos((t1-1)*2*pi/n),
              y=sin((t1-1)*2*pi/n),
              x2=cos((t2-1)*2*pi/n),
              y2=sin((t2-1)*2*pi/n))
ggplot(df,aes(x,y,xend=x2,yend=y2)) +
  geom_segment(alpha=.1)+theme_void()
```

The next one draws a filled double spiral from the curve (√t·cos t, √t·sin t) and its point reflection:

```r
library(ggplot2)
library(dplyr)
t=seq(from=0, to=100*pi, length.out=500*100)
data.frame(x= t^(1/2)*cos(t), y= t^(1/2)*sin(t))%>%
  rbind(-.)%>%ggplot(aes(x, y))+geom_polygon()+theme_void()
```

This one tiles the plane with the Gauss error function of a difference of secants:

```r
library(dplyr)
library(ggplot2)
library(pracma)
seq(-5*pi,5*pi,by=.1)%>%expand.grid(x=., y=.)%>%
  ggplot(aes(x=x, y=y, fill=erf(sec(x)-sec(y))))+geom_tile()+
  theme_void()+theme(legend.position="none")
```

An x-y scatter plot of a trigonometric function on R² (more info here):

```r
library(dplyr)
library(ggplot2)
seq(from=-10, to=10, by = 0.05) %>%
  expand.grid(x=., y=.) %>%
  ggplot(aes(x=(x+pi*sin(y)), y=(y+pi*sin(x)))) +
  geom_point(alpha=.1, shape=20, size=1, color="black")+
  theme_void()
```

A squared spiral drawn with the `TurtleGraphics` package, turning 89.5 degrees at every step:

```r
library(TurtleGraphics)
turtle_init()
turtle_col("gray25")
turtle_do({
  for (i in 1:150) {
    turtle_forward(dist=1+0.5*i)
    turtle_right(angle=89.5)}
})
turtle_hide()
```

A harmonograph-like figure from two pairs of damped sinusoids, using base R graphics:

```r
t=seq(1, 100, by=.001)
plot(exp(-0.006*t)*sin(t*3.019+2.677)+
       exp(-0.001*t)*sin(t*2.959+2.719),
     exp(-0.009*t)*sin(t*2.964+0.229)+
       exp(-0.008*t)*sin(t*2.984+1.284),
     type="l", axes=FALSE, xlab="", ylab="")
```

And a chord diagram of a 20x20 matrix of ones, made with the `circlize` package:

```r
library(circlize)
chordDiagram(matrix(1, 20, 20), symmetric = TRUE,
             col="black", transparency = 0.85, annotationTrack = NULL)
```

Most of them are made with the `ggplot2` package. I love R and the sense of wonder at how just one or two lines of code can create beautiful and unexpected patterns.

I recently did this project for DataCamp to show how easy it is to create art with R and `ggplot2`. Starting from an extremely simple plot, and following a well-guided path, you can end up making beautiful images like this one:

Furthermore, you can also learn `ggplot2` while you do art.

I have done the project together with Rasmus Bååth, instructor at DataCamp and the perfect mate to work with. He is looking for people to build more projects, so if you are interested, here you can find more information. Do not hesitate to ask him for details.

All the best for 2018.

Merry Christmas.

# Drawing 10 Million Points With ggplot: Clifford Attractors

For me, mathematics cultivates a perpetual state of wonder about the nature of mind, the limits of thoughts, and our place in this vast cosmos (Clifford A. Pickover – The Math Book: From Pythagoras to the 57th Dimension, 250 Milestones in the History of Mathematics)

I am a big fan of Clifford Pickover and I find inspiration in his books very often. Thanks to him I discovered the harmonograph and Parrondo's paradox, among many other mathematical treasures. Apart from being a great teacher, he also invented a family of strange attractors bearing his name. Clifford attractors are defined by these equations:

$x_{n+1} = \sin(a\, y_{n}) + c\, \cos(a\, x_{n}) \\ y_{n+1} = \sin(b\, x_{n}) + d\, \cos(b\, y_{n})$

There are infinitely many attractors, since a, b, c and d are parameters. Given four values (one for each parameter) and a starting point `(x0, y0)`, the previous equations define the exact location of the point at step `n`, which depends only on its location at step `n-1`; an attractor can be thought of as the trajectory described by a particle. This plot shows the evolution of a particle starting at `(x0, y0)=(0, 0)` with parameters `a=-1.24458046630025`, `b=-1.25191834103316`, `c=-1.81590817030519` and `d=-1.90866735205054` along 10 million steps:

Changing parameters is really entertaining. Drawings have a sandy appearance:

From a technical point of view, the challenge is creating a data frame with all the locations, since it must have 10 million rows and must be populated sequentially. A very fast way to do it is with the `Rcpp` package. To render the plot I use `ggplot2`, which works quite well. Here you have the code to play with Clifford attractors if you want:

```r
library(Rcpp)
library(ggplot2)
library(dplyr)

opt = theme(legend.position  = "none",
panel.background = element_rect(fill="white"),
axis.ticks       = element_blank(),
panel.grid       = element_blank(),
axis.title       = element_blank(),
axis.text        = element_blank())

cppFunction('DataFrame createTrajectory(int n, double x0, double y0,
double a, double b, double c, double d) {
// create the columns
NumericVector x(n);
NumericVector y(n);
x[0]=x0;
y[0]=y0;
for(int i = 1; i < n; ++i) {
x[i] = sin(a*y[i-1])+c*cos(a*x[i-1]);
y[i] = sin(b*x[i-1])+d*cos(b*y[i-1]);
}
// return a new data frame
return DataFrame::create(_["x"]= x, _["y"]= y);
}
')

a=-1.24458046630025
b=-1.25191834103316
c=-1.81590817030519
d=-1.90866735205054

df=createTrajectory(10000000, 0, 0, a, b, c, d)

png("Clifford.png", units="px", width=1600, height=1600, res=300)
ggplot(df, aes(x, y)) + geom_point(color="black", shape=46, alpha=.01) + opt
dev.off()
```

# A Shiny App to Create Sentimental Tweets Based on Project Gutenberg Books

There was something about them that made me uneasy, some longing and at the same time some deadly fear – Dracula (Stoker, Bram)

Twitter is a very good source of inspiration. Some days ago I came across this:

The tweet refers to a presentation (in Spanish) available here, which is a very concise and well-illustrated document about the state of the art of text mining in R. I discovered there several libraries that I will try to use in the future. In this experiment I have used one of them: the `syuzhet` package. As can be read in its documentation:

this package extracts sentiment and sentiment-derived plot arcs from text using three sentiment dictionaries conveniently packaged for consumption by R users. Implemented dictionaries include `syuzhet` (default) developed in the Nebraska Literary Lab, `afinn` developed by Finn Arup Nielsen, `bing` developed by Minqing Hu and Bing Liu, and `nrc` developed by Mohammad, Saif M. and Turney, Peter D.

You can find a complete explanation of the package in its vignette. A very interesting application of these techniques is the Sentiment Graph of a book, which represents how sentiment changes over time. This is the Sentiment Graph of Romeo and Juliet, by William Shakespeare, taken from Project Alexandria:

The darkest sentiments can be seen at the end of the book, where the tragedy reaches its highest level. It is also nice to see how sentiments are cyclical. These graphs can be very useful for people who just want to read books with happy endings (my sister is one of those).

Inspired by this analysis, I have done another experiment in which I download a book from Project Gutenberg and measure the sentiment of all its sentences. Based on this measurement, I filter the top 5% (positive or negative sentiment) of sentences to build tweets. I have done a Shiny app where all these steps are explained. The app is available here.

From a technical point of view, I used the `selectize` JavaScript library to filter books in a flexible way. I also customized the appearance with a CSS Bootstrap theme from Bootswatch, as explained here.

This is the code of the experiment.

`UI.R`:

```r
library(shiny)

fluidPage(theme = "bootstrap.css",

  titlePanel(h1("Sentimental Tweets from Project Gutenberg Books", align="center"),
             windowTitle="Tweets from Project Gutenberg"),
  sidebarLayout(
    sidebarPanel(

      selectInput(
        'book', 'Choose a book:',
        multiple=FALSE,
        selectize = TRUE,
        choices=c("Enter some words of title or author" = "", gutenberg_works$searchstr)
      ),

      # Note: the original source arrived truncated here; the wrappers, inputIds and
      # labels of the following controls are reconstructed from how the inputs are
      # used in Server.R and Global.R
      radioButtons(inputId = "sent",
        label = "Choose sentiment:",
        choices = c("Dark"="1", "Bright"="20"),
        selected="1",
        inline=TRUE),

      radioButtons(inputId = "meth",
        label = "Choose a method to measure sentiment:",
        choices = c("syuzhet", "bing", "afinn", "nrc"),
        selected="syuzhet",
        inline=TRUE),

      radioButtons(inputId = "char",
        label = "Number of characters (max):",
        choices = list("140", "280"),
        inline=TRUE),

      checkboxInput(inputId = "auth",
        label = "Add author?",
        value=FALSE),

      checkboxInput(inputId = "titl",
        label = "Add title?",
        value=FALSE),

      checkboxInput(inputId = "post",
        label = "Add a link to this blog?",
        value=TRUE),

      textInput(inputId = "hash",
        label="Something else?",
        placeholder="Maybe a #hashtag?"),

      actionButton('do','Go!',
        class="btn btn-success action-button",
        css.class="btn btn-success")
    ),

    mainPanel(
      tags$br(),
      p("First of all, choose a book by entering some keywords of its
        title or author and navigating the dropdown. Books are
        catalogued", tags$a(href = "https://www.gutenberg.org/catalog/", "here.")),

      p("After that, choose the sentiment of the tweets you want to generate.
        There are four possible methods, which can return slightly different results.
        All of them assess the sentiment of each word of a sentence and sum up the
        results to give it a score. The more negative this score,
        the", em("darker"), "the sentiment. The more positive, the", em("brighter."),
        "You can find a nice explanation of these techniques",
        tags$a(href = "http://www.matthewjockers.net/2017/01/12/resurrecting/", "here.")),

      p("The next parameters are easy: you can add the title and author of the book the
        sentence is extracted from, as well as a link to my blog and any other string
        you want. After clicking on the lower button, you will get a tweet below in a
        few seconds. Click as many times as you want until you like the result."),

      p("Finally, copy, paste and tweet. ", strong("Enjoy it!")),
      tags$br(),
      tags$blockquote(textOutput("tweet1")),
      tags$br()

    )))
```

`Server.R`:

```r
library(shiny)

function(input, output) {

  values <- reactiveValues(default = 0)

  observeEvent(input$do, {
    values$default <- 1
  })

  book <- eventReactive(input$do, {
    # The original call was truncated; the last four arguments are reconstructed
    # to match the inputs defined in UI.R
    GetTweet(input$book, input$meth, input$sent, input$char,
             input$auth, input$titl, input$post, input$hash)
  })

  output$tweet1 <- renderText({
    if (values$default == 0) {
      "Your tweet will appear here ..."
    }
    else {
      book()
    }
  })
}
```

`Global.R`:

```r
library(gutenbergr)
library(dplyr)
library(stringr)
library(syuzhet)

x <- tempdir()

# Read the Project Gutenberg catalog and filter English works. I also create a column
# with title and author to make searches easier
gutenberg_metadata %>%
  filter(has_text, language=="en", gutenberg_id>0, !is.na(author)) %>%
  mutate(searchstr=ifelse(is.na(author), title, paste(title, author, sep= " - "))) %>%
  mutate(searchstr=str_replace_all(searchstr, "[\r\n]" , "")) %>%
  group_by(searchstr) %>%
  summarize(gutenberg_id=min(gutenberg_id)) %>%
  ungroup() %>%
  na.omit() %>%
  filter(str_length(searchstr)<100) -> gutenberg_works

# This function generates a tweet according to the UI settings (book, method, sentiment
# and number of characters). It also appends some optional strings at the end.
# The argument list arrived truncated; the last four arguments are reconstructed from
# how they are used in the body
GetTweet = function (string, method, sentim, characters, author, title, link, hastag)
{
  # Obtain gutenberg_id from book
  gutenberg_works %>%
    filter(searchstr == string) %>%
    select(gutenberg_id) %>% .$gutenberg_id -> result

  # Download text, divide into sentences and score sentiment. Save results to do it
  # once and optimize performance
  if(!file.exists(paste0(x,"/","book",result,"_",method,".RDS")))
  {
    book <- gutenberg_download(result)  # reconstructed: the download step was missing
    book[,2] %>%
      as.data.frame() %>%
      .$text %>%
      paste(collapse=" ") -> text

    sentences_v <- get_sentences(text)
    sentiment_v <- get_sentiment(sentences_v, method=method)
    data.frame(sentence=sentences_v, sentiment=sentiment_v) %>%
      mutate(length=str_length(sentence)) -> results
    saveRDS(results, paste0(x,"/","book",result,"_",method,".RDS"))
  }
  results <- readRDS(paste0(x,"/","book",result,"_",method,".RDS"))

  # Obtain title and author of the book to build the optional strings
  # (reconstructed: book_info was used but never defined in the original)
  book_info <- gutenberg_metadata %>% filter(gutenberg_id == result) %>% as.data.frame()

  # Paste optional strings to append at the end
  post=""
  if (title)  post=paste("-", book_info[,"title"], post, sep=" ")
  if (author) post=paste0(post, " (", str_trim(book_info[,"author"]), ")")
  if (link)   post=paste(post, "https://wp.me/p7VZWY-16S", sep=" ")
  post=paste(post, hastag, sep=" ")
  length_post=nchar(post)

  # Calculate 5% quantiles
  results %>%
    filter(length<=(as.numeric(characters)-length_post)) %>%
    mutate(sentiment=jitter(sentiment)) %>%
    mutate(group = cut(sentiment,
                       include.lowest = FALSE,
                       labels = FALSE,
                       breaks = quantile(sentiment, probs = seq(0, 1, 0.05)))) -> results

  # Obtain a sample sentence according to sentiment and append the optional string
  results %>%
    filter(group==as.numeric(sentim)) %>%
    sample_n(1) %>%
    select(sentence) %>%
    .$sentence %>%
    as.character() %>%
    str_replace_all("[.]", "") %>%
    paste(post, sep=" ") -> tweet

  return(tweet)
}
```

# Visualizing the Spanish Contribution to The Metropolitan Museum of Art

Well I walk upon the river like it’s easier than land
(Love is All, The Tallest Man on Earth)

The Metropolitan Museum of Art provides here a dataset with information on more than 450,000 artworks in its collection. You can do anything you want with these data: there are no restrictions on their use. Each record contains information about the author, title, type of work, dimensions, date, culture and geography of a particular piece.

I can imagine a bunch of things to do with these data, but since I am a big fan of highcharter, I have done a treemap, which is an artistic (as well as efficient) way to visualize hierarchical data. A treemap is useful to visualize frequencies. Treemaps can handle levels, allowing you to navigate into the detail of any category. Here you can find a good example of a treemap.

To read the data I use the `fread` function from the `data.table` package, which I also use for some data wrangling operations on the dataset. After that, I filter it looking for the word `SPANISH` in the columns `Artist Nationality` and `Culture`, and for the word `SPAIN` in the column `Country`. For me, any piece created by a Spanish artist (like this one), coming from Spanish culture (like this one) or from Spain (like this one) is Spanish (this is my very own definition and may not match any academic one). Once that is done, it is easy to extract some interesting figures:

• There are 5,294 Spanish pieces in The Met, which is 1.16% of the collection
• This percentage varies significantly between departments: it rises to 9.01% in The Cloisters and to 4.83% in The Robert Lehman Collection; on the other hand, it falls to 0.52% in The Libraries and to 0.24% in Photographs
• The Met is home to 1,895 highlights, and 44 of them (2.32%) are Spanish; this means that Spanish art is twice as prominent as could be expected (remember that it represents 1.16% of the entire collection)

My treemap represents the distribution of Spanish artworks by department (column `Department`) and type of work (column `Classification`). There are two important things to know before doing a treemap with `highcharter`:

• You have to use the `treemap` function from the `treemap` package to create, from your data frame, a list that will serve as input for the `hctreemap` function
• `hctreemap` fails if some category name is the same as any of its subcategories. To avoid this, make sure that all names are distinct.

This is the treemap:

Here you can see a full size version of it.

Several things can be seen at a glance: most of the pieces belong to the drawings and prints and the European sculpture and decorative arts departments (specifically, prints and textiles); there is also a big number of costumes; and arms and armor is a very fragmented department… I think the treemap is a good way to see what kinds of works The Met owns.

My favorite Spanish piece in The Met is the stunning Portrait of Juan de Pareja by Velázquez, which illustrates this post: how nice it would be to see it next to El Primo in El Museo del Prado!

Feel free to use my code to do your own experiments:

```r
library(data.table)
library(dplyr)
library(stringr)
library(highcharter)
library(treemap)

# Download MetObjects.csv from The Met's open access repository and read it
# (the download URL was missing in the original source)
file="MetObjects.csv"
# download.file(url, destfile=file, mode='wb')
data=fread(file)

# Modify column names to remove blanks
colnames(data)=gsub(" ", ".", colnames(data))

# Clean columns to prepare for searching
data[,`:=`(Artist.Nationality_aux=toupper(Artist.Nationality) %>% str_replace_all("\\[\\d+\\]", "") %>%
iconv(from='UTF-8', to='ASCII//TRANSLIT'),
Culture_aux=toupper(Culture) %>% str_replace_all("\\[\\d+\\]", "") %>%
iconv(from='UTF-8', to='ASCII//TRANSLIT'),
Country_aux=toupper(Country) %>% str_replace_all("\\[\\d+\\]", "") %>%
iconv(from='UTF-8', to='ASCII//TRANSLIT'))]

# Look for Spanish artworks
data[Artist.Nationality_aux %like% "SPANISH" |
Culture_aux %like% "SPANISH" |
Country_aux %like% "SPAIN"] -> data_spain

# Count artworks by Department and Classification
data_spain %>%
mutate(Classification=ifelse(Classification=='', "miscellaneous", Classification)) %>%
mutate(Department=tolower(Department),
Classification1=str_match(Classification, "(\\w+)(-|,|\\|)")[,2],
Classification=ifelse(!is.na(Classification1),
tolower(Classification1),
tolower(Classification))) %>%
group_by(Department, Classification) %>%
summarize(Objects=n()) %>%
ungroup %>%
mutate(Classification=ifelse(Department==Classification, paste0(Classification, "#"),
Classification)) %>%
as.data.frame() -> dfspain

# Do treemap without drawing
tm_dfspain <- treemap(dfspain, index = c("Department", "Classification"),
draw=F,
vSize = "Objects",
vColor = "Objects",
type = "index")

# Do highcharter treemap
hctreemap(
tm_dfspain,
allowDrillToNode = TRUE,
allowPointSelect = T,
levelIsConstant = F,
levels = list(
list(
level = 1,
dataLabels = list (enabled = T, color = '#f7f5ed', style = list("fontSize" = "1em")),
borderWidth = 1
),
list(
level = 2,
dataLabels = list (enabled = F,  align = 'right', verticalAlign = 'top',
style = list("textShadow" = F, "fontWeight" = 'light', "fontSize" = "1em")),
borderWidth = 0.7
)
)) %>%
hc_title(text = "Spanish Artworks in The Met") %>%
hc_subtitle(text = "Distribution by Department") -> plot

plot
```

# The Cycling Accident Map of Madrid City

Far away, this ship has taken me far away (Starlight, Muse)

Madrid City has an Open Data platform where around 300 datasets on a number of topics can be found. One of these sets is the one I used for this experiment. It contains information about cycling accidents that happened in the city from January to July 2017. I have done a map to locate where the accidents took place. This experiment shows how easy R makes it to create professional maps with Leaflet (in this case I use Carto basemaps).

To locate the accidents, the dataset only contains the address where they happened, so the first thing I did was obtain their geographical coordinates using the `geocode` function from the `ggmap` package. There were 431 accidents during the first 7 months of 2017 (such a big number!) and I got coordinates for 407, so I can locate 94% of the accidents.

Obviously, the amount of accidents in some place depends on how many bikers circulate there, as well as on its infrastructure. Neither of these things can be seen on the map: it only shows the number of accidents.

The categorization of accidents is:

• Double collision (Colisión doble): Traffic accident occurred between two moving vehicles.
• Multiple collision (Colisión múltiple): Traffic accident occurred between more than two moving vehicles.
• Fixed object collision (Choque con objeto fijo): Accident occurred between a moving vehicle with a driver and an immovable object that occupies the road or a separated area of the same, whether a parked vehicle, tree, street lamp, etc.
• Pedestrian collision (Atropello): Accident occurred between a vehicle and a pedestrian who occupies the road or travels along sidewalks, refuges, walks or zones of the public road not destined to the circulation of vehicles.
• Overturn (Vuelco): Accident suffered by a vehicle with more than two wheels which by some circumstance loses contact with the road and ends supported on one side or on its roof.
• Motorcycle fall (Caída motocicleta): Accident suffered by a motorcycle, which at some moment loses balance, because of the driver or due to the conditions of the road.
• Moped fall (Caída ciclomotor): Accident suffered by a moped, which at some moment loses balance, because of the driver or due to the conditions of the road.
• Bicycle fall (Caída bicicleta): Accident suffered by a bicycle, which at some moment loses balance, because of the driver or due to the conditions of the road.

These categories are redundant (e.g. double and multiple collision), difficult to understand (e.g. overturn) or both at the same time (e.g. motorcycle fall and moped fall). This categorization also ignores the human injuries involved in the accident.

Bearing all these things in mind, this is the map:

Here is a full-screen version of the map.

My suggestions to the city council of Madrid are:

1. Add geographical coordinates to the data (I guess many analyses will need them)
2. Add more cycling datasets to the platform (detail of bikeways, traffic…) to understand accidents better
3. Attending just to the number of accidents, put the focus around Parque del Retiro, especially on its west surroundings, from Plaza de Cibeles to Plaza de Carlos V: more warning signals, more (or better) bikeways…

I add the code below to update the map (if someone asks me, I can do it myself regularly):

```r
library(dplyr)
library(stringr)
library(ggmap)
library(leaflet)

# First, get the data (the download URL was missing in the original source;
# the semicolon separator is an assumption about the Madrid Open Data CSV)
# download.file(url, destfile="300110-0-accidentes-bicicleta.csv")
data=read.csv("300110-0-accidentes-bicicleta.csv", sep=";", stringsAsFactors=FALSE)

# Prepare data for geolocation
data %>%
  mutate(direccion=paste(str_trim(Lugar), str_trim(Numero), "MADRID, SPAIN", sep=", ") %>%
           str_replace("NA, ", "") %>%
           str_replace(" - ", " CON ")) -> data

# Geolocation (takes some time ...)
coords=c()
for (i in 1:nrow(data))
{
  coords %>% rbind(geocode(data[i,"direccion"])) -> coords
  Sys.sleep(0.5)
}

# Save data, just in case
data %>% cbind(coords) %>% saveRDS(file="bicicletas.RDS")
data=readRDS(file="bicicletas.RDS")

# Remove non-successful geolocations
data %>%
  filter(!is.na(lon)) %>%
  droplevels() -> data

# Build a date-time column and the popup contents
data %>% mutate(Fecha=paste0(as.Date(data$Fecha, "%d/%m/%Y"), " ", TRAMO.HORARIO),
                popup=paste0("<b>Dónde:</b>",
                             direccion,
                             "<b>Cuándo:</b>",
                             Fecha,
                             "<b>Qué pasó:</b>",
                             Tipo.Accidente)) -> data

# Do the map: one group of markers per accident type
data %>% split(data$Tipo.Accidente) -> data.df

# Initialize the map (reconstructed: this step was missing in the original source)
l <- leaflet() %>% addProviderTiles(providers$CartoDB.Positron)

names(data.df) %>%
  purrr::walk( function(df) {
    l <<- l %>%
      addCircleMarkers(data=data.df[[df]],
                       lng=~lon, lat=~lat,
                       popup=~popup,
                       color="red",
                       stroke=FALSE,
                       fillOpacity = 0.8,
                       group = df,
                       clusterOptions = markerClusterOptions(removeOutsideVisibleBounds = F))
  })

# The original code was cut off here; a layers control is a reasonable closing step
l %>%
  addLayersControl(overlayGroups = names(data.df),
                   options = layersControlOptions(collapsed = FALSE))
```