Tag Archives: tidyverse

Allusions to parents in autobiographies (or reading 118 books in a few seconds)

If I keep holding out, will the light shine through? (Come Back, Pearl Jam)

Imagine that you are writing the story of your life. You will almost surely make allusions to your parents, but will both of them have the same prominence in your biography, or will you spend more words on one of them? In that case, which one will be more prominent: your father or your mother?

This experiment analyses 118 autobiographies from Project Gutenberg and counts how many times the authors allude to their fathers and mothers. This is what I’ve done:

  • Download all works from Project Gutenberg containing the word autobiography in their title (there are 118 in total).
  • Count how many times the bigrams my father and my mother appear in each text. These are what I call allusions to father and mother, respectively.
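
As a rough sketch of the counting step, here is a base R version. It is not the code linked in this post, which relies on tidytext; the function name count_bigram and the sample sentences are made up for illustration:

```r
# Count occurrences of a two-word phrase in a text given as a vector of lines.
# count_bigram is a hypothetical helper, not part of any package.
count_bigram <- function(text, bigram) {
  # Join the lines, lower-case everything and collapse whitespace so that
  # a bigram split across a line break is still found
  clean <- gsub("\\s+", " ", tolower(paste(text, collapse = " ")))
  pattern <- paste0("\\b", bigram, "\\b")
  hits <- gregexpr(pattern, clean, perl = TRUE)[[1]]
  # gregexpr returns -1 when there is no match at all
  if (hits[1] == -1) 0L else length(hits)
}

# Made-up sample: the second "my father" is split across two lines
sample_text <- c(
  "My father was a sailor and my mother taught school. Later my",
  "father wrote to her every week."
)

count_bigram(sample_text, "my father")  # 2
count_bigram(sample_text, "my mother")  # 1
```

Applying a function like this to every downloaded book gives one pair of counts per work.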

The number of allusions I measure is a lower bound on the true number, since the calculation has some limitations:

  • The author may refer to them by name.
  • After referring to them as my father or my mother, subsequent sentences may refer to them as he or she.

Anyway, I think these constraints do not introduce any bias into the calculation, since they probably affect fathers and mothers equally. Here you can find the dataset I created after downloading all autobiographies and measuring the number of allusions to each parent.

Some results:

  • 64% of autobiographies have more allusions to the father than the mother.
  • 24% of autobiographies have more allusions to the mother than the father.
  • 12% allude to both equally.

Most of the works make more allusions to the father than to the mother. As visual evidence, the next plot is a histogram of the difference between the number of allusions to father and mother across the 118 works (# allusions to father - # allusions to mother):

The distribution is clearly right-skewed, which supports the previous results. Another way to see this is the last plot, a scatter plot that places each autobiography according to the number of allusions to the father (X-axis) and to the mother (Y-axis). It is interactive, so you can navigate through it to see the details of each point (work):

Most of the points (works) are below the diagonal, which means that they contain more allusions to father than mother. Here you can find a full version of the previous plot.
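
The three shares and the variable behind the histogram can be computed from a table with one row per work. A minimal sketch with made-up counts (the data frame allusions and its five rows are hypothetical, not the dataset of this post):

```r
# Made-up counts for five hypothetical works (NOT the real dataset)
allusions <- data.frame(
  work   = c("A", "B", "C", "D", "E"),
  father = c(40L, 12L, 7L, 0L, 25L),
  mother = c(22L, 30L, 7L, 0L, 10L)
)

# Variable shown in the histogram: positive means more allusions to the father
allusions$difference <- allusions$father - allusions$mother

# Classify each work by which parent is alluded to more often
allusions$leader <- ifelse(allusions$difference > 0, "father",
                    ifelse(allusions$difference < 0, "mother", "equal"))

# Percentage of works in each group
shares <- round(100 * prop.table(table(allusions$leader)))
shares
```

Works with a positive difference are the ones that fall below the diagonal of the scatter plot.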

I don’t have any explanation for this fact, just some simple hypotheses:

  • Fathers and mothers influence their children differently.
  • Fathers star in more anecdotes than mothers.
  • This is the effect of patriarchy (72% of the authors were born in the 19th century).

Whatever the explanation is, this experiment shows how easy it is to do text mining with R. Special mention to purrr (to iterate efficiently over the set of work IDs), tidytext (to count the number of appearances of bigrams), highcharter (to do the interactive plot) and gutenbergr (to download the books). You can find the code here.

Flowers for Julia

Don’t talk about the future, it’s an illusion when Rock & Roll conquered my heart (El Rompeolas, Loquillo y los Trogloditas)

In this post I create flowers inspired by the Julia sets, a family of fractal sets obtained by repeatedly applying a holomorphic function to complex numbers. Despite this ugly definition, the mechanism to create them is quite simple:

  • Take a grid of complex numbers whose real and imaginary parts both range between -2 and 2.
  • Take a function of the form f(z)=z^{n}+c, choosing the parameters n and c.
  • Iterate the function over the complex numbers several times. In other words: apply the function to each complex number, apply it again to the output, and repeat this process a number of times.
  • Calculate the modulus of the resulting number.
  • Represent the initial complex number in a scatter plot where the x-axis corresponds to the real part and the y-axis to the imaginary one. Color the point depending on the modulus of the resulting number after applying the function f(z) iteratively.
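
The steps above can be sketched in a few lines of base R. This is a minimal sketch, not the code linked in this post: the function name julia_modulus, the small default grid and the way Reduce chains the iterations are assumptions made for illustration, while the default exponent, constant and number of iterations are borrowed from the image described in this post:

```r
# Hypothetical helper: returns the grid plus the modulus after iterating f
julia_modulus <- function(n = 5, c0 = 0.364716021116823,
                          iterations = 7, points = 201) {
  # Grid of complex numbers with real and imaginary parts in [-2, 2]
  axis <- seq(-2, 2, length.out = points)
  grid <- expand.grid(re = axis, im = axis)
  z0 <- complex(real = grid$re, imaginary = grid$im)

  # f(z) = z^n + c, applied to the whole vector at once
  f <- function(z) z^n + c0

  # Apply f 'iterations' times; the second argument of the lambda is ignored
  z <- Reduce(function(acc, step) f(acc), seq_len(iterations), z0)

  # Points that escape quickly may overflow to non-finite values
  data.frame(x = grid$re, y = grid$im, modulus = Mod(z))
}

flower <- julia_modulus()
head(flower)
```

The resulting data frame can be passed to a scatter plot with x and y as coordinates and a colour scale built from modulus. Non-finite values need to be capped or dropped first.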

This image corresponds to a grid of 9 million points and 7 iterations of the function f(z)=z^{5}+0.364716021116823:

To color the points, I pick a random palette from the top list of the COLOURlovers site using the colourlovers package. Since each flower involves a huge number of calculations, I use Reduce to run the iterations efficiently. More examples:

There are two little Julias in the world to whom I would like to dedicate this post. I wish them all the best in the world and I am sure they will discover the beauty of mathematics. These flowers are yours.

The code is available here.

Crochet Patterns

Just look at how bodies wear out! (Pilar, my beloved grandmother)

My grandmother was a master of sewing. When she was young, she worked as a dressmaker, and her profession became a hobby with the passage of time. I remember her doing cross-stitch, embroidering tablecloths and doing crochet. I have some of her artworks at home. She spent many hours patiently in silence, moving her knitting needles: my grandmother never used to get bored. As she did with her threads, this drawing is done by linking lines:

You can find the code here. If you check it, you will see that the stitches of the drawings are defined by a function that I called pattern, which depends on some parameters that I define randomly. This is why you will get a different drawing each time you run it:

On the technical side, I used the accumulate function from the purrr package, which lets me write the loop as a single compact call.
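
To illustrate what accumulate does, here is a minimal sketch in which every point of a path is computed from the previous one. The step function next_point and its angle and shrink parameters are hypothetical; they are not the pattern function used for the drawings above:

```r
library(purrr)

# Hypothetical step: from the current point, move in a direction that
# rotates with i, using a stride that shrinks a little at every step
next_point <- function(point, i, angle = 2.4, shrink = 0.99) {
  point + shrink^i * c(cos(i * angle), sin(i * angle))
}

# accumulate keeps every intermediate result: the start plus 200 new points
path <- accumulate(1:200, next_point, .init = c(0, 0))

# Bind the list of (x, y) pairs into a data frame ready for plotting
stitches <- as.data.frame(do.call(rbind, path))
names(stitches) <- c("x", "y")
head(stitches)
```

Joining consecutive rows of stitches with lines gives a drawing made of linked segments. Changing angle and shrink changes its shape.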

These drawings remind me of those I created here, imitating the way that plants arrange their leaves. If you are interested in using R to create art, check out this free DataCamp project.

Tweetable Mathematical Art With R

Without that weight there is no more gravity
Without gravity there is no more hook
(Mira cómo vuelo, Miss Caffeina)

I love messing around with R to generate mathematical patterns. I always get surprised doing it and it gives me a lot of satisfaction. I also learn a lot of things doing it: not only about R, but also about mathematics. It is one of my favourite hobbies. Some time ago, I published this post showing some drawings, each of them generated with less than 280 characters of code, to be shared on Twitter. That post ended up on Hacker News, which caused an incredible peak in visits to my blog. Some comments in the Hacker News entry are very interesting.

This summer I delved into this concept of Tweetable Art, publishing several drawings together with the R code to generate them. In this post I will show some.

Vertiginous Spiral

I came up with this image inspired by this nice pattern. It is a pattern in the style of turtle graphics, but instead of drawing lines I use geom_polygon to colour the resulting image in black and white:

Code:

library(tidyverse)
df <- data.frame(x=0, y=0)
for (i in 2:500){
  df[i,1] <- df[i-1,1]+((0.98)^i)*cos(i)
  df[i,2] <- df[i-1,2]+((0.98)^i)*sin(i)   
}
ggplot(df, aes(x,y)) + 
  geom_polygon()+
  theme_void()
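
Each point of the loop is the previous point plus a step of length 0.98^i in direction i, so the coordinates are just running sums. As a sketch of an equivalent, vectorised way to build the same data (the names steps, stride and spiral are chosen here for illustration):

```r
# Same points as the loop: start at (0, 0), then add up the 499 steps
steps  <- 2:500
stride <- 0.98^steps            # step length shrinks geometrically
spiral <- data.frame(
  x = c(0, cumsum(stride * cos(steps))),
  y = c(0, cumsum(stride * sin(steps)))
)
head(spiral)
```

Passing spiral instead of df to the ggplot call should draw the same polygon.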

Slight modifications of the code can generate appealing patterns like this:

Marine Creature

A combination of sines and cosines. It reminds me of a jellyfish:

Code:

library(tidyverse)
seq(from=-10, to=10, by = 0.05) %>%
  expand.grid(x=., y=.) %>%
  ggplot(aes(x=(x^2+pi*cos(y)^2), y=(y+pi*sin(x)))) +
  geom_point(alpha=.1, shape=20, size=1, color="black")+
  theme_void()+coord_fixed()

Summoning Cthulhu

The name is inspired by an answer from Mara Averick to this tweet. It is a modification of the marine creature in polar coordinates:

Code:

library(tidyverse)
seq(-3,3,by=.01) %>%
  expand.grid(x=., y=.) %>%
  ggplot(aes(x=(x^3-sin(y^2)), y=(y^3-cos(x^2)))) +
  geom_point(alpha=.1, shape=20, size=0, color="white")+
  theme_void()+
  theme(panel.background = element_rect(fill="black"))+
  coord_polar() # polar coordinates; a coord_fixed() before it would just be replaced

Naive Sunflower

Sunflowers arrange their seeds according to a mathematical pattern called phyllotaxis, which inspires this image. If you want to create your own flowers, you can do this DataCamp project. It’s free and will introduce you to the amazing world of ggplot2, my favourite package to create images:

Code:

library(ggplot2)
a=pi*(3-sqrt(5))
n=500
ggplot(data.frame(r=sqrt(1:n),t=(1:n)*a),
       aes(x=r*cos(t),y=r*sin(t)))+
  geom_point(aes(x=0,y=0),
             size=190,
             colour="violetred")+
  geom_point(aes(size=(n-r)),
             shape=21,fill="gold",
             colour="gray90")+
  theme_void()+theme(legend.position="none")

Silk Knitting

It is inspired by this other pattern. A lot of almost transparent white points undulating according to sines and cosines on a dark coloured background:

Code:

library(tidyverse)
seq(-10, 10, by = .05) %>%
  expand.grid(x=., y=.) %>%
  ggplot(aes(x=(x+sin(y)), y=(y+cos(x)))) +
  geom_point(alpha=.1, shape=20, size=0, color="white")+
  theme_void()+
  coord_fixed()+
  theme(panel.background = element_rect(fill="violetred4"))

Try to modify them and generate your own patterns: it is a very fun way to learn R.

Note: to make them more readable, some of the pieces of code above may have more than 280 characters, but by removing unnecessary characters (blanks or carriage returns) you can reduce them to make them tweetable.