Women in Orchestras

I believe in the truth of fairy-tales more than I believe in the truth in the newspaper (Lotte Reiniger)

In my opinion, this graph is a visual demonstration that we live in a male chauvinist world.


In this experiment I download the members of ten top orchestras of the world with the amazing rvest package. After cleaning texts, I obtain the gender of names with genderizeR package as I did here. Since I only take into account names genderized with high probability, these numbers cannot be exact. Apart of this, the plot speaks by itself.

read_html("http://www.berliner-philharmoniker.de/en/orchestra/") %>%
html_nodes(".name") %>%
html_text(trim=TRUE) %>%
iconv("UTF-8") %>%
gsub("[\r,\n]"," ", .) %>%
gsub("\\s+", " ", .) %>%
paste(collapse=" ") %>%
findGivenNames() -> berliner
saveRDS(berliner, file="berliner.RDS")
read_html("https://www.concertgebouworkest.nl/en/musicians") %>%
html_nodes(".u-padding--b2") %>%
html_text(trim=TRUE) %>%
iconv("UTF-8") %>%
gsub("\\s+", " ", .) %>%
paste(collapse=" ") %>%
findGivenNames() -> rco
saveRDS(rco, file="rco.RDS")
read_html("http://www.philharmonia.spb.ru/en/about/orchestra/zkrasof/contents/") %>%
html_nodes(".td") %>%
html_text(trim=TRUE) %>%
iconv("UTF-8") %>%
gsub("[\r,\n]"," ", .) %>%
gsub("\\s+", " ", .) %>%
.[23] %>%
findGivenNames() -> spb
saveRDS(spb, file="spb.RDS")
read_html("http://ocne.mcu.es/conoce-a-la-ocne/orquesta-nacional-de-espana/componentes/") %>%
html_nodes(".col-main") %>%
html_text(trim=TRUE) %>%
iconv("UTF-8") %>%
gsub("[\r,\n]"," ", .) %>%
gsub("\\s+", " ", .) %>%
gsub("([[:lower:]])([[:upper:]][[:lower:]])", "\\1 \\2", .) %>%
findGivenNames() -> one
saveRDS(one, file="one.RDS")
read_html("http://www.gewandhausorchester.de/en/orchester/") %>%
html_nodes("#content") %>%
html_text(trim=TRUE) %>%
iconv("UTF-8") %>%
gsub("[\r,\n]"," ", .) %>%
gsub("\\s+", " ", .) %>%
findGivenNames() -> leipzig
saveRDS(leipzig, file="leipzig.RDS")
read_html("http://www.wienerphilharmoniker.at/orchestra/members") %>%
html_nodes(".ModSuiteMembersC") %>%
html_text(trim=TRUE) %>%
iconv("UTF-8") %>%
gsub("[\r,\n,\t,*]"," ", .) %>%
gsub("\\s+", " ", .) %>%
gsub("([[:lower:]])([[:upper:]][[:lower:]])", "\\1 \\2", .) %>%
paste(collapse=" ") %>%
.[-18] %>%
findGivenNames() -> wiener
saveRDS(wiener, file="wiener.RDS")
read_html("http://www.laphil.com/philpedia/orchestra-roster") %>%
html_nodes(".view-content") %>%
html_text(trim=TRUE) %>%
iconv("UTF-8") %>%
gsub("\\s+", " ", .) %>%
.[1] %>%
findGivenNames() -> laphil
saveRDS(laphil, file="laphil.RDS")
read_html("http://nyphil.org/about-us/meet/musicians-of-the-orchestra") %>%
html_nodes(".resp-tab-content-active") %>%
html_text(trim=TRUE) %>%
iconv("UTF-8") %>%
gsub("[\r,\n]"," ", .) %>%
gsub("\\s+", " ", .) %>%
findGivenNames() -> nyphil
saveRDS(nyphil, file="nyphil.RDS")
sapply(urls, function(x)
read_html(x) %>%
html_nodes(".clearfix") %>%
html_text(trim=TRUE) %>%
iconv("UTF-8") %>%
gsub("[\r,\n,\t,*]"," ", .) %>%
gsub("\\s+", " ", .)
}) %>% paste(., collapse=" ") %>%
findGivenNames() -> lso
saveRDS(lso, file="lso.RDS")
read_html("http://www.osm.ca/en/discover-osm/orchestra/musicians-osm") %>%
html_nodes("#content-column") %>%
html_text(trim=TRUE) %>%
iconv("UTF-8") %>%
gsub("[\r,\n]"," ", .) %>%
gsub("\\s+", " ", .) %>%
findGivenNames() -> osm
saveRDS(osm, file="osm.RDS")
rbind(c("berliner", "Berliner Philharmoniker"),
c("rco", "Royal Concertgebouw Amsterdam"),
c("spb", "St. Petersburg Philharmonic Orchestra"),
c("one", "Orquesta Nacional de España"),
c("leipzig", "Gewandhaus Orchester Leipzig"),
c("wiener", "Wiener Philarmoniker"),
c("laphil", "The Los Angeles Philarmonic"),
c("nyphil", "New York Philarmonic"),
c("lso", "London Symphony Orchestra"),
c("osm", "Orchestre Symphonique de Montreal")) %>% as.data.frame()-> Orchestras
colnames(Orchestras)=c("Id", "Orchestra")
list.files(getwd(),pattern = ".RDS") %>%
readRDS(x) %>% as.data.frame(stringsAsFactors = FALSE) %>% cbind(Id=gsub(".RDS", "", x))
) %>% do.call("rbind", .) -> all
all %>% mutate(probability=as.numeric(probability)) %>%
filter(probability > 0.9 & count > 15) %>%
filter(!name %in% c("viola", "tuba", "harp")) %>%
group_by(Id, gender) %>%
all %>% filter(gender=="female") %>% mutate(females=Total) %>% select(Id, females) -> females
all %>% group_by(Id) %>% summarise(Total=sum(Total)) -> total
inner_join(total, females, by = "Id") %>% mutate(po_females=females/Total) %>%
inner_join(Orchestras, by="Id")-> df
plot.background = element_rect(fill="gray85"),
panel.background = element_rect(fill="gray85"),
panel.grid.major.x=element_line(colour="white", size=2),
axis.title = element_blank(),
axis.line.y = element_line(size = 2, color="black"),
axis.text = element_text(colour="black", size=18),
plot.title = element_text(size = 35, face="bold", margin=margin(10,0,10,0), hjust=0))
ggplot(df, aes(reorder(Orchestra, po_females), po_females)) +
geom_bar(stat="identity", fill="darkviolet", width=.5)+
scale_y_continuous(labels = percent, expand = c(0, 0), limits=c(0,.52))+
geom_text(aes(label=sprintf("%1.0f%%", 100*po_females)), hjust=-0.05, size=6)+
ggtitle(expression(atop(bold("Women in Orchestras"), atop("% of women among members", "")))) +

11 thoughts on “Women in Orchestras

  1. You just proved that in ten of the world’s biggest orchestras there are more men than women. From there to stating that “we live in a male very chauvinist world” is a huge leap.

    1. Thanks for your comment. Attending only to numbers, it just shows that “in ten of the world’s biggest orchestras there are more men than women” but things have an explanation. I think male chauvinism can explain quite well why this happens. This was the meaning of the sentence you are regarding.

      1. So if I make the same plot but for kindergarden teachers can I then posit that we live in a female dominated world?

        I normally like your posts, but I think the gender-story here isn’t genuine and comes across to me like bandwaggoning.

  2. Yes, it’s been a long-standing problem in many music genres. Take a look at the number of female jazz performers, especially if you exclude vocalists. Take a look at the number of female rock band members. Now take a look at the ratio of female to male musicians in elementary school and high school bands/orchestras. (hint: the latter is 50 to 60% female). I remember that the Boston Symphony only started accepting a reasonable number of females after they instituted ‘blind’ auditions (candidate behind a screen) – AND made women take off their high heels! Von Karajan almost quit the Berlin Phil. when they refused to allow a couple of his woman proteges to join.

  3. Very interesting concept in the way you gather the data and visualize it. Thanks for posting. However, raw numbers do not prove the premise which seems to be what you were trying to do. Admittedly you may be correct but the data is not sufficient to prove it. Being observational data, there are far too many variables to draw a conclusion. There was a study done at UC Berkeley that showed men being admitted at higher numbers than women. It was claimed this was sexism. It turned out that when they broke it out by the college/major, men were applying for easier majors where more applicants were accepted. When this was factored in, they found a fairly substantial bias in favor of women.

    1. I agree the graph does not prove the premise. It just supports it. I sometimes wonder which variables would be needed to prove behind these numbers there are sexism. Imagine I obtain the gender of all applicants to Wiener Philarmoniker and I obtain only 10% are women: Would it mean there is no sexism inside the orchestra? Not at all. Maybe women don’t apply to enter because they know will never be accepted. What would you do to prove it? Thanks a lot for your comment

Leave a Reply

Your email address will not be published. Required fields are marked *