In my opinion, this graph is a visual demonstration that we live in a male chauvinist world.
In this experiment I download the member lists of ten of the world's top orchestras with the amazing rvest package. After cleaning the text, I obtain the gender of each given name with the genderizeR package, as I did here. Since I only take into account names genderized with high probability, these numbers cannot be exact. Apart from this, the plot speaks for itself.
setwd("YOUR WORKING DIRECTORY HERE")
library(rvest)
library(dplyr)
library(genderizeR)

# Scrape each orchestra's member page, clean the text, genderize the
# given names found in it and save the result to an RDS file
read_html("http://www.berliner-philharmoniker.de/en/orchestra/") %>%
  html_nodes(".name") %>%
  html_text(trim=TRUE) %>%
  iconv("UTF-8") %>%
  gsub("[\r,\n]"," ", .) %>%
  gsub("\\s+", " ", .) %>%
  paste(collapse=" ") %>%
  findGivenNames() -> berliner
saveRDS(berliner, file="berliner.RDS")

read_html("https://www.concertgebouworkest.nl/en/musicians") %>%
  html_nodes(".u-padding--b2") %>%
  html_text(trim=TRUE) %>%
  iconv("UTF-8") %>%
  gsub("\\s+", " ", .) %>%
  paste(collapse=" ") %>%
  findGivenNames() -> rco
saveRDS(rco, file="rco.RDS")

read_html("http://www.philharmonia.spb.ru/en/about/orchestra/zkrasof/contents/") %>%
  html_nodes(".td") %>%
  html_text(trim=TRUE) %>%
  iconv("UTF-8") %>%
  gsub("[\r,\n]"," ", .) %>%
  gsub("\\s+", " ", .) %>%
  .[23] %>%
  findGivenNames() -> spb
saveRDS(spb, file="spb.RDS")

read_html("http://ocne.mcu.es/conoce-a-la-ocne/orquesta-nacional-de-espana/componentes/") %>%
  html_nodes(".col-main") %>%
  html_text(trim=TRUE) %>%
  iconv("UTF-8") %>%
  gsub("[\r,\n]"," ", .) %>%
  gsub("\\s+", " ", .) %>%
  gsub("([[:lower:]])([[:upper:]][[:lower:]])", "\\1 \\2", .) %>%
  findGivenNames() -> one
saveRDS(one, file="one.RDS")

read_html("http://www.gewandhausorchester.de/en/orchester/") %>%
  html_nodes("#content") %>%
  html_text(trim=TRUE) %>%
  iconv("UTF-8") %>%
  gsub("[\r,\n]"," ", .) %>%
  gsub("\\s+", " ", .) %>%
  findGivenNames() -> leipzig
saveRDS(leipzig, file="leipzig.RDS")

read_html("http://www.wienerphilharmoniker.at/orchestra/members") %>%
  html_nodes(".ModSuiteMembersC") %>%
  html_text(trim=TRUE) %>%
  iconv("UTF-8") %>%
  gsub("[\r,\n,\t,*]"," ", .) %>%
  gsub("\\s+", " ", .) %>%
  gsub("([[:lower:]])([[:upper:]][[:lower:]])", "\\1 \\2", .) %>%
  paste(collapse=" ") %>%
  .[-18] %>%
  findGivenNames() -> wiener
saveRDS(wiener, file="wiener.RDS")

read_html("http://www.laphil.com/philpedia/orchestra-roster") %>%
  html_nodes(".view-content") %>%
  html_text(trim=TRUE) %>%
  iconv("UTF-8") %>%
  gsub("\\s+", " ", .) %>%
  gsub("(?% .[1] %>%
  findGivenNames() -> laphil
saveRDS(laphil, file="laphil.RDS")

read_html("http://nyphil.org/about-us/meet/musicians-of-the-orchestra") %>%
  html_nodes(".resp-tab-content-active") %>%
  html_text(trim=TRUE) %>%
  iconv("UTF-8") %>%
  gsub("[\r,\n]"," ", .) %>%
  gsub("\\s+", " ", .) %>%
  gsub("(?% findGivenNames() -> nyphil
saveRDS(nyphil, file="nyphil.RDS")

# The London Symphony Orchestra lists its players on four separate pages
urls=c("http://lso.co.uk/orchestra/players/strings.html",
       "http://lso.co.uk/orchestra/players/woodwind.html",
       "http://lso.co.uk/orchestra/players/brass.html",
       "http://lso.co.uk/orchestra/players/percussion-harps-and-keyboards.html")
sapply(urls, function(x) {
  read_html(x) %>%
    html_nodes(".clearfix") %>%
    html_text(trim=TRUE) %>%
    iconv("UTF-8") %>%
    gsub("[\r,\n,\t,*]"," ", .) %>%
    gsub("\\s+", " ", .)
}) %>%
  paste(., collapse=" ") %>%
  findGivenNames() -> lso
saveRDS(lso, file="lso.RDS")

read_html("http://www.osm.ca/en/discover-osm/orchestra/musicians-osm") %>%
  html_nodes("#content-column") %>%
  html_text(trim=TRUE) %>%
  iconv("UTF-8") %>%
  gsub("[\r,\n]"," ", .) %>%
  gsub("\\s+", " ", .) %>%
  findGivenNames() -> osm
saveRDS(osm, file="osm.RDS")

# Lookup table: file Id -> orchestra name shown in the plot
rbind(c("berliner", "Berliner Philharmoniker"),
      c("rco", "Royal Concertgebouw Amsterdam"),
      c("spb", "St. Petersburg Philharmonic Orchestra"),
      c("one", "Orquesta Nacional de España"),
      c("leipzig", "Gewandhaus Orchester Leipzig"),
      c("wiener", "Wiener Philharmoniker"),
      c("laphil", "The Los Angeles Philharmonic"),
      c("nyphil", "New York Philharmonic"),
      c("lso", "London Symphony Orchestra"),
      c("osm", "Orchestre Symphonique de Montreal")) %>%
  as.data.frame()-> Orchestras
colnames(Orchestras)=c("Id", "Orchestra")

# Read every saved RDS file and stack the results, tagging rows with the Id
list.files(getwd(),pattern = ".RDS") %>%
  lapply(function(x)
    readRDS(x) %>%
      as.data.frame(stringsAsFactors = FALSE) %>%
      cbind(Id=gsub(".RDS", "", x))
  ) %>%
  do.call("rbind", .) -> all

# Keep only names genderized with high probability, drop instrument names
# that were mistaken for given names, and count members by gender
all %>%
  mutate(probability=as.numeric(probability)) %>%
  filter(probability > 0.9 & count > 15) %>%
  filter(!name %in% c("viola", "tuba", "harp")) %>%
  group_by(Id, gender) %>%
  summarize(Total=n())->all

all %>%
  filter(gender=="female") %>%
  mutate(females=Total) %>%
  select(Id, females) -> females

all %>%
  group_by(Id) %>%
  summarise(Total=sum(Total)) -> total

# Share of women per orchestra
inner_join(total, females, by = "Id") %>%
  mutate(po_females=females/Total) %>%
  inner_join(Orchestras, by="Id")-> df

library(ggplot2)
library(scales)

opts=theme(legend.position="none",
           plot.background = element_rect(fill="gray85"),
           panel.background = element_rect(fill="gray85"),
           panel.grid.major.y=element_blank(),
           panel.grid.major.x=element_line(colour="white", size=2),
           panel.grid.minor=element_blank(),
           axis.title = element_blank(),
           axis.line.y = element_line(size = 2, color="black"),
           axis.text = element_text(colour="black", size=18),
           axis.ticks=element_blank(),
           plot.title = element_text(size = 35, face="bold", margin=margin(10,0,10,0), hjust=0))

ggplot(df, aes(reorder(Orchestra, po_females), po_females)) +
  geom_bar(stat="identity", fill="darkviolet", width=.5)+
  scale_y_continuous(labels = percent, expand = c(0, 0), limits=c(0,.52))+
  geom_text(aes(label=sprintf("%1.0f%%", 100*po_females)), hjust=-0.05, size=6)+
  ggtitle(expression(atop(bold("Women in Orchestras"), atop("% of women among members", "")))) +
  coord_flip()+opts
In your great chart, I also see a geographical bias. Thank you.
Thanks Antonio!
You just proved that in ten of the world’s biggest orchestras there are more men than women. From there to stating that “we live in a male chauvinist world” is a huge leap.
Thanks for your comment. Looking only at the numbers, the graph just shows that “in ten of the world’s biggest orchestras there are more men than women”, but things have an explanation. I think male chauvinism can explain quite well why this happens. That is what the sentence you are referring to meant.
So if I make the same plot but for kindergarten teachers, can I then posit that we live in a female-dominated world?
I normally like your posts, but I think the gender story here isn’t genuine and comes across to me like bandwagoning.
I appreciate your opinion a lot. I sincerely think there is a gender story behind these numbers. Look at this:
or this:
Maybe only men try to work in top orchestras. I don’t really think so.
Yes, it’s been a long-standing problem in many music genres. Take a look at the number of female jazz performers, especially if you exclude vocalists. Take a look at the number of female rock band members. Now take a look at the ratio of female to male musicians in elementary school and high school bands/orchestras. (hint: the latter is 50 to 60% female). I remember that the Boston Symphony only started accepting a reasonable number of females after they instituted ‘blind’ auditions (candidate behind a screen) – AND made women take off their high heels! Von Karajan almost quit the Berlin Phil. when they refused to allow a couple of his woman proteges to join.
Very interesting concept in the way you gather the data and visualize it. Thanks for posting. However, raw numbers do not prove the premise which seems to be what you were trying to do. Admittedly you may be correct but the data is not sufficient to prove it. Being observational data, there are far too many variables to draw a conclusion. There was a study done at UC Berkeley that showed men being admitted at higher numbers than women. It was claimed this was sexism. It turned out that when they broke it out by the college/major, men were applying for easier majors where more applicants were accepted. When this was factored in, they found a fairly substantial bias in favor of women.
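The effect described in the comment above (a gap in the aggregate that shrinks or reverses once the data are split by department) can be reproduced with a few lines of R. This is only a minimal sketch: the department names and every figure in it are invented for illustration and are not the actual Berkeley numbers.

```r
# Hypothetical admissions figures, invented purely for illustration
admissions <- data.frame(
  dept     = c("A", "A", "B", "B"),
  gender   = c("men", "women", "men", "women"),
  applied  = c(800, 200, 200, 800),
  admitted = c(480, 130, 20, 100)
)

# Admission rate within each department, by gender
by_dept <- transform(admissions, rate = admitted / applied)

# Overall admission rate by gender, ignoring the department
overall <- aggregate(cbind(applied, admitted) ~ gender,
                     data = admissions, FUN = sum)
overall$rate <- overall$admitted / overall$applied

print(by_dept)
print(overall)
```

With these made-up numbers women are admitted at a higher rate than men inside each department, yet at a lower rate overall, simply because most of them apply to the department that accepts fewer applicants.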
I agree the graph does not prove the premise. It just supports it. I sometimes wonder which variables would be needed to prove that there is sexism behind these numbers. Imagine I obtain the gender of all applicants to the Wiener Philharmoniker and find that only 10% are women: would it mean there is no sexism inside the orchestra? Not at all. Maybe women don’t apply because they know they will never be accepted. What would you do to prove it? Thanks a lot for your comment.