R in your organization

French elections 2012 - quantitative analysis

Let’s demonstrate R’s capabilities with the French presidential election of 2012.

The data come from the website of the French Ministry of the Interior.

The data are published as HTML pages, so we use R functions to download all the web pages and extract the data (getURL from the package ’RCurl’ and readHTMLTable from the package ’XML’).
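
The download step could be sketched roughly as follows. getURL, htmlParse and readHTMLTable are existing functions of the two packages named above; the helper names parse_results_page and fetch_town_results are hypothetical, and the page URLs are left to the reader since the layout of the ministry's site is not described here.

```r
# Sketch only: helper names are hypothetical, and the real result pages
# may need extra cleaning (encoding, choosing the right table on the page).

# Parse every <table> found in one HTML page into a list of data frames.
parse_results_page <- function(html) {
  doc <- XML::htmlParse(html, asText = TRUE)
  XML::readHTMLTable(doc, stringsAsFactors = FALSE)
}

# Download one result page and parse its tables.
fetch_town_results <- function(url) {
  html <- RCurl::getURL(url)
  parse_results_page(html)
}

# Looping over a vector of town page URLs (to be built by the reader):
# all_results <- lapply(town_urls, fetch_town_results)
```

Keeping the parsing separate from the download makes it possible to test the extraction on a saved page before launching tens of thousands of requests.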

This allows us to download the data for 36000+ French towns.

Example for one town (figures are numbers of votes):

LE_PEN 126

(8 candidates + abstention) in the first round + (2 candidates + abstention) in the second one = 12 values per town; 12 * 36 000 towns = about 430 000 data points.
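
The same count, written as a few lines of R (the variable names are only illustrative, and 36 000 is the rounded number of towns used above):

```r
values_per_town <- (8 + 1) + (2 + 1)   # first round + second round
n_towns <- 36000                       # rounded number of towns
values_per_town * n_towns              # 432000
```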

This is not huge, but it is already too much to handle in Excel.

  • Histograms of towns

One of the major problems of this study is that the data are per town, and town sizes vary dramatically, so we cannot use the "standard" functions. We have to weight every computation by the weight of each town (the number of people who voted there).
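
A minimal illustration of why the weighting matters, using base R's weighted.mean. The three towns and their figures are made up for the example; they are not taken from the election data.

```r
# Hypothetical figures for three towns of very different sizes.
towns <- data.frame(
  voters = c(100, 1000, 100000),   # number of people who voted
  share  = c(0.60, 0.50, 0.20)     # share of the vote for one candidate
)

# Unweighted mean: every town counts the same, whatever its size.
mean(towns$share)                              # about 0.43

# Weighted mean: each town counts in proportion to its voters,
# which gives the candidate's share among all voters.
weighted.mean(towns$share, w = towns$voters)   # about 0.20
```

The unweighted figure is driven by the two small towns, while the weighted one reflects what the voters as a whole actually did.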

Town sizes do not follow a simple Gaussian distribution, but rather a distribution close to an exponential one.

After taking the log of the number of voters, the distribution looks close to a normal one, which means that town sizes are roughly log-normal.
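
The effect of the log transform can be sketched with simulated sizes. The values below are drawn from a log-normal law with arbitrary parameters; they only stand in for the real numbers of voters.

```r
set.seed(1)
# Simulated town sizes (not the real data), drawn from a log-normal law.
voters <- round(rlnorm(36000, meanlog = 6, sdlog = 1.3)) + 1

# Histogram on the raw scale and on the log scale (computed, not drawn).
h_raw <- hist(voters, breaks = 50, plot = FALSE)
h_log <- hist(log10(voters), breaks = 50, plot = FALSE)

# plot(h_raw); plot(h_log)   # draw the two histograms
```

On the raw scale most towns fall into the first few bars; on the log scale the bars spread out into a roughly bell-shaped curve.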

  • Principal component analysis (PCA): two examples with R
    36 000+ towns on the PCA graph (first two dimensions)
    The display shows its limits, but it can still be used to identify "extreme" towns, which vote almost exclusively for one of the candidates.
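
A PCA of this kind can be sketched with base R's prcomp. The vote shares below are simulated for 200 towns and 3 candidates, and the distance from the origin is only one possible way to flag "extreme" towns; the original analysis may have proceeded differently.

```r
set.seed(1)
# Simulated vote shares for 200 towns and 3 candidates (not the real data).
raw <- matrix(runif(200 * 3), ncol = 3,
              dimnames = list(NULL, c("cand_A", "cand_B", "cand_C")))
shares <- raw / rowSums(raw)          # each row sums to 1

pca <- prcomp(shares, center = TRUE, scale. = TRUE)
summary(pca)                          # variance explained by each component
scores <- pca$x[, 1:2]                # town coordinates on the first two axes

# Towns far from the origin of the PCA plane are the "extreme" ones.
dist_origin <- sqrt(rowSums(scores^2))
extreme <- order(dist_origin, decreasing = TRUE)[1:5]

# plot(scores); points(scores[extreme, , drop = FALSE], col = "red")
```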
  • Correlograms

A correlogram is a simple means of visualizing a correlation matrix: positive correlations in blue, negative ones in red.
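
A bare-bones version of such a display can be built in base R with cor and a red-white-blue palette. The data are simulated and the colour mapping is one possible choice; dedicated packages offer more polished correlograms.

```r
set.seed(1)
# Simulated data (not the real results): x2 follows x1, x3 goes against it.
x1 <- rnorm(500)
d <- data.frame(x1 = x1,
                x2 = x1 + rnorm(500),
                x3 = -x1 + rnorm(500))

r <- cor(d)                                   # correlation matrix

# Red for -1, white for 0, blue for +1.
pal <- colorRampPalette(c("red", "white", "blue"))(201)
cell_col <- matrix(pal[round((r + 1) * 100) + 1], nrow = nrow(r),
                   dimnames = dimnames(r))

# image(1:3, 1:3, r, col = pal, zlim = c(-1, 1))   # quick base-R display
```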

  • Correlograms with social factors

The French institute of statistics (INSEE) releases numerous data on social factors for each town (inhabitants' ages, jobs, unemployment, salaries, etc.).
R helps us compute and plot the correlation matrices.
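
Joining the two sources and computing a correlation weighted by town size might look like the sketch below. The tables, column names and figures are hypothetical; merge and cov.wt are base R functions, and using the number of voters as the weight is an assumption in line with the weighting described earlier.

```r
# Hypothetical tables, both keyed by a town code (made-up values).
votes <- data.frame(
  code   = c("A", "B", "C", "D"),
  voters = c(200, 1500, 800, 60000),
  share  = c(0.35, 0.30, 0.28, 0.15)      # one candidate's vote share
)
social <- data.frame(
  code         = c("A", "B", "C", "D"),
  unemployment = c(0.12, 0.10, 0.09, 0.07)
)

both <- merge(votes, social, by = "code")

# Correlation matrix weighted by the number of voters in each town.
wcor <- cov.wt(both[, c("share", "unemployment")],
               wt = both$voters, cor = TRUE)$cor
wcor
```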

Correlogram by jobs
Correlogram by degree obtained
Correlogram by salary
  • Cartograms

Cartograms show towns with an area proportional to their number of inhabitants. This type of graph is essential when a study depends on the number of people rather than on geography (customer sales, for instance). R loads the cartogram and colors it in proportion to the results.
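
The colouring step can be sketched independently of the map itself. The helper share_to_colour and the white-to-dark-red ramp are illustrative choices, and cartogram_polygons stands for a polygon layer that would be loaded with a spatial package; none of these names come from the original study.

```r
# Map values in [0, 1] to a colour ramp (white = low, dark red = high).
share_to_colour <- function(share, n = 100) {
  pal <- colorRampPalette(c("white", "darkred"))(n)
  pal[findInterval(share, seq(0, 1, length.out = n + 1),
                   rightmost.closed = TRUE, all.inside = TRUE)]
}

town_cols <- share_to_colour(c(0, 0.25, 0.5, 1))

# With a polygon layer loaded by a spatial package, the colours
# would then be passed on, e.g. plot(cartogram_polygons, col = town_cols)
```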

Cartogram of France
The area of each town is proportional to its number of inhabitants.
Cartogram of non-voters
Non-voters are concentrated in cities.
Hollande’s results
Hollande scores best in the South-West and in towns, and less well in the western suburbs of Paris (Sarkozy’s stronghold), in Alsace and in the South-East.
Sarkozy’s results
Le Pen’s results
Le Pen’s results are better in the countryside and in the South-East.