In this post I want to show another public data source related to the automotive industry. OICA, the International Organization of Motor Vehicle Manufacturers, publishes a series of statistics on its website, including sales and production statistics. The statistics cover manufacturers worldwide and include passenger as well as commercial vehicles.
OICA production statistics can be accessed here: http://www.oica.net/category/production-statistics/
I scraped OICA’s website for production statistics and combined all values into a single table containing production output by country from 2005 to 2018.
In the R code below I read in the data from an .xls file and visualize the statistics for selected countries. The packages used are readxl, dplyr and ggplot2. In combination with dplyr I use the grepl function to keep only certain countries.
library(readxl)
library(dplyr)
# read the scraped OICA table from the Excel file
data_df = as.data.frame(read_xls("oica.xls"))
# keep observations from 2005 onwards
data_df = dplyr::filter(data_df, year >= 2005)
head(data_df)
## year country total
## 1 2018 Argentina 466649
## 2 2018 Austria 164900
## 3 2018 Belgium 308493
## 4 2018 Brazil 2879809
## 5 2018 Canada 2020840
## 6 2018 China 27809196
library(ggplot2)
# keep only the four countries to be compared
data_df = dplyr::filter(data_df, grepl('Germany|China|Japan|USA', country))
# stacked columns of annual output in thousands of units, one colour per country
ggplot(data_df) + geom_col(mapping=aes(x=year,y=total/1000,fill=country)) +
  scale_fill_manual(values = c(Germany="red",
                               USA = "black",
                               China = "orange",
                               Japan= "blue")) +
  labs(title="annual passenger car and commercial vehicle production output",
       subtitle="data by OICA, for 2005 - 2018 (Germany, Japan, USA, China)") +
  xlab("year") +
  ylab("production output [thousands of units]")
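A short note on the country filter: grepl matches a pattern anywhere inside the country name, so it keeps partial matches as well. The sketch below illustrates this with a small made-up data frame (toy_df and its values are hypothetical, not OICA data) and compares the regex filter with an exact-match filter using %in%. It is only an illustration of the difference, not a replacement for the code above.

```r
library(dplyr)

# Hypothetical toy data in the same shape as the OICA table (values are made up)
toy_df <- data.frame(
  year    = c(2018, 2018, 2018),
  country = c("Germany", "South Korea", "Austria"),
  total   = c(100, 200, 300),
  stringsAsFactors = FALSE
)

# Regex filter: keeps every row whose country name contains "Germany" or "Korea"
by_regex <- dplyr::filter(toy_df, grepl("Germany|Korea", country))

# Exact-match filter: keeps only rows whose country equals one of the given names
by_exact <- dplyr::filter(toy_df, country %in% c("Germany", "Korea"))

print(by_regex$country)
print(by_exact$country)
```

For the four countries used in this post both approaches should select the same rows, provided the names in the table are spelled exactly "Germany", "China", "Japan" and "USA". The exact-match variant is the safer choice when one country name may be contained in another.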
Data scientist focusing on simulation, optimization and modeling in R, SQL, VBA and Python