Years that movies are set in#
This directory aims to visualize the years that movies are set in vs. when they are produced/released.
Motivation#
I was curious about the balance between movies set in the future and movies set in the past. Luckily, people on Wikipedia have tagged movies both by when they were produced/released and by the years, decades or centuries they are set in.
For example, Category:Films_set_in_1945 contains a list of movies set in 1945, while Category:2022_films contains a list of movies from 2022.
Requirements#
Python & Jupyter were used.
To install the dependencies, run this from the command line:
pip install -r requirements.txt
Data#
The data are scraped with Wikipedia-API using category pages.
The produced years are from Category:Films_by_year.
The set-in years are from 3 categories:
Category:Films_by_century_of_setting
Category:Films_by_decade_of_setting
Category:Films_by_year_of_setting
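As an illustration of how a set-in value could be read off a category title, here is a minimal sketch. It is not taken from scrape.ipynb: the function name `parse_setting_category` is hypothetical, and the title patterns ("Films set in 1945", "Films set in the 1940s", "Films set in the 19th century") are assumptions about how the subcategories are named. Titles that fit none of the patterns are skipped.

```python
import re

# Assumed title patterns for the three kinds of set-in subcategories.
_PATTERNS = [
    ("year", re.compile(r"^Category:Films set in (\d{1,4})$")),
    ("decade", re.compile(r"^Category:Films set in the (\d{1,4})s$")),
    ("century", re.compile(r"^Category:Films set in the (\d{1,2})(?:st|nd|rd|th) century$")),
]


def parse_setting_category(title):
    """Return (granularity, value) for a set-in category title, or None.

    For "year" and "decade" the value is the (starting) year; for
    "century" it is the ordinal number of the century.
    """
    # Category names appear with either underscores or spaces.
    title = title.replace("_", " ")
    for granularity, pattern in _PATTERNS:
        match = pattern.match(title)
        if match:
            return granularity, int(match.group(1))
    return None


print(parse_setting_category("Category:Films_set_in_1945"))        # ('year', 1945)
print(parse_setting_category("Category:Films set in the future"))  # None
```

A category without an explicit time, such as the second one above, returns None under these patterns and so would not contribute a set-in value.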
The scrape.ipynb notebook produced 3 files:
films_by_produced.csv: movies and when they are produced
films_by_set_in.csv: movies and when they are set in
films_set_in_and_produced.csv: the two data frames combined (note that not all movies have set-in years)
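One possible reading of the combined file is a left join that keeps every produced movie, whether or not it has a set-in year. The sketch below shows that idea in plain Python. It is an assumption, not the code from scrape.ipynb: the `title` column and the `combine` function are hypothetical, and the year columns borrow the names `year_produced` and `year_setin` used in the Results section.

```python
from collections import defaultdict


def combine(produced_rows, set_in_rows):
    """Left-join produced rows with set-in rows on the (assumed) 'title' key."""
    # A movie can carry several set-in tags, so collect them in a list.
    set_in_by_title = defaultdict(list)
    for row in set_in_rows:
        set_in_by_title[row["title"]].append(row["year_setin"])

    combined = []
    for row in produced_rows:
        # Movies without any set-in tag are kept, with year_setin = None.
        for year_setin in set_in_by_title.get(row["title"]) or [None]:
            combined.append(
                {
                    "title": row["title"],
                    "year_produced": row["year_produced"],
                    "year_setin": year_setin,
                }
            )
    return combined


# Made-up rows, for illustration only.
produced = [
    {"title": "Film A", "year_produced": 1979},
    {"title": "Film B", "year_produced": 2015},
]
set_in = [
    {"title": "Film A", "year_setin": 1944},
    {"title": "Film A", "year_setin": 1945},
]
for row in combine(produced, set_in):
    print(row)
```

With these rows, "Film A" appears twice (once per set-in year) and "Film B" appears once with a missing set-in year.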
Results#
The figures are discussed below. See the visualize.ipynb notebook for further details on how they are generated.

Fig. 4 Distribution of set-in years, shown only for the years between 1800 and 2050.#
Fig. 4 shows the distribution of movies by set-in year. Selected periods (e.g. the World Wars) and events (e.g. the 1969 Moon landing) are annotated. The World War II period (1939-1945) has the most movies, peaking in 1944. The second-highest year is 1999, possibly because it is right before Y2K.
The number of movies set between 1950 (after the WWII period) and 2012 (the year of the apocalypse that never came) is consistently high. Surprisingly, more movies are set in 2024 than in 2023, which is annotated as the present at the time of data collection and visualization. After 2012, the number of movies seems to decline, most sharply around 2026.

Fig. 5 Time difference (y-axis) between the set-in and produced years, by different produced years (x-axis).#
Fig. 5 shows the time difference between the set-in and produced years.
This is simply year_setin - year_produced.
Positive difference means movies are set in the future, compared to their release/produced years (e.g. post-apocalyptic movies);
negative means movies are set in the past (e.g. war movies);
while 0 means movies are set in the corresponding present time.
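The sign convention above can be written as a small function. This is a minimal sketch rather than the code in visualize.ipynb: `classify_setting` is a hypothetical name, and the example years are made up for illustration.

```python
def classify_setting(year_setin, year_produced):
    """Return (difference, label) for one movie.

    difference = year_setin - year_produced
    label is "future" if positive, "past" if negative, "present" if zero.
    """
    diff = year_setin - year_produced
    if diff > 0:
        return diff, "future"
    if diff < 0:
        return diff, "past"
    return diff, "present"


# Illustrative values only, not taken from the data set.
print(classify_setting(1944, 1979))  # (-35, 'past')
print(classify_setting(2050, 2015))  # (35, 'future')
print(classify_setting(2015, 2015))  # (0, 'present')
```

For the decade and century panels, a representative year for the period (for example its first year) would have to be chosen before taking the difference; which one the figure uses is not stated here.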
From left to right, the subpanels show movies by set-in year, decade and century, each taken from its own Wikipedia category. Note that not all movies set in a specific year are also classified by Wikipedia under a decade or century.
The line plots on the left of each of these subpanels show the overall density distribution. They show that there are more movies set in the past than in the present or future, and the tail of the past is longer and heavier than the tail for the future.
As we progress through the production years, more movies are generally set in the past (note that the noticeable declining line in the year panel is around the WWII period). In the decade and century panels, more movies are set in the future as well, although the trend is less prominent. This is possibly because, when a movie is set in the future, it can be more attractive to stay ambiguous and say it is set in the 2500s than in the exact year 2522.
Discussion#
These 2 results suggest that there are generally more movies set in the past than in the future, though there seems to be a slower increase in movies set in the future as we progress through production years.
Warning
However, we cannot conclude that there are more movies set in the past than the future. We can only say that according to Wikipedia, there are more movies tagged as set in the past than in the future.
This is possibly due to a significant limitation in the data, and possibly also to my choice of Wikipedia categories, for two reasons:
1. Not all movies are tagged with a set-in year/decade/century by Wikipedia.
2. Some movies set in the future do not have any explicit year/decade/century to begin with; they are just ambiguously set in, for example, a post-apocalyptic world.
An example of reason 2 is the 2015 film Mad Max: Fury Road:
Mad Max: Fury Road is a 2015 Australian post-apocalyptic dystopian action film co-written, co-produced, and directed by George Miller. […] Set in a post-apocalyptic desert wasteland where petrol and water are scarce commodities, Fury Road follows Max Rockatansky, who joins forces with Imperator Furiosa against cult leader Immortan Joe and his army, leading to a lengthy road battle.
Its category list is (I’m only listing the ones that mention a time period, plus a few related ones):
Categories: 2015 films | 2010s road movies | 2015 science fiction action films | […] | 2010s chase films | 2010s English-language films | 2010s feminist films | […] | Films set in deserts | Mad Max films | 2010s dystopian films | Australian post-apocalyptic films | […]
This one does not even have Films set in the future, while the 1979 original, Mad Max, is categorized with Films set in the future but also does not have an explicit set-in year.
Hence, these movies did not make it into the data. Future analyses could broaden the categories to include such tags as well.