Project Context

This project explores which variables in a given dataset correlate strongly with box office success, with the aim of better understanding what makes a movie successful.

The initial hypotheses are as follows:

The film budget is highly correlated with gross revenue.
The number of IMDb votes is highly correlated with gross revenue.
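
One common way to test hypotheses of this kind is a Pearson correlation coefficient between each candidate variable and gross revenue. The snippet below is only a minimal sketch of that idea: the column names (budget, votes, gross) are assumed for illustration, and the numbers are made-up toy values, not rows from the Kaggle dataset.

```python
import pandas as pd

# Toy values for illustration only; NOT taken from the Kaggle dataset.
# The column names budget, votes and gross are assumptions.
toy = pd.DataFrame({
    "budget": [10, 20, 30, 40, 50],
    "votes": [100, 150, 400, 380, 600],
    "gross": [25, 45, 70, 85, 120],
})

# Pearson correlation of every numeric column with gross revenue,
# dropping the trivial correlation of gross with itself.
corr_with_gross = toy.corr(method="pearson")["gross"].drop("gross")
print(corr_with_gross)
```

A coefficient close to 1 would support a hypothesis, while a value near 0 would suggest little linear relationship. Correlation alone does not show that a larger budget or more votes causes higher revenue.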

The dataset used for the project, which covers four decades of movie data from 1980 to 2020, was obtained from Kaggle.

All analysis and graphics presented in this project were produced in Jupyter Notebook.

Data Analysis Process

This project follows a three-step data analysis process, consisting of:

<aside> 💡 A GitHub repository containing all relevant code can be found here

</aside>

Analysis

Step 1: Importing Relevant Python Libraries and Data

01. Importing relevant libraries into Jupyter Notebook

Input:

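# import libraries

import numpy as np               # numerical arrays and math
import pandas as pd              # tabular data loading and manipulation
import matplotlib.pyplot as plt  # plotting
import seaborn as sns            # statistical visualization built on matplotlib
import cpi                       # inflation adjustment of US dollar amounts (Consumer Price Index)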