Final_Project_Jaden_Cowley

Human Freedom Index Analysis

The Human Freedom Index (HFI) provides a comprehensive measure of personal and economic freedoms across 165 jurisdictions worldwide, offering insight into the institutional and socioeconomic factors that shape liberty. This analysis explores the relationships between these freedoms and their geographic distribution, focusing on how freedom scores vary by region, the correlation between personal and economic freedoms, and the prevalence of high-scoring countries in specific areas. By leveraging statistical methods and visualizations, this study aims to uncover patterns and trends that inform our understanding of global freedom and its connection to governance and transparency.

Loading and Tidying the Data

library(readxl)
library(janitor)

library(dplyr)

library(tidyr)


# Load data
data_path <- "/Users/jadencowley/Desktop/Analysis of Human Freedom Index/Final_Project_Jaden_Cowley/2023-Human-Freedom-Index-Data.xlsx"
hfi_data <- read_excel(data_path)
New names:
• `data` -> `data...14`
• `data` -> `data...17`
• `data` -> `data...20`
• `data` -> `data...22`
• `data (five year total)` -> `data (five year total)...50`
• `data (five year total)` -> `data (five year total)...52`
• `data` -> `data...76`
• `data` -> `data...78`
• `data` -> `data...80`
• `data` -> `data...82`
• `data` -> `data...84`
• `data` -> `data...99`
• `data` -> `data...101`
• `data` -> `data...103`
• `data` -> `data...107`
• `data` -> `data...109`
• `data` -> `data...111`
# Clean column names
hfi_data <- janitor::clean_names(hfi_data)

# View first rows
head(hfi_data)
# A tibble: 6 × 146
   year abbreviation country   region           human_freedom human_freedom_rank
  <dbl> <chr>        <chr>     <chr>                    <dbl>              <dbl>
1  2021 ALB          Albania   Eastern Europe            7.67                 49
2  2021 DZA          Algeria   Middle East & N…          4.82                155
3  2021 AGO          Angola    Sub-Saharan Afr…          5.76                122
4  2021 ARG          Argentina Latin America &…          6.85                 77
5  2021 ARM          Armenia   Caucasus & Cent…          7.99                 33
6  2021 AUS          Australia Oceania                   8.52                 14
# ℹ 140 more variables: human_freedom_quartile <dbl>,
#   ai_procedural_justice <dbl>, aii_civil_justice <dbl>,
#   aiii_criminal_justice <dbl>, v_dem_rule_of_law <dbl>, a_rule_of_law <dbl>,
#   bi_homicide <dbl>, data_14 <dbl>, biia_disappearances <dbl>,
#   biib_violent_conflicts <dbl>, data_17 <dbl>,
#   biic_organised_conflicts <dbl>, biid_terrorism_fatalities <dbl>,
#   data_20 <dbl>, biie_terrorism_injuries <dbl>, data_22 <dbl>, …

The dataset was loaded from an Excel file using the readxl package. Column names were cleaned using janitor::clean_names() to standardize them (e.g., lowercasing and replacing spaces with underscores). This makes columns easier to reference and helps avoid errors in subsequent analysis.

The tidying step prepares the raw data for further manipulation. No rows or columns have been removed yet; the goal was simply to standardize the dataset’s structure.
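
As a small illustration of the name cleaning described above, the sketch below applies janitor::make_clean_names() (the function that clean_names() uses on column names) to two example names. The vector raw_names is made up for illustration; "data...14" mirrors one of the repaired names reported when the file was read.

```r
library(janitor)

# Illustrative column names only: one with a space and capitals,
# one in the form produced by the "New names" repair step.
raw_names <- c("Human Freedom", "data...14")

# Convert to lowercase snake_case, as clean_names() does for a data frame
cleaned <- make_clean_names(raw_names)
cleaned
```

The cleaned names can then be typed without backticks or quoting in dplyr pipelines.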

Selecting Columns and Removing Missing Values

# Example of dropping unnecessary columns and rows
hfi_data <- hfi_data %>%
    select(country, region, human_freedom, economic_freedom, personal_freedom) %>%
    drop_na()

This command keeps only the columns directly relevant to the analysis (country, region, human_freedom, economic_freedom, personal_freedom). Rows with missing values (NA) were then removed to avoid errors in statistical computations. Note that the year column is dropped at this step, so each remaining row still represents one country in one year even though the year is no longer visible.
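
Because drop_na() discards rows silently, it can be useful to record how many rows it removed. The following is a minimal sketch on a made-up data frame; the object toy and its values are illustrative only and are not taken from the HFI file.

```r
library(dplyr)
library(tidyr)

# Illustrative data only: three rows, one of them with a missing score.
toy <- tibble(
    country = c("A", "B", "C"),
    region = c("R1", "R1", "R2"),
    human_freedom = c(7.5, NA, 6.1)
)

rows_before <- nrow(toy)
toy_complete <- toy %>% drop_na()

# Number of rows dropped because they contained at least one NA
rows_removed <- rows_before - nrow(toy_complete)
rows_removed
```

Reporting this number alongside the results makes it clear how much of the original data the later statistics are based on.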

Data Dictionary

Variable Name      Description
country            Name of the country
region             Geographical region of the country
human_freedom      Composite score measuring overall freedom (0-10)
economic_freedom   Score measuring economic freedom (0-10)
personal_freedom   Score measuring personal freedom (0-10)

This table documents each variable’s name and definition, ensuring clarity for readers and reproducibility of the analysis.

hfi_data %>%
    group_by(region) %>%
    summarize(mean_score = mean(human_freedom), .groups = "drop")
# A tibble: 10 × 2
   region                        mean_score
   <chr>                              <dbl>
 1 Caucasus & Central Asia             6.67
 2 East Asia                           7.89
 3 Eastern Europe                      7.81
 4 Latin America & the Caribbean       7.37
 5 Middle East & North Africa          5.62
 6 North America                       8.86
 7 Oceania                             8.09
 8 South Asia                          6.34
 9 Sub-Saharan Africa                  6.28
10 Western Europe                      8.77

Data Analysis

Mean Human Freedom Scores by Region

The mean Human Freedom Score was calculated for each region, pooling every country-year row in the dataset, to compare freedom levels across regions. The results reveal distinct patterns:

  • North America (8.86) and Western Europe (8.77) lead in average scores, reflecting strong protections for freedom in these regions.

  • East Asia (7.89) and Eastern Europe (7.81) also exhibit relatively high scores.

  • Middle East & North Africa (5.62), Sub-Saharan Africa (6.28), and South Asia (6.34) show the lowest average scores, indicating significant freedom deficits.

These findings highlight global disparities in freedom, suggesting that regional factors, including governance and socioeconomic structures, play a substantial role in shaping individual freedoms.
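
A regional mean is easier to interpret when it is reported together with the number of rows and the number of distinct countries behind it. The sketch below shows one way to do that with dplyr. The data frame toy is made up for illustration, and it assumes a year column is available, which the column selection earlier in this report does not retain.

```r
library(dplyr)

# Illustrative data only: countries A and B observed in two years, C in one.
toy <- tibble(
    year = c(2020, 2021, 2020, 2021, 2021),
    country = c("A", "A", "B", "B", "C"),
    region = c("R1", "R1", "R1", "R1", "R2"),
    human_freedom = c(8.0, 8.2, 6.0, 6.2, 7.0)
)

region_summary <- toy %>%
    group_by(region) %>%
    summarize(
        mean_score = mean(human_freedom),    # pooled over all rows
        n_rows = n(),                        # country-year observations
        n_countries = n_distinct(country),   # distinct jurisdictions
        .groups = "drop"
    )
region_summary
```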

Boxplot of Human Freedom Scores by Region

library(ggplot2)

ggplot(hfi_data, aes(x = region, y = human_freedom)) +
    geom_boxplot() +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
    labs(title = "Human Freedom Scores by Region", y = "Human Freedom Score")

The boxplot illustrates the distribution of Human Freedom Scores across regions. Key observations include:

  • North America, Western Europe, and Oceania exhibit high median scores with minimal variability, indicating consistent protection of freedoms in these regions.

  • Sub-Saharan Africa and the Middle East & North Africa show lower median scores with a wider range, reflecting greater variability and, in some cases, substantial restrictions on freedoms.

  • Eastern Europe and East Asia display moderate scores, with some overlap in their interquartile ranges.

This visualization highlights the pronounced regional disparities in Human Freedom Scores, consistent with the hypothesis that governance and institutional factors heavily influence freedom levels.
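
Regional comparisons in a boxplot are often easier to read when the boxes are sorted by median rather than alphabetically. The sketch below shows one possible way to do this with reorder(). The data frame toy is made up for illustration, and coord_flip() is an optional choice that keeps long region names horizontal.

```r
library(ggplot2)

# Illustrative data only: three regions with four scores each.
toy <- data.frame(
    region = rep(c("R1", "R2", "R3"), each = 4),
    human_freedom = c(5, 6, 6, 7, 8, 9, 9, 9.5, 6.5, 7, 7.5, 8)
)

# reorder() sorts the region levels by their median score
p <- ggplot(toy, aes(x = reorder(region, human_freedom, FUN = median),
                     y = human_freedom)) +
    geom_boxplot() +
    coord_flip() +
    labs(x = "Region", y = "Human Freedom Score")
p
```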

Correlation Between Economic Freedom and Personal Freedom

cor(hfi_data$economic_freedom, hfi_data$personal_freedom)
[1] 0.6953828

The correlation coefficient between Economic Freedom and Personal Freedom is approximately 0.695. This indicates a fairly strong positive relationship: higher levels of economic freedom are generally associated with higher levels of personal freedom. The result is consistent with the hypothesis that these two forms of freedom are interdependent, although correlation alone cannot establish that improvements in one cause improvements in the other.
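
For readers who want to see what cor() computes, the sketch below calculates Pearson's r directly from its definition (the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations) and compares it with cor(). The vectors x and y are made-up values for illustration only.

```r
# Illustrative data only
x <- c(5.1, 6.3, 7.0, 7.8, 8.4)
y <- c(4.9, 6.8, 6.5, 8.1, 8.9)

# Pearson's r from its definition
r_manual <- sum((x - mean(x)) * (y - mean(y))) /
    sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

# Should agree with the built-in function up to floating-point error
all.equal(r_manual, cor(x, y))
```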

Economic Freedom vs. Personal Freedom (via Scatterplot)

ggplot(hfi_data, aes(x = economic_freedom, y = personal_freedom)) +
    geom_point() +
    geom_smooth(method = "lm", col = "blue") +
    labs(title = "Economic vs Personal Freedom", x = "Economic Freedom", y = "Personal Freedom")
`geom_smooth()` using formula = 'y ~ x'

The scatterplot visualizes the relationship between Economic Freedom (x-axis) and Personal Freedom (y-axis). A clear positive trend is evident, as shown by the regression line (blue), indicating that higher Economic Freedom is associated with higher Personal Freedom. This agrees with the correlation result (0.695) and is consistent with the hypothesis that economic systems promoting greater freedom tend to foster personal liberties as well.

The clustering of points along the regression line suggests a consistent relationship, with only a few outliers deviating significantly from the trend. This relationship underscores the interdependence of these two dimensions of freedom.
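
One way to quantify how tightly points follow a fitted line is the share of variance explained. For a straight-line fit with a single predictor, R-squared equals the square of Pearson's r, so the reported correlation implies that roughly 48% of the variance in Personal Freedom is accounted for by a linear function of Economic Freedom. The sketch below computes that value and checks the identity on made-up vectors x and y, which are illustrative only.

```r
# Correlation reported earlier in this analysis
r_reported <- 0.6953828
r_squared <- r_reported^2
r_squared

# Illustrative data only, used to check that R^2 equals r^2
# for a linear model with one predictor.
x <- c(5.1, 6.3, 7.0, 7.8, 8.4)
y <- c(4.9, 6.8, 6.5, 8.1, 8.9)
fit <- lm(y ~ x)
identity_holds <- isTRUE(all.equal(summary(fit)$r.squared, cor(x, y)^2))
identity_holds
```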

Regional Distribution of Top-Quartile Countries

top_countries <- hfi_data %>%
    filter(human_freedom >= quantile(human_freedom, 0.75))

top_countries %>%
    count(region)
# A tibble: 8 × 2
  region                            n
  <chr>                         <int>
1 Caucasus & Central Asia           1
2 East Asia                        83
3 Eastern Europe                  182
4 Latin America & the Caribbean    68
5 North America                    44
6 Oceania                          44
7 Sub-Saharan Africa               10
8 Western Europe                  390

To examine the regional distribution of the top quartile of Human Freedom Scores, the data was filtered for scores at or above the 75th percentile. Because each row of the dataset is one country in one year (the source file has a year column, and its 3,280 rows far exceed the 165 jurisdictions covered), the counts are country-year observations rather than distinct countries:

  • Western Europe leads with 390 observations in the top quartile.

  • Eastern Europe (182) and East Asia (83) follow, also contributing substantially to the top quartile.

  • Sub-Saharan Africa (10) and Caucasus & Central Asia (1) show minimal representation, and the Middle East & North Africa and South Asia do not appear at all, suggesting greater freedom challenges in these areas.

This analysis reveals substantial disparities in regional representation among the highest Human Freedom Scores, emphasizing the role of governance and institutional quality in fostering freedoms.
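
To count distinct countries rather than country-year rows, one option is to restrict the data to a single year before applying the quartile filter. The following is a minimal sketch under the assumption that the year column is retained in the column selection, which is not the case in the pipeline above. The data frame toy is made up for illustration.

```r
library(dplyr)

# Illustrative data only: four countries observed in two years.
toy <- tibble(
    year = rep(c(2020, 2021), each = 4),
    country = rep(c("A", "B", "C", "D"), times = 2),
    region = rep(c("R1", "R1", "R2", "R2"), times = 2),
    human_freedom = c(9.0, 8.0, 6.0, 5.0, 9.1, 8.1, 6.1, 5.1)
)

# Keep only the most recent year so that one row is one country
latest <- toy %>% filter(year == max(year))

# Top quartile computed within that single year
top_latest <- latest %>%
    filter(human_freedom >= quantile(human_freedom, 0.75))

top_latest %>%
    group_by(region) %>%
    summarize(n_countries = n_distinct(country), .groups = "drop")
```

With this approach the counts can be read as numbers of countries, at the cost of ignoring earlier years.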

Regions of Countries in the Top Quartile of Freedom Scores

# Compute counts for each region
region_counts <- top_countries %>%
    count(region)

# Create the bar plot
ggplot(region_counts, aes(x = reorder(region, -n), y = n)) +
    geom_bar(stat = "identity", fill = "lightblue", color = "black") +
    labs(
        title = "Regions of Countries in the Top Quartile of Freedom Scores",
        x = "Region",
        y = "Count"
    ) +
    theme_minimal() +
    theme(
        axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
        plot.title = element_text(size = 16, face = "bold", hjust = 0.5),
        axis.title.x = element_text(size = 12),
        axis.title.y = element_text(size = 12)
    )

The bar chart shows how top-quartile observations (country-year rows) are distributed across regions. Key findings include:

  • Western Europe dominates, with 390 observations, nearly all of the 396 Western European rows in the dataset.

  • Eastern Europe and East Asia follow, though with significantly lower counts.

  • Regions like Sub-Saharan Africa and Caucasus & Central Asia have minimal representation, reflecting broader challenges in achieving high freedom scores.

This visualization summarizes regional disparities, reinforcing the hypothesis that freedom is unevenly distributed globally.

Summary Statistics of Key Variables

summary(hfi_data)
   country             region          human_freedom   economic_freedom
 Length:3280        Length:3280        Min.   :2.960   Min.   :2.470   
 Class :character   Class :character   1st Qu.:6.128   1st Qu.:5.980   
 Mode  :character   Mode  :character   Median :7.140   Median :6.800   
                                       Mean   :7.068   Mean   :6.711   
                                       3rd Qu.:8.150   3rd Qu.:7.550   
                                       Max.   :9.320   Max.   :9.190   
 personal_freedom
 Min.   :2.080   
 1st Qu.:6.080   
 Median :7.530   
 Mean   :7.323   
 3rd Qu.:8.660   
 Max.   :9.690   

A summary of the dataset provides key descriptive statistics for the numeric variables:

  • Human Freedom Score:

    • Minimum: 2.96, Maximum: 9.32

    • Mean: 7.068, Median: 7.14

    • Three-quarters of observations score above 6.13 (1st Quartile), and the top quarter exceeds 8.15.

  • Economic Freedom:

    • Minimum: 2.47, Maximum: 9.19

    • Mean: 6.711, Median: 6.80

    • Economic Freedom scores show a similar distribution to Human Freedom, with three-quarters of observations exceeding 5.98 (1st Quartile).

  • Personal Freedom:

    • Minimum: 2.08, Maximum: 9.69

    • Mean: 7.323, Median: 7.53

    • Scores for Personal Freedom are slightly higher on average, with the top quartile exceeding 8.66.

These statistics indicate considerable variation in freedom levels across countries, with most falling within a mid-to-high range. This provides a foundation for further analysis of regional and variable-specific trends.

Distribution of Observations by Region

table(hfi_data$region)

      Caucasus & Central Asia                     East Asia 
                          103                           128 
               Eastern Europe Latin America & the Caribbean 
                          451                           562 
   Middle East & North Africa                 North America 
                          356                            44 
                      Oceania                    South Asia 
                           88                           320 
           Sub-Saharan Africa                Western Europe 
                          832                           396 

The table counts the rows in the dataset for each region. Each row is a country-year observation, so a region's count reflects both its number of jurisdictions and the number of years covered:

  • Sub-Saharan Africa has the highest representation with 832 observations, providing a broad base for analysis.

  • Latin America & the Caribbean (562), Eastern Europe (451), and Western Europe (396) are also heavily represented.

  • Oceania (88) and North America (44) have the fewest entries, reflecting their smaller number of jurisdictions.

  • Caucasus & Central Asia (103) and East Asia (128) are also thinly represented.

This distribution shows the diversity of the dataset while also emphasizing the need to account for differences in representation during analysis.
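
Because regions contribute very different numbers of rows, raw top-quartile counts can be put in context by dividing them by each region's total. The sketch below does this with the counts printed in the two tables above. The zeros for the Middle East & North Africa and South Asia are an inference from their absence in the top-quartile table.

```r
# Rows per region, copied from the table(hfi_data$region) output
region_totals <- c(
    "Caucasus & Central Asia" = 103, "East Asia" = 128,
    "Eastern Europe" = 451, "Latin America & the Caribbean" = 562,
    "Middle East & North Africa" = 356, "North America" = 44,
    "Oceania" = 88, "South Asia" = 320,
    "Sub-Saharan Africa" = 832, "Western Europe" = 396
)

# Top-quartile rows per region, copied from the count(region) output;
# regions missing from that output are assumed to have zero rows.
top_counts <- c(
    "Caucasus & Central Asia" = 1, "East Asia" = 83,
    "Eastern Europe" = 182, "Latin America & the Caribbean" = 68,
    "Middle East & North Africa" = 0, "North America" = 44,
    "Oceania" = 44, "South Asia" = 0,
    "Sub-Saharan Africa" = 10, "Western Europe" = 390
)

# Percentage of each region's rows that fall in the top quartile
share_top <- round(100 * top_counts / region_totals, 1)
sort(share_top, decreasing = TRUE)
```

Read this way, North America's 44 top-quartile rows are all of its rows, while Eastern Europe's 182 are well under half of its 451.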

Pearson Correlation Test: Economic Freedom vs. Personal Freedom

cor.test(hfi_data$economic_freedom, hfi_data$personal_freedom)

    Pearson's product-moment correlation

data:  hfi_data$economic_freedom and hfi_data$personal_freedom
t = 55.401, df = 3278, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6772768 0.7126471
sample estimates:
      cor 
0.6953828 

A Pearson’s product-moment correlation test was performed to evaluate the relationship between Economic Freedom and Personal Freedom. Key results include:

  • Correlation Coefficient (r): 0.695, indicating a strong positive relationship between Economic Freedom and Personal Freedom.

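  • p-value: < 2.2e-16, which is highly significant: a correlation this large would be very unlikely if the true correlation were zero. The test treats all 3,280 rows as independent, which repeated yearly observations of the same country are not, so the p-value likely overstates the strength of the evidence.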

  • 95% Confidence Interval: [0.677, 0.713], reinforcing confidence in the strength of the relationship.

This test supports the hypothesis that higher levels of Economic Freedom are strongly associated with greater Personal Freedom, with a statistically significant and meaningful correlation. This finding underscores the interconnectedness of these two dimensions of freedom.
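
The test statistic and confidence interval reported above can be reproduced from the correlation and the sample size alone. The sketch below uses the standard formulas: t = r * sqrt(n - 2) / sqrt(1 - r^2), and a confidence interval based on the Fisher z-transformation. The value of n is taken from the reported degrees of freedom (df = n - 2 = 3278).

```r
r <- 0.6953828   # reported sample correlation
n <- 3280        # number of rows, since df = n - 2 = 3278

# t statistic for testing whether the true correlation is zero
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# 95% confidence interval via the Fisher z-transformation
z <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

t_stat
ci
```

Both quantities depend heavily on n, which is why treating 3,280 country-year rows as independent observations produces such a narrow interval.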

Regional Distribution of Top-Quartile Freedom Scores (Chi-Square)

chisq.test(table(top_countries$region))

    Chi-squared test for given probabilities

data:  table(top_countries$region)
X-squared = 1131.4, df = 7, p-value < 2.2e-16

A Chi-Square goodness-of-fit test was conducted to examine the regional distribution of observations in the top quartile of Human Freedom Scores. Key findings include:

  • Chi-Square Statistic (X²): 1131.4

  • Degrees of Freedom (df): 7

  • p-value: < 2.2e-16

The highly significant p-value indicates that top-quartile observations are not spread evenly across the eight regions that appear in the top quartile. Relative to that uniform baseline, Western Europe and Eastern Europe are overrepresented, while Sub-Saharan Africa is underrepresented. The baseline assumes equal counts per region and does not account for the very different number of rows each region contributes to the dataset. With that caveat, the result is consistent with the hypothesis that freedom is unevenly distributed globally, reflecting regional variations in governance and institutional quality.
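
A null hypothesis that accounts for region size would expect top-quartile rows in proportion to each region's share of the dataset. The sketch below first reproduces the uniform-baseline statistic from the printed counts, then runs the proportional version by passing expected proportions through the p argument of chisq.test(). The counts are copied from the tables in this report, with zeros assumed for the two regions absent from the top-quartile table. As with the correlation test, rows are treated as independent, which is an approximation.

```r
# Rows per region in the full dataset
region_totals <- c(
    "Caucasus & Central Asia" = 103, "East Asia" = 128,
    "Eastern Europe" = 451, "Latin America & the Caribbean" = 562,
    "Middle East & North Africa" = 356, "North America" = 44,
    "Oceania" = 88, "South Asia" = 320,
    "Sub-Saharan Africa" = 832, "Western Europe" = 396
)

# Top-quartile rows per region (zeros assumed for absent regions)
top_counts <- c(
    "Caucasus & Central Asia" = 1, "East Asia" = 83,
    "Eastern Europe" = 182, "Latin America & the Caribbean" = 68,
    "Middle East & North Africa" = 0, "North America" = 44,
    "Oceania" = 44, "South Asia" = 0,
    "Sub-Saharan Africa" = 10, "Western Europe" = 390
)

# Uniform baseline over the eight regions present, as in the test above
uniform_test <- stats::chisq.test(top_counts[top_counts > 0])

# Baseline proportional to each region's share of all rows
proportional_test <- stats::chisq.test(
    top_counts,
    p = region_totals / sum(region_totals)
)

uniform_test$statistic
proportional_test
```

Comparing observed and expected counts from proportional_test$expected shows which regions exceed their size-adjusted share.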

Conclusion

This analysis provides valuable insights into the global distribution of freedom as measured by the Human Freedom Index. The study finds:

  1. Regional Disparities: Regions like Western Europe and North America consistently achieve higher freedom scores, while regions such as Sub-Saharan Africa and the Middle East & North Africa show lower scores and little or no representation in the top quartile.

  2. Interdependence of Freedoms: A fairly strong positive correlation (r = 0.695) was observed between Economic Freedom and Personal Freedom, indicating that the two tend to rise and fall together.

  3. Statistical Significance of Findings: Both the Pearson correlation test and the Chi-Square test returned p-values below 2.2e-16, indicating that the observed patterns are very unlikely to arise by chance alone.

These findings are consistent with the view that governance and institutional transparency matter for personal and economic liberties, although neither was measured directly here. Policymakers and researchers can use this analysis to identify regions needing targeted reforms and to better understand the factors contributing to global disparities in freedom. This work also demonstrates the utility of quantitative methods in exploring complex socio-political phenomena.