Midterm (Due Sunday 2/19/2023 at 11:55 pm)

Please submit your .Rmd and .html files in Sakai. If you are working together, both people should submit the files.

60 / 60 points total

The goal of the midterm project is to showcase skills that you have learned in class so far. The midterm is open note, but if you use someone else’s code, you must attribute them.


Define Your Research Question (10 points)

Define your research question below. What about the data interests you? What is a specific question you want to find out about the data?

What topic did I choose?
When looking up data folders in Tidy Tuesday, I was interested in the topic of US Drought because the issue of climate change has always been on my mind. We had wildfires almost every late summer through fall and snowstorms last winter in Oregon.

What do I want to know?
Thus, I want to know how severe drought has been across the United States, in each of the 52 states and territories in the data, and by climate region. It would also be interesting to compare drought severity over a long period, both for the country overall and for specific geographical areas.

Research questions!
So, I have two points to look at in this study.

1. Do we have unusually severe drought in the United States over the past 23 years?
2. Does the West region in the United States, including Oregon, have unusually severe drought compared to the past?

The data I used.
I extracted the data from the U.S. Drought Monitor site rather than using the Tidy Tuesday data, in order to set a specific time period and set of regions. The data source is the U.S. Drought Monitor. To export the data from the website, I set the time range from Feb. 10, 2000 to Feb. 10, 2023 (23 years) and chose “States” as the spatial scale.

Given your question, what is your expectation about the data?

1. I expect that drought severity has, on average, worsened in the United States over the past 23 years. Still, there may be differences between geographical regions.
2. I also expect the data to show unusually severe drought in the West region in recent years.

Loading the Data (10 points)

Load the data below and use dplyr::glimpse() or skimr::skim() on the data. You should upload the data file into the data directory.

# load the data
drought<- read_csv("data/drought_by_state.csv")
## Rows: 62452 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): StateAbbreviation
## dbl  (8): MapDate, None, D0, D1, D2, D3, D4, StatisticFormatID
## date (2): ValidStart, ValidEnd
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# check
glimpse(drought)
## Rows: 62,452
## Columns: 11
## $ MapDate           <dbl> 20230207, 20230131, 20230124, 20230117, 20230110, 20…
## $ StateAbbreviation <chr> "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK"…
## $ None              <dbl> 100.00, 100.00, 100.00, 100.00, 100.00, 100.00, 100.…
## $ D0                <dbl> 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00…
## $ D1                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ D2                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ D3                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ D4                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ ValidStart        <date> 2023-02-07, 2023-01-31, 2023-01-24, 2023-01-17, 202…
## $ ValidEnd          <date> 2023-02-13, 2023-02-06, 2023-01-30, 2023-01-23, 202…
## $ StatisticFormatID <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…


If there are any quirks that you have to deal with NA coded as something else, or it is multiple tables, please make some notes here about what you need to do before you start transforming the data in the next section.

In the original dataset, there are 62452 rows and 11 columns.

  • ValidStart and ValidEnd are the start and end dates of each weekly record. MapDate holds the same value as ValidStart, but stored as a number (for example, 20230207). Since ValidStart is already a date, I’ll use it as the Date column.

  • StateAbbreviation identifies the 52 states and territories in the data with two-letter codes.
    For simplicity, I’ll rename the column to state. Then I’ll make a new column, Region, to classify each state as in the “West” region or not, because my second research question focuses on the “West” region.
    For the full set of US climate regions, we can refer to the explanation below from the National Centers for Environmental Information.

  • Region can be classified as six climate regions in the United States continent, but here I’ll classify this into two categories as “West” or “Non West”.

    • West: Arizona, California, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington. (9 states)

    • Non West:

      • High Plains: Colorado, Kansas, Nebraska, North Dakota, South Dakota, and Wyoming (6 states)
      • Midwest: Illinois, Indiana, Iowa, Kentucky, Michigan, Minnesota, Missouri, Ohio, and Wisconsin (9 states)
      • Northeast: Maine, New Hampshire, Vermont, Massachusetts, Connecticut, Rhode Island, Delaware, New York, Pennsylvania, New Jersey, Maryland, West Virginia, and Washington D.C. (13 states)
      • South: Arkansas, Tennessee, Texas, Louisiana, Mississippi, and Oklahoma (6 states)
      • Southeast: Alabama, Florida, Georgia, North Carolina, South Carolina, Virginia, Puerto Rico, and the U.S. Virgin Islands (8 states)


  • The None, D0, D1, D2, D3, and D4 columns hold percent (%) values: the percent of each state’s area that falls in each drought category. Therefore, these values should sum to 100 (%) in every observation. I will make a new column to check whether the sum is equal to 100.
    • None : No drought
    • D0 : Abnormally Dry
    • D1 : Moderate Drought
    • D2 : Severe Drought
    • D3 : Extreme Drought
    • D4 : Exceptional Drought


  • Check for NAs
    There are no missing values in the original dataset.
# check NA's
visdat::vis_miss(drought)

  • Drought Severity and Coverage Index (DSCI)
    The Drought Severity and Coverage Index (DSCI) is a weighted sum of the percent of an area in each level of drought. It summarizes the extent and severity of drought with a single number each week, on a scale from 0 (no drought) to 500 (all of the area in the worst category of drought). It can be computed as follows:

    DSCI = 0(None) + 1(D0) + 2(D1) + 3(D2) + 4(D3) + 5(D4)

    Since there is no DSCI column in the original dataset, I will make a new column, DSCI, for the analysis.
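
As a small worked check of the formula above, the sketch below computes the DSCI for one weekly record. The percentages are the Arizona values for the week of 2023-02-07 that appear in the glimpse() output of join_west later in this report; the object names (pct, weights, dsci) are only for illustration.

```r
# Percent of Arizona in each drought category for the week of 2023-02-07
# (values copied from the glimpse() output of join_west in this report)
pct <- c(None = 42.37, D0 = 36.30, D1 = 20.14, D2 = 1.19, D3 = 0, D4 = 0)

# Weights from the DSCI formula: None counts 0, D0 counts 1, ..., D4 counts 5
weights <- c(None = 0, D0 = 1, D1 = 2, D2 = 3, D3 = 4, D4 = 5)

# Weighted sum: 36.30 + 40.28 + 3.57
dsci <- sum(weights * pct)
round(dsci, 2)
# 80.15, the same value shown in the DSCI column for that row

# Boundary cases: all area in None gives 0, all area in D4 gives 500
sum(weights * c(100, 0, 0, 0, 0, 0))
sum(weights * c(0, 0, 0, 0, 0, 100))
```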

[Image Reference] U.S.Drought Monitor.


Make sure your data types are correct!

The data types look correct for my purposes: the percentage columns are numeric, and ValidStart and ValidEnd are already dates.
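
The one column whose type does not match its meaning is MapDate, which is a date stored as a number. If it were needed as a date, one possible conversion (a sketch using base R only, with the first three MapDate values from the glimpse() output above) would be:

```r
# First three MapDate values from the glimpse() output (stored as numbers)
map_date <- c(20230207, 20230131, 20230124)

# Convert number -> text -> Date, reading the text as year, month, day
map_date_as_date <- as.Date(as.character(map_date), format = "%Y%m%d")

map_date_as_date
class(map_date_as_date)  # "Date"
```

The converted values match the ValidStart column, which is why ValidStart can be used directly.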


Transforming the data (15 points)

If the data needs to be transformed in any way (values recoded, pivoted, etc), do it here. Examples include transforming a continuous variable into a categorical using case_when(), etc.

# mutate
drought_transformed <- drought %>% 
  mutate(
  Date = ValidStart,
  Year = year(Date),              
  Year = as.character(Year),         # extract `Year` and make a character value 
  Month = month(Date),
  Month = as.character(Month),       # extract `Month` and make a character value
  state = StateAbbreviation,         # renaming a column
  Region = case_when(
    state %in% c("AZ","CA","ID","MT","NV","NM","OR","UT","WA") ~ "West",     # focus on West region
    TRUE ~                                                       "Non West"),
  Sum = None+D0+D1+D2+D3+D4,         # check `Sum` is 100(%) 
  DSCI = 0*None+1*D0+2*D1+3*D2+4*D3+5*D4) %>%        # calculate the `DSCI` for each observation  
  select(Date:Region,None:D4,Sum,DSCI)   


# check
head(drought_transformed)
## # A tibble: 6 × 13
##   Date       Year  Month state Region   None    D0    D1    D2    D3    D4   Sum
##   <date>     <chr> <chr> <chr> <chr>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2023-02-07 2023  2     AK    Non We…   100     0     0     0     0     0   100
## 2 2023-01-31 2023  1     AK    Non We…   100     0     0     0     0     0   100
## 3 2023-01-24 2023  1     AK    Non We…   100     0     0     0     0     0   100
## 4 2023-01-17 2023  1     AK    Non We…   100     0     0     0     0     0   100
## 5 2023-01-10 2023  1     AK    Non We…   100     0     0     0     0     0   100
## 6 2023-01-03 2023  1     AK    Non We…   100     0     0     0     0     0   100
## # … with 1 more variable: DSCI <dbl>
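
One side effect of storing Month as a character value is that it sorts alphabetically rather than in calendar order. The sketch below, using base R and made-up object names, shows the difference and one possible way to keep calendar order (an ordered set of factor levels); it is an illustration, not a step taken in this report.

```r
# Months 1 to 12 stored as text, as in the transformed data
month_chr <- as.character(1:12)

# Text sorts character by character, so "10" comes before "2"
sort(month_chr)

# Numbers sort in calendar order
sort(as.integer(month_chr))

# A factor with explicit levels keeps calendar order while staying categorical
month_fct <- factor(month_chr, levels = as.character(1:12))
levels(month_fct)
```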

Bonus points (5 points) for datasets that require merging of tables, but only if you reason through whether you should use left_join, inner_join, or right_join on these tables. No credit will be provided if you don’t.

# West region only
west <- drought_transformed %>% 
  filter(Region == "West")

# check
west %>% tabyl(Region) 
##  Region     n percent
##    West 10809       1
# average DSCI by Year
avg_DSCI_year_west <- west %>% 
  select(Year:Region,DSCI) %>% 
  group_by(state, Year) %>% 
  summarise(
    avg.DSCI.year = mean(DSCI))
## `summarise()` has grouped output by 'state'. You can override using the
## `.groups` argument.
# average DSCI by Year and Month
avg_DSCI_month_west <- west %>% 
  select(Year:Region,DSCI) %>% 
  group_by(state, Year, Month) %>% 
  summarise(
    avg.DSCI.month = mean(DSCI))
## `summarise()` has grouped output by 'state', 'Year'. You can override using the
## `.groups` argument.
# check
glimpse(avg_DSCI_year_west);glimpse(avg_DSCI_month_west)
## Rows: 216
## Columns: 3
## Groups: state [9]
## $ state         <chr> "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "A…
## $ Year          <chr> "2000", "2001", "2002", "2003", "2004", "2005", "2006", …
## $ avg.DSCI.year <dbl> 77.734468, 2.011538, 306.964340, 315.805769, 306.002115,…
## Rows: 2,493
## Columns: 4
## Groups: state, Year [216]
## $ state          <chr> "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "…
## $ Year           <chr> "2000", "2000", "2000", "2000", "2000", "2000", "2000",…
## $ Month          <chr> "10", "11", "12", "2", "3", "4", "5", "6", "7", "8", "9…
## $ avg.DSCI.month <dbl> 86.1760, 0.0000, 0.0000, 122.6400, 65.6100, 50.0325, 10…
# left_join with west and avg_DSCI_year_west
join_west<- left_join(west, avg_DSCI_year_west, by = c("state","Year"))

# left_join with avg_DSCI_month_west
join_west <- left_join(join_west, avg_DSCI_month_west, by = c("state","Year","Month"))

# check
glimpse(join_west)
## Rows: 10,809
## Columns: 15
## $ Date           <date> 2023-02-07, 2023-01-31, 2023-01-24, 2023-01-17, 2023-0…
## $ Year           <chr> "2023", "2023", "2023", "2023", "2023", "2023", "2022",…
## $ Month          <chr> "2", "1", "1", "1", "1", "1", "12", "12", "12", "12", "…
## $ state          <chr> "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "…
## $ Region         <chr> "West", "West", "West", "West", "West", "West", "West",…
## $ None           <dbl> 42.37, 42.37, 42.37, 42.20, 27.66, 12.40, 12.40, 12.40,…
## $ D0             <dbl> 36.30, 36.30, 36.30, 36.47, 40.65, 48.66, 48.64, 48.64,…
## $ D1             <dbl> 20.14, 20.18, 20.18, 20.18, 30.53, 31.09, 31.12, 31.12,…
## $ D2             <dbl> 1.19, 1.15, 1.15, 1.15, 1.16, 7.85, 7.85, 7.85, 7.85, 1…
## $ D3             <dbl> 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0…
## $ D4             <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ Sum            <dbl> 100.00, 100.00, 100.00, 100.00, 100.00, 100.00, 100.01,…
## $ DSCI           <dbl> 80.15, 80.11, 80.11, 80.28, 105.19, 134.39, 134.43, 134…
## $ avg.DSCI.year  <dbl> 93.37167, 93.37167, 93.37167, 93.37167, 93.37167, 93.37…
## $ avg.DSCI.month <dbl> 80.1500, 96.0160, 96.0160, 96.0160, 96.0160, 96.0160, 1…
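
To illustrate why the choice of join matters, here is a minimal sketch on toy tables. The table names and the California values are made up for illustration; only the two Arizona DSCI values are taken from the output above. left_join() keeps every row of the left table and fills NA where the right table has no match, while inner_join() drops unmatched rows.

```r
library(dplyr)

# Toy weekly table (left side); the CA row is a made-up example value
weekly <- tibble(
  state = c("AZ", "AZ", "CA"),
  Year  = c("2022", "2023", "2023"),
  DSCI  = c(134.43, 80.15, 50)
)

# Toy yearly summary (right side) that is deliberately missing CA
yearly <- tibble(
  state = c("AZ", "AZ"),
  Year  = c("2022", "2023"),
  avg.DSCI.year = c(130, 93.37)
)

left  <- left_join(weekly, yearly, by = c("state", "Year"))
inner <- inner_join(weekly, yearly, by = c("state", "Year"))

nrow(left)   # 3: every weekly row is kept, CA gets NA
nrow(inner)  # 2: the unmatched CA row is dropped
```

In this report the summary tables are built from west itself, so every (state, Year) and (state, Year, Month) key has a match; the row count staying at 10,809 after both joins is consistent with that.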


Show your transformed table here. Use tools such as glimpse(), skim() or head() to illustrate your point.

# 1st research question data (overall US)
skim(drought_transformed)
Data summary
Name drought_transformed
Number of rows 62452
Number of columns 13
_______________________
Column type frequency:
character 4
Date 1
numeric 8
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
Year 0 1 4 4 0 24 0
Month 0 1 1 2 0 12 0
state 0 1 2 2 0 52 0
Region 0 1 4 8 0 2 0

Variable type: Date

skim_variable n_missing complete_rate min max median n_unique
Date 0 1 2000-02-08 2023-02-07 2011-08-09 1201

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
None 0 1 61.30 38.19 0.00 23.93 72.86 100.00 100.00 ▅▂▂▂▇
D0 0 1 16.67 20.01 0.00 0.00 10.04 26.64 100.00 ▇▂▁▁▁
D1 0 1 10.40 16.34 0.00 0.00 0.48 16.32 100.00 ▇▂▁▁▁
D2 0 1 6.88 14.17 0.00 0.00 0.00 6.68 100.00 ▇▁▁▁▁
D3 0 1 3.69 11.00 0.00 0.00 0.00 0.00 100.00 ▇▁▁▁▁
D4 0 1 1.07 5.87 0.00 0.00 0.00 0.00 87.99 ▇▁▁▁▁
Sum 0 1 100.00 0.01 99.98 100.00 100.00 100.00 101.64 ▇▁▁▁▁
DSCI 0 1 78.20 101.85 0.00 0.00 31.66 122.14 484.14 ▇▂▁▁▁
# 2nd research question data (West only)
skim(join_west)
Data summary
Name join_west
Number of rows 10809
Number of columns 15
_______________________
Column type frequency:
character 4
Date 1
numeric 10
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
Year 0 1 4 4 0 24 0
Month 0 1 1 2 0 12 0
state 0 1 2 2 0 9 0
Region 0 1 4 4 0 1 0

Variable type: Date

skim_variable n_missing complete_rate min max median n_unique
Date 0 1 2000-02-08 2023-02-07 2011-08-09 1201

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
None 0 1 35.81 36.55 0.00 0.03 22.26 68.37 100.00 ▇▂▂▂▃
D0 0 1 18.60 18.76 0.00 2.15 14.24 28.91 100.00 ▇▃▁▁▁
D1 0 1 17.29 17.66 0.00 1.74 12.78 27.33 100.00 ▇▃▁▁▁
D2 0 1 15.91 19.08 0.00 0.00 9.02 26.71 100.00 ▇▂▁▁▁
D3 0 1 9.59 16.75 0.00 0.00 0.00 13.95 100.00 ▇▂▁▁▁
D4 0 1 2.79 9.21 0.00 0.00 0.00 0.00 76.81 ▇▁▁▁▁
Sum 0 1 100.00 0.01 99.98 100.00 100.00 100.00 100.02 ▁▂▇▂▁
DSCI 0 1 153.26 124.54 0.00 38.92 129.93 251.06 469.63 ▇▅▃▃▁
avg.DSCI.year 0 1 153.26 110.88 1.43 58.64 127.18 244.34 431.46 ▇▃▅▃▁
avg.DSCI.month 0 1 153.26 123.95 0.00 40.49 129.03 250.55 466.13 ▇▅▃▃▁


Are the values what you expected for the variables? Why or Why not?

  • I expected the None, D0, D1, D2, D3, and D4 columns to sum to 100 (%), because they give the percent of each area classified in one of six levels of drought.
    • In the drought_transformed data, the Sum column is 100 for most rows and ranges from 99.98 to 101.64. The small deviations might be caused by measurement or rounding error.
    • In the join_west data, the Sum column ranges from 99.98 to 100.02, which is very close to 100.
  • DSCI should range from 0 to 500. This holds for both datasets: DSCI ranges from 0 to 484.14 in drought_transformed and from 0 to 469.63 in join_west.
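
A possible way to flag rows whose Sum is too far from 100 is to compare against a tolerance instead of testing for exact equality. The sketch below uses the minimum and maximum Sum values reported by skim() above; the tolerance of 0.05 is an assumption chosen for illustration.

```r
# Sum values taken from the skim() ranges above, plus an exact 100
sums <- c(99.98, 100.00, 100.02, 101.64)

# Assumed tolerance for rounding error (illustrative choice)
tol <- 0.05

# TRUE where the row deviates from 100 by more than the tolerance
off <- abs(sums - 100) > tol

sums[off]  # only 101.64 is flagged
```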


Visualizing and Summarizing the Data (15 points)

Use group_by() and summarize() to make a summary of the data here. The summary should be relevant to your research question

To answer research questions, I grouped data by state, Year, and Month in order, and then summarized the average DSCI values.

# first summary
summary_state <- drought_transformed %>% 
  select(Date:state,DSCI) %>% 
  group_by(state, Year) %>% 
  summarise(
    avg.DSCI = mean(DSCI)) %>% 
  mutate(
    Year = as.numeric(Year)) 
## `summarise()` has grouped output by 'state'. You can override using the
## `.groups` argument.
# check
glimpse(summary_state)
## Rows: 1,248
## Columns: 3
## Groups: state [52]
## $ state    <chr> "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK", "…
## $ Year     <dbl> 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2…
## $ avg.DSCI <dbl> 2.4893617, 0.2319231, 14.3100000, 13.5015385, 26.7409615, 4.0…
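
The message printed by summarise() above notes that the result is still grouped by state. A minimal sketch of the `.groups` argument it mentions, on a small made-up table (the values are illustrative only), is:

```r
library(dplyr)

# Made-up weekly values, for illustration only
toy <- tibble(
  state = c("AZ", "AZ", "CA", "CA"),
  Year  = c("2022", "2022", "2022", "2023"),
  DSCI  = c(100, 120, 60, 80)
)

# .groups = "drop" returns an ungrouped table and silences the message
by_year <- toy %>%
  group_by(state, Year) %>%
  summarise(avg.DSCI = mean(DSCI), .groups = "drop")

by_year
is_grouped_df(by_year)  # FALSE
```

Dropping the groups avoids surprises in later steps, since functions such as mutate() otherwise keep working within each state.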


# second summary ; already summarised() before
summary_west <- join_west %>% 
  select(Date:Region,DSCI:avg.DSCI.month) %>% 
  mutate(
    Year = as.numeric(Year),
    Month = as.numeric(Month))

# check
glimpse(summary_west)
## Rows: 10,809
## Columns: 8
## $ Date           <date> 2023-02-07, 2023-01-31, 2023-01-24, 2023-01-17, 2023-0…
## $ Year           <dbl> 2023, 2023, 2023, 2023, 2023, 2023, 2022, 2022, 2022, 2…
## $ Month          <dbl> 2, 1, 1, 1, 1, 1, 12, 12, 12, 12, 11, 11, 11, 11, 11, 1…
## $ state          <chr> "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "…
## $ Region         <chr> "West", "West", "West", "West", "West", "West", "West",…
## $ DSCI           <dbl> 80.15, 80.11, 80.11, 80.28, 105.19, 134.39, 134.43, 134…
## $ avg.DSCI.year  <dbl> 93.37167, 93.37167, 93.37167, 93.37167, 93.37167, 93.37…
## $ avg.DSCI.month <dbl> 80.1500, 96.0160, 96.0160, 96.0160, 96.0160, 96.0160, 1…


What are your findings about the summary? Are they what you expected?

Because my research questions are about change in drought severity by time for 23 years, I need to visualize the data to find answers.


Make at least two plots that help you answer your question on the transformed or summarized data. Use scales and/or labels to make each plot informative.

The first question was,
1. Do we have unusually severe drought in the United States over the past 23 years?

# Reference for following r codes: BSTA512 lecture notes by Dr.Meike

# plot for 2000
usmap_2000 <- plot_usmap(
  data = summary_state %>% filter(Year == 2000),
  regions = "state",
  values = "avg.DSCI")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2000")+
   theme(legend.position = "top")


# plot for 2007
usmap_2007 <- plot_usmap(
  data = summary_state %>% filter(Year == 2007),
  regions = "state",
  values = "avg.DSCI")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2007")+
   theme(legend.position = "top")


# plot for 2015
usmap_2015 <- plot_usmap(
  data = summary_state %>% filter(Year == 2015),
  regions = "state",
  values = "avg.DSCI")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2015")+
   theme(legend.position = "top")


# plot for 2022
usmap_2022 <- plot_usmap(
  data = summary_state %>% filter(Year == 2022),
  regions = "state",
  values = "avg.DSCI")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2022")+
   theme(legend.position = "top")
  


# arrange plots
gridExtra::grid.arrange(usmap_2000,usmap_2007,usmap_2015,usmap_2022,
                        ncol = 4)

To check how DSCI changed over time, I selected four years, roughly seven to eight years apart, from the years covered by the data (2000 to 2023): 2000, 2007, 2015, and 2022 (the most recent year with complete records). The first research question focuses on the United States as a whole, so plotting each state’s average yearly DSCI on a U.S. map shows the geographical differences over 23 years effectively.
In the plots, both the number of red states and the intensity of the red clearly increase over time, which suggests that drought severity has worsened in the United States. The heat maps also show a clear contrast between East and West. Thus, this data is consistent with the CNN article reporting that much of the western United States has been experiencing a historic and unrelenting drought, the worst in the region in centuries (CNN, Reference Article).

The second question was,
2. Does the West region in the United States, including Oregon, have unusually severe drought compared to the past?

# plot for 2000
westmap_2000 <- plot_usmap(
  include = c("AZ","CA","ID","MT","NM","NV","OR","UT","WA"),
  data = summary_west %>% filter(Year == 2000),
  regions = "state",
  values = "avg.DSCI.year")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2000")+
   theme(legend.position = "top")


# plot for 2007
westmap_2007 <- plot_usmap(
  include = c("AZ","CA","ID","MT","NM","NV","OR","UT","WA"),
  data = summary_west %>% filter(Year == 2007),
  regions = "state",
  values = "avg.DSCI.year")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2007")+
   theme(legend.position = "top")


# plot for 2015
westmap_2015 <- plot_usmap(
  include = c("AZ","CA","ID","MT","NM","NV","OR","UT","WA"),
  data = summary_west %>% filter(Year == 2015),
  regions = "state",
  values = "avg.DSCI.year")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2015")+
   theme(legend.position = "top")


# plot for 2022
westmap_2022 <- plot_usmap(
  include = c("AZ","CA","ID","MT","NM","NV","OR","UT","WA"),
  data = summary_west %>% filter(Year == 2022),
  regions = "state",
  values = "avg.DSCI.year")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2022")+
   theme(legend.position = "top")
  


# arrange plots
gridExtra::grid.arrange(westmap_2000,westmap_2007,westmap_2015,westmap_2022,
                        ncol = 4)

Based on the output for the first question, narrowing down the geographical region is necessary to answer the second question, so it is reasonable to focus on the West region only. I created four plots for the same years, 2000, 2007, 2015, and 2022 (the most recent year with complete records), on a map of the West region of the U.S. (Arizona, California, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington). Here all nine states become a deeper red over time, meaning that drought has worsened. Severe drought in California was already visible by 2007 at the latest. Oregon also shows increasing redness over time, which is consistent with the effects of climate change in this state.

# subset for three States, OR, CA, and WA
summary_west_3 <- summary_west %>% 
  filter(state %in% c("OR","CA","WA")) %>% 
  mutate(state = as.factor(state))


# Compare plot
ggplot(summary_west_3,
       aes(x = Date ,y = avg.DSCI.month, col = state, group = state))+
  geom_point(alpha=0.4)+
  geom_line()+
  theme_bw()+
  scale_color_viridis_d(name ="States")+
  facet_wrap(~state)+
  labs(
    title = "Average Monthly DSCI Trend over time by States",
    y = "Average Monthly DSCI",
    x = "Time")

Lastly, out of personal interest, I compared drought trends in three states: Oregon, California, and Washington. Using a subset of the data, I created a plot for each state showing the monthly average DSCI over 23 years. In the output above, all three states show an increasing trend over time. The increase in monthly average DSCI is largest in California (CA) and smallest in Washington (WA), with Oregon (OR) in between.


Final Summary (10 points)

Summarize your research question and findings below.

Answers for research questions are above with plots.


Are your findings what you expected? Why or Why not?

In summary, I found an increasing trend in average yearly DSCI values across the United States over 23 years. This was clear in the US heat maps, which became a deeper red over time, especially in the West region. This result was what I expected before starting the analysis.
The second research question focused on drought in the West region over the same time frame. The output supported the idea that severe drought in the West region has progressed over the past 23 years, which is also the conclusion I expected.
Finally, I chose three states in the West region, based on personal interest, to compare drought trends. Based on these time-series plots, there appears to be an increasing tendency over time in all three states, with different magnitudes. However, an appropriate statistical analysis of the time-series data is still needed to support this conclusion.