Please submit your .Rmd and .html files in Sakai. If you are working together, both people should submit the files.
The goal of the midterm project is to showcase skills that you have learned in class so far. The midterm is open note, but if you use someone else’s code, you must attribute them.
Define your research question below. What about the data interests you? What is a specific question you want to find out about the data?
What topic did I choose?
When browsing the Tidy Tuesday data folders, I was drawn to the topic of US Drought because climate change has always been on my mind. In Oregon, we have had wildfires almost every year from late summer through fall, and snowstorms last winter.
What do I want to know?
I want to know how severe drought has been in the United States as a whole, in each of the 52 state-level units in the data, and by climate region. It would also be interesting to compare drought severity over a long period, both nationwide and in specific geographical areas.
Research questions!
So, I have two points to look at in this study.
1. Do we have unusually severe drought in the United States over the past 23 years?
2. Does the West region in the United States, including Oregon, have unusually severe drought compared to the past?
The data I used.
I extracted the data from the U.S. Drought Monitor site rather than using the Tidy Tuesday data, in order to set a specific time period and set of regions. The data source is the U.S. Drought Monitor.
To export the data from the website, I set the time period from Feb. 10, 2000 to Feb. 10, 2023 (23 years) and chose “States” as the spatial scale.
Given your question, what is your expectation about the data?
1. I expect that drought severity has, on average, worsened in the United States over the past 23 years. Still, there may be differences across geographical regions.
2. I also expect the data to show an unusually severe recent drought in the West region.
Load the data below and use dplyr::glimpse() or skimr::skim() on the data. You should upload the data file into the data directory.
# load packages (assumed to be attached in a setup chunk that is not shown)
library(tidyverse)
# load the data
drought <- read_csv("data/drought_by_state.csv")
## Rows: 62452 Columns: 11
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): StateAbbreviation
## dbl (8): MapDate, None, D0, D1, D2, D3, D4, StatisticFormatID
## date (2): ValidStart, ValidEnd
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# check
glimpse(drought)
## Rows: 62,452
## Columns: 11
## $ MapDate <dbl> 20230207, 20230131, 20230124, 20230117, 20230110, 20…
## $ StateAbbreviation <chr> "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK"…
## $ None <dbl> 100.00, 100.00, 100.00, 100.00, 100.00, 100.00, 100.…
## $ D0 <dbl> 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00…
## $ D1 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ D2 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ D3 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ D4 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ ValidStart <date> 2023-02-07, 2023-01-31, 2023-01-24, 2023-01-17, 202…
## $ ValidEnd <date> 2023-02-13, 2023-02-06, 2023-01-30, 2023-01-23, 202…
## $ StatisticFormatID <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
If there are any quirks that you have to deal with, such as NA coded as something else or multiple tables, please make some notes here about what you need to do before you start transforming the data in the next section.
In the original dataset, there are 62,452 rows and 11 columns.
ValidStart and ValidEnd are the start and end dates of each weekly record. MapDate takes the same value as ValidStart but is stored as a number. Since ValidStart is already a date, I'll use it as the date column instead of converting MapDate.
StateAbbreviation identifies each of the 52 state-level units in the data with a two-letter code. For simplicity, I'll change the column name to state. Then I'll make a new column, Region, to classify each state as being in the “West” region or not, because my second research question focuses on the “West” region.
For the climate regions of the entire US, we can refer to the explanation below from the National Centers for Environmental Information.
Region could be classified into six climate regions in the United States, but here I'll classify it into two categories, “West” or “Non West”.
West: Arizona, California, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington. (9 states)
Non West: all other states in the data.
The None, D0, D1, D2, D3, and D4 columns hold percent (%) values: the share of each state's area that falls in each drought category. Therefore, the sum of these values in each observation should be 100 (%). I will make a new column to check whether the sum is equal to 100.
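Because the percentages are rounded, an exact comparison with 100 can flag rows that are fine in practice. The following is a minimal sketch of a tolerance-based check. The toy rows are made-up values that only reuse the column names of the data, and the tolerance `tol` is an assumption, not a value from the U.S. Drought Monitor.

```r
library(dplyr)

# Hypothetical toy rows (made-up values) with the same column names as `drought`
toy <- tibble(
  None = c(100, 42.37, 12.40, 51.64),
  D0   = c(0,   36.30, 48.64, 50.00),
  D1   = c(0,   20.14, 31.12, 0),
  D2   = c(0,    1.19,  7.85, 0),
  D3   = c(0,    0,     0,    0),
  D4   = c(0,    0,     0,    0)
)

tol <- 0.05  # assumed tolerance for rounding error, in percentage points

toy_checked <- toy %>%
  mutate(
    Sum    = None + D0 + D1 + D2 + D3 + D4,
    sum_ok = abs(Sum - 100) <= tol  # TRUE when the row sums to 100 within `tol`
  )

# Rows that fall outside the tolerance and may need a closer look
filter(toy_checked, !sum_ok)
```

Filtering on `sum_ok` keeps the check explicit and makes it easy to list the rows that deviate, rather than only reading the range of `Sum` from a summary.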
None: No drought
D0: Abnormally Dry
D1: Moderate Drought
D2: Severe Drought
D3: Extreme Drought
D4: Exceptional Drought
NAs
# check NA's
visdat::vis_miss(drought)
Drought Severity and Coverage Index (DSCI)
The Drought Severity and Coverage Index (DSCI) is a weighted sum of the percentage of an area in each level of drought, summarizing the extent and severity of drought with a single number each week on a scale from 0 (no drought) to 500 (all of the area in the worst category of drought). It can be computed as follows:
DSCI = 0(None) + 1(D0) + 2(D1) + 3(D2) + 4(D3) + 5(D4)
Since there is no DSCI column in the original dataset, I will make a new column, DSCI, for analysis.
[Image Reference] U.S.Drought Monitor.
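As a worked example of the formula, the sketch below computes DSCI for the two extreme cases and for one Arizona row whose percentages appear in the glimpse() output of join_west later in this report. The helper name `dsci()` is hypothetical and exists only for this illustration.

```r
# Weights for each drought category, as in the formula above
dsci_weights <- c(None = 0, D0 = 1, D1 = 2, D2 = 3, D3 = 4, D4 = 5)

# Hypothetical helper: weighted sum of the category percentages
dsci <- function(pct) {
  sum(dsci_weights * pct[names(dsci_weights)])
}

no_drought <- c(None = 100, D0 = 0, D1 = 0, D2 = 0, D3 = 0, D4 = 0)
all_d4     <- c(None = 0,   D0 = 0, D1 = 0, D2 = 0, D3 = 0, D4 = 100)
# Arizona, week of 2023-02-07 (percentages taken from the join_west output)
az_row     <- c(None = 42.37, D0 = 36.30, D1 = 20.14, D2 = 1.19, D3 = 0, D4 = 0)

dsci(no_drought)  # lower bound of the scale
dsci(all_d4)      # upper bound of the scale
dsci(az_row)      # 1*36.30 + 2*20.14 + 3*1.19
```

The Arizona row gives 36.30 + 40.28 + 3.57 = 80.15, which matches the DSCI value reported for that week.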
Make sure your data types are correct!
I see no problems with the data types: the drought category columns are numeric, and ValidStart and ValidEnd are already parsed as dates.
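MapDate is the one column whose type does not match its meaning: it is a date stored as a number such as 20230207. The transformation in this report takes its dates from ValidStart, but if MapDate had to be converted, a minimal sketch with lubridate::ymd() could look like this (the three values are copied from the glimpse() output above):

```r
library(lubridate)

# MapDate values as they appear in the raw data (numeric, yyyymmdd)
map_date <- c(20230207, 20230131, 20230124)

# ymd() parses numbers or strings in year-month-day order into Date values
map_date_parsed <- ymd(map_date)

class(map_date_parsed)  # "Date"
year(map_date_parsed)   # year component, usable for grouping
month(map_date_parsed)  # month component
```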
If the data needs to be transformed in any way (values recoded, pivoted, etc), do it here. Examples include transforming a continuous variable into a categorical using case_when(), etc.
# mutate
drought_transformed <- drought %>%
  mutate(
    Date = ValidStart,
    Year = year(Date),              # year() and month() come from lubridate
    Year = as.character(Year),      # extract `Year` and make it a character value
    Month = month(Date),
    Month = as.character(Month),    # extract `Month` and make it a character value
    state = StateAbbreviation,      # copy the column under a shorter name
    Region = case_when(
      state %in% c("AZ","CA","ID","MT","NV","NM","OR","UT","WA") ~ "West", # focus on West region
      TRUE ~ "Non West"),
    Sum = None+D0+D1+D2+D3+D4,      # check `Sum` is 100(%)
    DSCI = 0*None+1*D0+2*D1+3*D2+4*D3+5*D4) %>% # calculate the `DSCI` for each observation
  select(Date:Region,None:D4,Sum,DSCI)
# check
head(drought_transformed)
## # A tibble: 6 × 13
## Date Year Month state Region None D0 D1 D2 D3 D4 Sum
## <date> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2023-02-07 2023 2 AK Non We… 100 0 0 0 0 0 100
## 2 2023-01-31 2023 1 AK Non We… 100 0 0 0 0 0 100
## 3 2023-01-24 2023 1 AK Non We… 100 0 0 0 0 0 100
## 4 2023-01-17 2023 1 AK Non We… 100 0 0 0 0 0 100
## 5 2023-01-10 2023 1 AK Non We… 100 0 0 0 0 0 100
## 6 2023-01-03 2023 1 AK Non We… 100 0 0 0 0 0 100
## # … with 1 more variable: DSCI <dbl>
Bonus points (5 points) for datasets that require merging of tables, but only if you reason through whether you should use left_join, inner_join, or right_join on these tables. No credit will be provided if you don’t.
# West region only
west <- drought_transformed %>%
  filter(Region == "West")
# check
west %>% tabyl(Region)
## Region n percent
## West 10809 1
# average DSCI by Year
avg_DSCI_year_west <- west %>%
  select(Year:Region,DSCI) %>%
  group_by(state, Year) %>%
  summarise(
    avg.DSCI.year = mean(DSCI))
## `summarise()` has grouped output by 'state'. You can override using the
## `.groups` argument.
# average DSCI by Year and Month
avg_DSCI_month_west <- west %>%
  select(Year:Region,DSCI) %>%
  group_by(state, Year, Month) %>%
  summarise(
    avg.DSCI.month = mean(DSCI))
## `summarise()` has grouped output by 'state', 'Year'. You can override using the
## `.groups` argument.
# check
glimpse(avg_DSCI_year_west);glimpse(avg_DSCI_month_west)
## Rows: 216
## Columns: 3
## Groups: state [9]
## $ state <chr> "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "A…
## $ Year <chr> "2000", "2001", "2002", "2003", "2004", "2005", "2006", …
## $ avg.DSCI.year <dbl> 77.734468, 2.011538, 306.964340, 315.805769, 306.002115,…
## Rows: 2,493
## Columns: 4
## Groups: state, Year [216]
## $ state <chr> "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "…
## $ Year <chr> "2000", "2000", "2000", "2000", "2000", "2000", "2000",…
## $ Month <chr> "10", "11", "12", "2", "3", "4", "5", "6", "7", "8", "9…
## $ avg.DSCI.month <dbl> 86.1760, 0.0000, 0.0000, 122.6400, 65.6100, 50.0325, 10…
# left_join west with avg_DSCI_year_west
join_west <- left_join(west, avg_DSCI_year_west, by = c("state","Year"))
# left_join with avg_DSCI_month_west
join_west <- left_join(join_west, avg_DSCI_month_west, by = c("state","Year","Month"))
# check
glimpse(join_west)
## Rows: 10,809
## Columns: 15
## $ Date <date> 2023-02-07, 2023-01-31, 2023-01-24, 2023-01-17, 2023-0…
## $ Year <chr> "2023", "2023", "2023", "2023", "2023", "2023", "2022",…
## $ Month <chr> "2", "1", "1", "1", "1", "1", "12", "12", "12", "12", "…
## $ state <chr> "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "…
## $ Region <chr> "West", "West", "West", "West", "West", "West", "West",…
## $ None <dbl> 42.37, 42.37, 42.37, 42.20, 27.66, 12.40, 12.40, 12.40,…
## $ D0 <dbl> 36.30, 36.30, 36.30, 36.47, 40.65, 48.66, 48.64, 48.64,…
## $ D1 <dbl> 20.14, 20.18, 20.18, 20.18, 30.53, 31.09, 31.12, 31.12,…
## $ D2 <dbl> 1.19, 1.15, 1.15, 1.15, 1.16, 7.85, 7.85, 7.85, 7.85, 1…
## $ D3 <dbl> 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0…
## $ D4 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ Sum <dbl> 100.00, 100.00, 100.00, 100.00, 100.00, 100.00, 100.01,…
## $ DSCI <dbl> 80.15, 80.11, 80.11, 80.28, 105.19, 134.39, 134.43, 134…
## $ avg.DSCI.year <dbl> 93.37167, 93.37167, 93.37167, 93.37167, 93.37167, 93.37…
## $ avg.DSCI.month <dbl> 80.1500, 96.0160, 96.0160, 96.0160, 96.0160, 96.0160, 1…
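On the choice of join: left_join() keeps every row of the left table, so no weekly observation in west can be dropped. Because the summary tables were computed from west itself, every key has a match, and inner_join() would return the same rows here. The sketch below illustrates this on made-up toy data (the `toy_*` names and values are hypothetical) and shows that a grouped mutate() gives the same column without a join.

```r
library(dplyr)

# Hypothetical toy version of `west` (made-up DSCI values)
toy_west <- tibble(
  state = c("OR", "OR", "OR", "CA", "CA"),
  Year  = c("2021", "2021", "2022", "2021", "2022"),
  DSCI  = c(200, 300, 250, 400, 350)
)

# Yearly averages, one row per state and year
toy_avg <- toy_west %>%
  group_by(state, Year) %>%
  summarise(avg.DSCI.year = mean(DSCI), .groups = "drop")

# left_join keeps all rows of the left table and attaches the matching average
toy_joined <- left_join(toy_west, toy_avg, by = c("state", "Year"))

# The join should neither add nor drop observations
stopifnot(nrow(toy_joined) == nrow(toy_west))

# Alternative without a join: compute the group mean in place
toy_mutated <- toy_west %>%
  group_by(state, Year) %>%
  mutate(avg.DSCI.year = mean(DSCI)) %>%
  ungroup()
```

The row count of join_west in the output above (10,809, the same as west) is consistent with this reasoning.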
Show your transformed table here. Use tools such as glimpse(), skim() or head() to illustrate your point.
# 1st research question data (overall US)
skim(drought_transformed)
Name | drought_transformed |
Number of rows | 62452 |
Number of columns | 13 |
_______________________ | |
Column type frequency: | |
character | 4 |
Date | 1 |
numeric | 8 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
Year | 0 | 1 | 4 | 4 | 0 | 24 | 0 |
Month | 0 | 1 | 1 | 2 | 0 | 12 | 0 |
state | 0 | 1 | 2 | 2 | 0 | 52 | 0 |
Region | 0 | 1 | 4 | 8 | 0 | 2 | 0 |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
---|---|---|---|---|---|---|
Date | 0 | 1 | 2000-02-08 | 2023-02-07 | 2011-08-09 | 1201 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
None | 0 | 1 | 61.30 | 38.19 | 0.00 | 23.93 | 72.86 | 100.00 | 100.00 | ▅▂▂▂▇ |
D0 | 0 | 1 | 16.67 | 20.01 | 0.00 | 0.00 | 10.04 | 26.64 | 100.00 | ▇▂▁▁▁ |
D1 | 0 | 1 | 10.40 | 16.34 | 0.00 | 0.00 | 0.48 | 16.32 | 100.00 | ▇▂▁▁▁ |
D2 | 0 | 1 | 6.88 | 14.17 | 0.00 | 0.00 | 0.00 | 6.68 | 100.00 | ▇▁▁▁▁ |
D3 | 0 | 1 | 3.69 | 11.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100.00 | ▇▁▁▁▁ |
D4 | 0 | 1 | 1.07 | 5.87 | 0.00 | 0.00 | 0.00 | 0.00 | 87.99 | ▇▁▁▁▁ |
Sum | 0 | 1 | 100.00 | 0.01 | 99.98 | 100.00 | 100.00 | 100.00 | 101.64 | ▇▁▁▁▁ |
DSCI | 0 | 1 | 78.20 | 101.85 | 0.00 | 0.00 | 31.66 | 122.14 | 484.14 | ▇▂▁▁▁ |
# 2nd research question data (West only)
skim(join_west)
Name | join_west |
Number of rows | 10809 |
Number of columns | 15 |
_______________________ | |
Column type frequency: | |
character | 4 |
Date | 1 |
numeric | 10 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
Year | 0 | 1 | 4 | 4 | 0 | 24 | 0 |
Month | 0 | 1 | 1 | 2 | 0 | 12 | 0 |
state | 0 | 1 | 2 | 2 | 0 | 9 | 0 |
Region | 0 | 1 | 4 | 4 | 0 | 1 | 0 |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
---|---|---|---|---|---|---|
Date | 0 | 1 | 2000-02-08 | 2023-02-07 | 2011-08-09 | 1201 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
None | 0 | 1 | 35.81 | 36.55 | 0.00 | 0.03 | 22.26 | 68.37 | 100.00 | ▇▂▂▂▃ |
D0 | 0 | 1 | 18.60 | 18.76 | 0.00 | 2.15 | 14.24 | 28.91 | 100.00 | ▇▃▁▁▁ |
D1 | 0 | 1 | 17.29 | 17.66 | 0.00 | 1.74 | 12.78 | 27.33 | 100.00 | ▇▃▁▁▁ |
D2 | 0 | 1 | 15.91 | 19.08 | 0.00 | 0.00 | 9.02 | 26.71 | 100.00 | ▇▂▁▁▁ |
D3 | 0 | 1 | 9.59 | 16.75 | 0.00 | 0.00 | 0.00 | 13.95 | 100.00 | ▇▂▁▁▁ |
D4 | 0 | 1 | 2.79 | 9.21 | 0.00 | 0.00 | 0.00 | 0.00 | 76.81 | ▇▁▁▁▁ |
Sum | 0 | 1 | 100.00 | 0.01 | 99.98 | 100.00 | 100.00 | 100.00 | 100.02 | ▁▂▇▂▁ |
DSCI | 0 | 1 | 153.26 | 124.54 | 0.00 | 38.92 | 129.93 | 251.06 | 469.63 | ▇▅▃▃▁ |
avg.DSCI.year | 0 | 1 | 153.26 | 110.88 | 1.43 | 58.64 | 127.18 | 244.34 | 431.46 | ▇▃▅▃▁ |
avg.DSCI.month | 0 | 1 | 153.26 | 123.95 | 0.00 | 40.49 | 129.03 | 250.55 | 466.13 | ▇▅▃▃▁ |
Are the values what you expected for the variables? Why or Why not?
The None, D0, D1, D2, D3, and D4 columns should sum to 100 (%) because they give the share of the area classified in each of the six levels of drought.
In the drought_transformed data, the Sum column is centered on 100 and ranges from 99.98 to 101.64; the deviations may be caused by measurement or rounding error.
In the join_west data, the Sum column is centered on 100 and ranges from 99.98 to 100.02, which is fairly close to 100.
DSCI should be between 0 and 500, and the skim() output confirms this for both datasets (maximum 484.14 and 469.63).
Use group_by() and summarize() to make a summary of the data here. The summary should be relevant to your research question
To answer the research questions, I grouped the data by state, Year, and Month, in that order, and then summarized the average DSCI values.
# first summary
summary_state <- drought_transformed %>%
  select(Date:state,DSCI) %>%
  group_by(state, Year) %>%
  summarise(
    avg.DSCI = mean(DSCI)) %>%
  mutate(
    Year = as.numeric(Year))
## `summarise()` has grouped output by 'state'. You can override using the
## `.groups` argument.
# check
glimpse(summary_state)
## Rows: 1,248
## Columns: 3
## Groups: state [52]
## $ state <chr> "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK", "…
## $ Year <dbl> 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2…
## $ avg.DSCI <dbl> 2.4893617, 0.2319231, 14.3100000, 13.5015385, 26.7409615, 4.0…
# second summary; the averages were already computed with summarise() above
summary_west <- join_west %>%
  select(Date:Region,DSCI:avg.DSCI.month) %>%
  mutate(
    Year = as.numeric(Year),
    Month = as.numeric(Month))
# check
glimpse(summary_west)
## Rows: 10,809
## Columns: 8
## $ Date <date> 2023-02-07, 2023-01-31, 2023-01-24, 2023-01-17, 2023-0…
## $ Year <dbl> 2023, 2023, 2023, 2023, 2023, 2023, 2022, 2022, 2022, 2…
## $ Month <dbl> 2, 1, 1, 1, 1, 1, 12, 12, 12, 12, 11, 11, 11, 11, 11, 1…
## $ state <chr> "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "…
## $ Region <chr> "West", "West", "West", "West", "West", "West", "West",…
## $ DSCI <dbl> 80.15, 80.11, 80.11, 80.28, 105.19, 134.39, 134.43, 134…
## $ avg.DSCI.year <dbl> 93.37167, 93.37167, 93.37167, 93.37167, 93.37167, 93.37…
## $ avg.DSCI.month <dbl> 80.1500, 96.0160, 96.0160, 96.0160, 96.0160, 96.0160, 1…
What are your findings about the summary? Are they what you expected?
Because my research questions are about how drought severity changed over 23 years, I need to visualize the data to find answers.
Make at least two plots that help you answer your question on the transformed or summarized data. Use scales and/or labels to make each plot informative.
The first question was,
1. Do we have unusually severe drought in the United States over
the past 23 years?
# Reference for the following R code: BSTA512 lecture notes by Dr. Meike
# plot for 2000
usmap_2000 <- plot_usmap(
  data = summary_state %>% filter(Year == 2000),
  regions = "state",
  values = "avg.DSCI")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2000")+
  theme(legend.position = "top")
# plot for 2007
usmap_2007 <- plot_usmap(
  data = summary_state %>% filter(Year == 2007),
  regions = "state",
  values = "avg.DSCI")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2007")+
  theme(legend.position = "top")
# plot for 2015
usmap_2015 <- plot_usmap(
  data = summary_state %>% filter(Year == 2015),
  regions = "state",
  values = "avg.DSCI")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2015")+
  theme(legend.position = "top")
# plot for 2022
usmap_2022 <- plot_usmap(
  data = summary_state %>% filter(Year == 2022),
  regions = "state",
  values = "avg.DSCI")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2022")+
  theme(legend.position = "top")
# arrange plots
gridExtra::grid.arrange(usmap_2000,usmap_2007,usmap_2015,usmap_2022,
                        ncol = 4)
To check the change in DSCI over time, I selected four years spaced seven to eight years apart within the range of the original data (2000 to 2023): 2000, 2007, 2015, and 2022 (the most recent year with complete records). The first research question concerns the United States as a whole, so plotting each state's yearly average DSCI on a U.S. map shows the geographical differences over 23 years effectively.
According to the plots, both the number of states shaded red and the intensity of the shading have clearly increased. This suggests that drought severity has worsened across the United States. The maps also show a clear contrast between East and West. Thus, the data is consistent with a CNN article reporting that much of the western United States has been experiencing a historic and unrelenting drought, the worst in the region in centuries.
Reference Article
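One caveat when comparing shading across the four maps: each plot builds its own fill scale from that year's data, so the same shade can stand for different DSCI values in different years. The sketch below shows one possible way to share fill limits across years. It assumes the usmap and ggplot2 packages, uses a made-up toy stand-in for summary_state, introduces a hypothetical helper `plot_dsci_year()`, and swaps in scale_fill_gradient() for scale_fill_gradient2() because DSCI has no negative values.

```r
library(dplyr)
library(ggplot2)
library(usmap)

# Hypothetical toy stand-in for `summary_state` (values are made up)
toy_summary <- tibble(
  state    = rep(c("OR", "CA", "WA"), times = 2),
  Year     = rep(c(2000, 2022), each = 3),
  avg.DSCI = c(20, 60, 10, 250, 330, 120)
)

# DSCI is bounded by construction, so 0 to 500 is a natural common range
fill_limits <- c(0, 500)

# Hypothetical helper: one map per year, same fill scale every time
plot_dsci_year <- function(data, year, limits = fill_limits) {
  plot_usmap(
    data = filter(data, Year == year),
    regions = "states",
    values = "avg.DSCI"
  ) +
    scale_fill_gradient(
      low = "white", high = "darkred",
      limits = limits,
      name = paste("DSCI in", year)
    ) +
    theme(legend.position = "top")
}

# One map per selected year; states missing from the toy data are drawn as NA
maps <- lapply(c(2000, 2022), plot_dsci_year, data = toy_summary)
```

With shared limits, a darker state in a later map does mean a higher average DSCI than a lighter state in an earlier map. The helper also replaces the four near-identical plotting blocks with one function call per year.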
The second question was,
2. Does the West region in the United States, including Oregon,
have unusually severe drought compared to the past?
# plot for 2000
westmap_2000 <- plot_usmap(
  include = c("AZ","CA","ID","MT","NM","NV","OR","UT","WA"),
  data = summary_west %>% filter(Year == 2000),
  regions = "state",
  values = "avg.DSCI.year")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2000")+
  theme(legend.position = "top")
# plot for 2007
westmap_2007 <- plot_usmap(
  include = c("AZ","CA","ID","MT","NM","NV","OR","UT","WA"),
  data = summary_west %>% filter(Year == 2007),
  regions = "state",
  values = "avg.DSCI.year")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2007")+
  theme(legend.position = "top")
# plot for 2015
westmap_2015 <- plot_usmap(
  include = c("AZ","CA","ID","MT","NM","NV","OR","UT","WA"),
  data = summary_west %>% filter(Year == 2015),
  regions = "state",
  values = "avg.DSCI.year")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2015")+
  theme(legend.position = "top")
# plot for 2022
westmap_2022 <- plot_usmap(
  include = c("AZ","CA","ID","MT","NM","NV","OR","UT","WA"),
  data = summary_west %>% filter(Year == 2022),
  regions = "state",
  values = "avg.DSCI.year")+
  scale_fill_gradient2(
    low = "white",
    high = "darkred",
    name = "DSCI in 2022")+
  theme(legend.position = "top")
# arrange plots
gridExtra::grid.arrange(westmap_2000,westmap_2007,westmap_2015,westmap_2022,
                        ncol = 4)
Based on the output for the first question, narrowing down the geographical region is necessary to answer the second question, so it is reasonable to focus on the West region only. I created four plots for the same years, 2000, 2007, 2015, and 2022 (the most recent year with complete records), on a map of the West region of the U.S. (Arizona, California, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington). This time we can see that all nine states have become darker red, indicating worsening drought over time. Severe drought in California had already begun by 2007 at the latest. In addition, Oregon shows increasingly dark shading over time, which is consistent with the effects of climate change in this state.
# subset for three States, OR, CA, and WA
summary_west_3 <- summary_west %>%
  filter(state %in% c("OR","CA","WA")) %>%
  mutate(state = as.factor(state))
# Compare plot
ggplot(summary_west_3,
       aes(x = Date, y = avg.DSCI.month, col = state, group = state))+
  geom_point(alpha=0.4)+
  geom_line()+
  theme_bw()+
  scale_color_viridis_d(name = "States")+
  facet_wrap(~state)+
  labs(
    title = "Average Monthly DSCI Trend over time by States",
    y = "Average Monthly DSCI",
    x = "Time")  # lowercase `x`: labs() matches aesthetic names, so `X` would not label the x-axis
Lastly, out of personal interest, I compared drought trends in three states: Oregon, California, and Washington. Using a subset of the data, I created a plot for each state showing monthly average DSCI over 23 years. In the output above, all three states show an increasing trend over time. The increase in monthly average DSCI is largest in California (CA) and smallest in Washington (WA), with Oregon (OR) in between.
Summarize your research question and findings below.
The answers to the research questions are given above, together with the plots.
Are your findings what you expected? Why or Why not?
In summary, I found an increasing trend in yearly average DSCI values in the United States over 23 years. This was clear in the US maps, which became darker red over time, especially in the West region. This result was what I expected before starting the analysis.
The second research question focused on drought in the West region over the same time frame. The output showed severe drought in the West region progressing over the past 20 years, which is also what I expected.
Finally, I chose three states in the West region based on personal interest to compare drought trends. Based on these time-series plots, DSCI appears to increase over time in all three states, with different magnitudes. However, an appropriate statistical analysis of the time-series data is still needed to support this conclusion.