From 8977152b8dc46e9629e88c2e7d77fec4b12c9648 Mon Sep 17 00:00:00 2001
From: Tianyu Yao
Date: Wed, 4 May 2022 18:10:33 -0400
Subject: [PATCH 1/2] update 02

---
 02-data.Rmd | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/02-data.Rmd b/02-data.Rmd
index 60fdb58..c7aa6eb 100644
--- a/02-data.Rmd
+++ b/02-data.Rmd
@@ -6,7 +6,7 @@ Tianyu is responsible to determine the cryptocurencies we are focus on. Juntian
 
 ## Data Collection
 
-We use [coinmarketcap](https://coinmarketcap.com) api to verified top 5 cryptocurrencies by market cap, and then pulled each cryptocurrency dataset from yahoo finance, including 1 year long trading information (2021 May - 2022 May).
+We used the [coinmarketcap](https://coinmarketcap.com) API to verify the top 5 cryptocurrencies by market cap, and then pulled each cryptocurrency's dataset from [coingecko](https://www.coingecko.com), including historical trading information.
 
 ```{r}
 library(tidyverse)
@@ -26,26 +26,23 @@ We then explore multiple financial sources and yahoo finance is the most tangibl
 
 ## Dataset Information
 
-We downloaded 5 datasets and each corresponding to one of the major cryptocurrencies we observed above. Each dataset include 366 rows and 7 columns.
+We downloaded 5 datasets, each corresponding to one of the major cryptocurrencies we observed above. Each dataset includes 7 columns and up to 3292 rows.
 
 ### Format
 
 The format of each dataset:
 
-| Date | Open | High | Low | Close | Adj.Close | Volume |
-| :----: | :----: | :----: | :----: | :----: | :----: | :----: |
+| date | price | market_cap | total_volume |
+| :----: | :----: | :----: | :----: |
 | record |
 
 ### Column Details
 
-* Date: date of the crypto record
-* Open: open price
-* High: highest price
-* Close: close price
-* Adj.Close: close price after adjustment
-* Volume: the number of shares traded
+* date: date of the crypto record
+* price: trading price (USD)
+* market_cap: total market cap
+* total_volume: the number of shares traded
 
-(all price are in USD)
 
 ## Issue
 
From 40bb996ecf028d9a75df7471be0be0cf0beb164a Mon Sep 17 00:00:00 2001
From: Tianyu Yao
Date: Wed, 4 May 2022 18:44:34 -0400
Subject: [PATCH 2/2] update missing value analysis

---
 04-missing.Rmd | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/04-missing.Rmd b/04-missing.Rmd
index 66adb2b..2c9cdd4 100644
--- a/04-missing.Rmd
+++ b/04-missing.Rmd
@@ -5,8 +5,16 @@ library(redav)
 ```
 
 ```{r}
-plot_missing(df.close)
+plot_missing(df.cryto)
 ```
 
-Observed that there is no missing value.
+We observed that the volatility and return columns contain missing values. Counting the missing values in each column, we found:
+
+```{r}
+colSums(is.na(df.cryto)) %>%
+  sort(decreasing = TRUE)
+```
+
+
+Calculating volatility requires the preceding 15 days of data, and calculating return requires the preceding 1 day of data. Thus each cryptocurrency is missing one row of return and 15 rows of volatility. In total, 75 volatility values and 5 return values are missing.
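
Note on the missing-value counts in PATCH 2/2: the patch does not show how `volatility` and `return` are computed, so the following is only a minimal base-R sketch of why look-back windows would produce 75 and 5 missing values across 5 coins. The synthetic prices, the coin names, the `df_demo` data frame, and the definition of volatility as the standard deviation of the 15 most recent daily returns are all assumptions made for illustration, not the authors' code.

```r
# Illustrative only: synthetic prices, hypothetical coin names, and an assumed
# definition of volatility (sd of the 15 most recent daily returns).
set.seed(42)
n_days <- 40
window <- 15
coins  <- c("coin_a", "coin_b", "coin_c", "coin_d", "coin_e")

make_coin <- function(name) {
  price <- 100 * cumprod(1 + rnorm(n_days, mean = 0, sd = 0.02))

  # Daily return needs the previous day's price, so day 1 is NA.
  ret <- c(NA_real_, diff(price) / head(price, -1))

  # Rolling volatility over the trailing window. sd() yields NA while the
  # window still contains the NA return from day 1, so days 1..15 stay NA.
  vol <- rep(NA_real_, n_days)
  for (t in window:n_days) {
    vol[t] <- sd(ret[(t - window + 1):t])
  }

  data.frame(coin = name, day = seq_len(n_days), price = price,
             `return` = ret, volatility = vol)
}

df_demo <- do.call(rbind, lapply(coins, make_coin))

# Same counting idiom as in the patch, written without the pipe.
na_counts <- sort(colSums(is.na(df_demo)), decreasing = TRUE)
print(na_counts)
```

Under these assumptions each coin contributes 15 missing volatility values and 1 missing return, which matches the totals of 75 and 5 stated in the patch. A different window convention (for example, one that excludes the current day) would shift the per-coin count by one.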