add eda_caribou
perlatex committed Jun 26, 2020
1 parent 4c7a90a commit 413af02
Showing 4 changed files with 96 additions and 3 deletions.
63 changes: 62 additions & 1 deletion eda_caribou.Rmd
@@ -1,5 +1,6 @@
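# Exploratory Data Analysis 6 {#eda06}

This chapter analyzes **tracking data for woodland caribou** in British Columbia, Canada. The data covers 260 caribou from 1988 to 2016, with nearly 250,000 location tags.

## Caribou location tracking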

@@ -8,11 +9,24 @@ knitr::include_graphics("images/caribou_location.png")
```


[Caribou location tracking data](https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-06-23/readme.md), which contains two datasets
Details about the data are available [here](https://github.com/tacookson/data/tree/master/caribou-location-tracking); it contains two datasets


```{r, eval=FALSE}
# devtools::install_github("thebioengineer/tidytuesdayR")
library(tidytuesdayR)
tuesdata <- tidytuesdayR::tt_load('2020-06-23')
# or
# tuesdata <- tidytuesdayR::tt_load(2020, week = 26)
```



```{r message=FALSE, warning=FALSE}
library(tidyverse)
library(lubridate)
library(gganimate)
individuals <- readr::read_csv('./demo_data/caribou/individuals.csv')
locations <- readr::read_csv('./demo_data/caribou/locations.csv')
@@ -47,6 +61,10 @@ individuals %>%
```


## Sex ratio


## Top 10 most active caribou at each study site


## Caribou activity information
@@ -98,6 +116,10 @@ example_animal %>%
labs(title = "A little caribou running all over the place")
```

## Seasonal patterns
Let's look at how caribou move in summer and in winter.


## Migration speed

```{r}
@@ -139,3 +161,42 @@ example_animal %>%
```

## More

```{r}
df <- locations %>%
  # keep one study site and one calendar year
  filter(study_site == "Graham",
         year(timestamp) == 2002) %>%
  group_by(animal_id) %>%
  # keep only animals tracked from the first to the last day of 2002
  filter(as_date(min(timestamp)) == "2002-01-01",
         as_date(max(timestamp)) == "2002-12-31") %>%
  ungroup() %>%
  mutate(date = as_date(timestamp)) %>%
  group_by(animal_id, date) %>%
  # daily centroid of each animal's recorded positions
  summarise(longitude_centroid = mean(longitude),
            latitude_centroid = mean(latitude)) %>%
  ungroup() %>%
  # add rows for days without a fix, then carry the last known position forward
  complete(animal_id, date) %>%
  arrange(animal_id, date) %>%
  fill(longitude_centroid, latitude_centroid, .direction = "down")
```
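
The `complete()` and `fill()` steps that end the pipeline above give every animal a position on every date, presumably so that each animation frame has a point for each animal. The following is a minimal sketch of that gap-filling on made-up data. The `toy` table and its values are hypothetical, and the extra `group_by(animal_id)` is an added safeguard: the pipeline above can do without it because its filter keeps only animals that already have a fix on 2002-01-01.

```r
library(dplyr)
library(tidyr)

# Hypothetical daily centroids for two animals, with missing days
toy <- tibble(
  animal_id = c("A", "A", "B"),
  date = as.Date(c("2002-01-01", "2002-01-03", "2002-01-02")),
  longitude_centroid = c(-122.0, -122.4, -121.5),
  latitude_centroid = c(56.0, 56.2, 55.8)
)

filled <- toy %>%
  complete(animal_id, date) %>%   # one row per animal for every date seen in the data
  arrange(animal_id, date) %>%
  group_by(animal_id) %>%         # keeps fill() from carrying values across animals
  fill(longitude_centroid, latitude_centroid, .direction = "down") %>%
  ungroup()

filled
```

In this sketch animal B has no position on its first date, so that row stays `NA`: filling downward only copies earlier values, never later ones.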


```{r}
p <- df %>%
  ggplot(aes(longitude_centroid, latitude_centroid, colour = animal_id)) +
  geom_point(size = 2) +
  coord_map() +
  theme_void() +
  theme(legend.position = "none") +
  transition_time(time = date) +
  shadow_mark(alpha = 0.2, size = 0.8) +
  ggtitle("Caribou location on {frame_time}")
p
```


```{r}
p
```

13 changes: 12 additions & 1 deletion forcats.Rmd
@@ -1,8 +1,15 @@
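# Factors {#forcats}

This chapter introduces factor data in R.
This chapter introduces factor data in R. Factors are widely used in data processing and visualization.

## What is a factor

R calls data that represents categories a factor; for example, people can be divided into men and women. Because factors are often more convenient to work with than strings, before R 4.0 functions such as `data.frame()` converted strings to factors by default. That default also caused some inconvenience, so it was dropped in R 4.0. The tidyverse includes a package dedicated to factors, `forcats`, so this chapter is built around `forcats`. For more, see [here](https://r4ds.had.co.nz/factors.html)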
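
As a small illustration of the paragraph above, the sketch below uses only base R. The `gender` vector and the data frame names are made up for this example; the `class()` result for `df_chr` assumes R 4.0.0 or later.

```r
# Hypothetical categorical values stored as plain strings
gender <- c("male", "female", "female", "male")

# factor() records the distinct categories as levels (sorted alphabetically by default)
gender_fct <- factor(gender)
levels(gender_fct)

# Since R 4.0.0, data.frame() leaves strings as character unless told otherwise
df_chr <- data.frame(gender = gender)
df_fct <- data.frame(gender = gender, stringsAsFactors = TRUE)

class(df_chr$gender)
class(df_fct$gender)
```

Passing `stringsAsFactors = TRUE` reproduces the pre-4.0 behavior explicitly, which can help when running older scripts.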


https://www.cnblogs.com/ljhdo/p/4911110.html
https://r4ds.had.co.nz/factors.html

```{r}
heights <- data.frame(
@@ -18,3 +25,7 @@ class(heights$gender)
## Creating factors

## Factor levels

## Reordering factors

## Modifying factor levels
3 changes: 2 additions & 1 deletion ggplot2_gganimate.Rmd
@@ -457,7 +457,8 @@ tribble(

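### Common approach

I usually save animations in gif format; the approach is similar to `ggsave()`
Animations are usually saved in gif format with `anim_save()`, which works much like `ggsave()`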

```{r, eval=F}
animation_to_save <- diamonds %>%
ggplot(aes(carat, price)) +
20 changes: 20 additions & 0 deletions tidyeval.Rmd
@@ -70,6 +70,26 @@ grouped_mean <- function(data, group_var, summary_var) {
grouped_mean(mtcars, cyl, mpg)
```


Since dplyr 1.0, this can be written as follows:

```{r, eval=FALSE}
sum_group_vars <- function(df,
                           group_vars,
                           sum_vars) {
  df %>%
    group_by(across({{ group_vars }})) %>%
    summarise(
      n = n(),
      across({{ sum_vars }},
             list(mean = mean, sd = sd))
    )
}
sum_group_vars(mpg, c(model, year), c(hwy, cty))
```
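
To see what this pattern returns, here is a self-contained sketch of the same idea run on the built-in `mtcars` data. The function name `summarise_by` and the choice of columns are assumptions made for this illustration, and `.groups = "drop"` is added only so the result comes back ungrouped.

```r
library(dplyr)

# Same embrace-inside-across() pattern, under a hypothetical name
summarise_by <- function(df, group_vars, sum_vars) {
  df %>%
    group_by(across({{ group_vars }})) %>%
    summarise(
      n = n(),
      across({{ sum_vars }}, list(mean = mean, sd = sd)),
      .groups = "drop"
    )
}

# Several grouping columns and several summary columns, passed unquoted via c()
res <- summarise_by(mtcars, c(cyl, am), c(mpg, hp))
names(res)
```

Each summarised column gets one output column per function, named `<column>_<function>`, for example `mpg_mean` and `hp_sd`.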


Next we explain why it is written this way.

