diff --git a/brush-and-link.md b/brush-and-link.md
new file mode 100644
index 0000000..4404489
--- /dev/null
+++ b/brush-and-link.md
@@ -0,0 +1,21526 @@
+---
+title: Brush and link
+keywords: vega-in-r
+sidebar: vega-in-r_sidebar
+permalink: /vega-in-r-brush-and-link.html
+folder: vega-in-r
+series: vega-in-r-series
+weight: 10
+---
+
+We have seen how to concatenate charts; we can now see how to connect them interactively. The steps described here are based on the [Interactive Examples article](https://vegawidget.github.io/altair/articles/interactive.html).
+- To get started, we create a static chart using step 1 of the code below.
+- The second step is to add a selection. Here we add a selection of type `interval`, but a selection can also be `single` or `multi` (see the [altair.selection reference](https://altair-viz.github.io/user_guide/generated/api/altair.selection.html?highlight=selection#altair.selection)). In step 2, we also store the selection in a variable named `brush` and update the static chart by adding the selection to it.
+- Finally, in step 3, we update the color encoding, so that every time the chart is brushed the data inside the selection are colored based on the nominal variable `Missing`, and the rest are colored lightgray.
+
+```R
+data_source_modified_subset = subset(data_source_modified, data_source_modified$Entity != "All natural disasters")
+
+# step1
+domain_color = c("0", "1")
+range_color = c('black', 'red')
+
+chart_static = alt$Chart(data_source_modified_subset)$
+  mark_circle(
+    opacity = 0.8,
+    size = 50
+  )$
+  encode(
+    x = 'Year:O',
+    y = 'Deaths:Q',
+    color = alt$Color('Missing', scale=alt$Scale(domain = domain_color, range = range_color))
+  )$
+  properties(
+    height = 300,
+    width = 600
+  )
+
+# step2
+brush = alt$selection_interval()
+chart_brush = chart_static$add_selection(brush)
+
+# step3
+chart_1 = chart_brush$encode(
+  color = alt$condition(brush, "Missing:N", alt$value("lightgray"))
+)
+```
+
+
+ + +
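+The other selection types mentioned above can be tried on a small stand-in chart. The sketch below is only an illustration: the data frame `toy` and its columns are made up, and it assumes that the `altair` R package is installed and wraps an Altair release that still provides `selection_single()` and `selection_multi()`.
+
+```R
+library("altair")
+
+# hypothetical stand-in data, not the disasters dataset
+toy = data.frame(
+  Year = c(2000, 2001, 2002, 2003),
+  Deaths = c(10, 40, 25, 5),
+  Missing = c("0", "1", "0", "0")
+)
+
+# click to select one point; alt$selection_multi() would allow several points
+click = alt$selection_single()
+
+chart_click = alt$Chart(toy)$
+  mark_circle(size = 100)$
+  encode(
+    x = "Year:O",
+    y = "Deaths:Q",
+    # selected points keep their color, the rest turn lightgray
+    color = alt$condition(click, "Missing:N", alt$value("lightgray"))
+  )$
+  add_selection(click)
+```
+
+Only the selection constructor differs from the interval example, so the conditional color encoding can be reused unchanged.
+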
+
+Now that we have created our brushable chart, we can decrease the size of the first chart so that two charts fit side by side on the screen.
+If we now make a second chart that inherits all the properties of the first chart and only changes the x position encoding to `Entity`, we get the two-way brushable and linkable visualisation below.
+
+```R
+chart_2a = chart_1$properties(width = 300, height = 300)
+chart_2b = chart_2a$encode(x = "Entity:N")
+
+chart_disasters = (chart_2a | chart_2b)
+```
+
+
+ + +
+
+{:.exercise}
+**Exercise** - Make a one-way brushable and linkable chart of the deaths versus time per entity. The interval selection appears in a barchart below. Hint: Check the [altair R gallery of interactive charts](https://vegawidget.github.io/altair/articles/example-gallery-08-interactive-charts.html).
+
+
+
+
+
+
+
+{% include custom/series_vega-in-r_next.html %}
diff --git a/changing-data.md b/changing-data.md
new file mode 100644
index 0000000..18a21e3
--- /dev/null
+++ b/changing-data.md
@@ -0,0 +1,284 @@
+---
+title: Changing data
+keywords: vega-in-r
+sidebar: vega-in-r_sidebar
+permalink: /vega-in-r-changing-data.html
+folder: vega-in-r
+series: vega-in-r-series
+weight: 3
+---
+Let's now use a more realistic example and visualize a dataset included in the `vega_datasets` package [https://github.com/vega/vega-datasets.html](https://github.com/vega/vega-datasets.html).
+
+We can import the vega datasets using the altair library.
+```R
+vega_data = altair::import_vega_data()
+```
+
+Check out the list of the available datasets:
+```R
+vega_data$list_datasets()
+```
+
+and select the one you want to work with. Here, we are using the dataset for [Natural Disasters from Our World in Data](https://ourworldindata.org/natural-disasters.html).
+```R
+data_source = vega_data$disasters()
+```
+
+Alternatively, you may load data from a local file using standard R code, or read the data from a url using:
+```R
+data_source = read.csv(url("https://raw.githubusercontent.com/vega/vega-datasets/master/data/disasters.csv"))
+```
+
+After importing the data, we can take a first look using standard R code:
+
+```R
+str(data_source)
+summary(data_source)
+head(data_source); tail(data_source)
+```
+
+```R
+> str(data_source)
+'data.frame':	803 obs. of  3 variables:
+ $ Entity: Factor w/ 11 levels "All natural disasters",..: 1 1 1 1 1 1 1 1 1 1 ...
+ $ Year  : int  1900 1901 1902 1903 1905 1906 1907 1908 1909 1910 ...
+ $ Deaths: int  1267360 200018 46037 6506 22758 42970 1325641 75033 1511524 148233 ...
+> summary(data_source)
+                   Entity         Year          Deaths
+ All natural disasters:117   Min.   :1900   Min.   :      1
+ Earthquake           :111   1st Qu.:1946   1st Qu.:    270
+ Extreme weather      :111   Median :1975   Median :   1893
+ Flood                : 89   Mean   :1969   Mean   :  81213
+ Landslide            : 79   3rd Qu.:1996   3rd Qu.:  10362
+ Epidemic             : 69   Max.   :2017   Max.   :3706227
+ (Other)              :227
+> head(data_source)
+                 Entity Year  Deaths
+1 All natural disasters 1900 1267360
+2 All natural disasters 1901  200018
+3 All natural disasters 1902   46037
+4 All natural disasters 1903    6506
+5 All natural disasters 1905   22758
+6 All natural disasters 1906   42970
+> tail(data_source)
+      Entity Year Deaths
+798 Wildfire 2012     21
+799 Wildfire 2013     35
+800 Wildfire 2014     16
+801 Wildfire 2015     67
+802 Wildfire 2016     39
+803 Wildfire 2017     75
+```
+
+We can now make an altair R plot similar to the altair Python one at [https://altair-viz.github.io/gallery/natural_disasters.html](https://altair-viz.github.io/gallery/natural_disasters.html).
+For now, we filter the data in R and use the subset of the data to make the chart. In the data transform section we will see how to do the filtering inside the chart specification.
+
+
+ +
+
+
+
+Below is the code to make this plot.
+```R
+data_source_subset = subset(data_source, data_source$Entity != "All natural disasters")
+
+chart_disasters = alt$Chart(data_source_subset)$
+  mark_circle(
+    opacity=0.8,
+    stroke='black',
+    strokeWidth=1
+  )$
+  encode(
+    x = "Year:O",
+    y = "Entity:N",
+    color = "Entity:N",
+    size = "Deaths:Q"
+  )$
+  properties(
+    height=200,
+    width=500
+  )
+```
+Here, the global properties of the circles are specified inside the mark method, while the properties that depend on the data are specified inside the encoding.
+Using the mark type `rect` with the `color` and `opacity` channels, we can make a heatmap plot.
+
+
+```R
+chart_disasters = alt$Chart(data_source_subset)$
+  mark_rect()$
+  encode(
+    x = "Entity:O",
+    y = "Year:O",
+    color = "Entity:N",
+    opacity = 'Deaths:Q'
+  )$
+  properties(
+    height=600,
+    width=200
+  )
+```
+
+
+
+
+
+Next, using the code below, we can make a time series plot of deaths from all natural disasters from 1900 until 2017.
+
+```R
+data_source_subset = subset(data_source, data_source$Entity == "All natural disasters")
+
+chart_disasters = alt$Chart(data_source_subset)$
+  mark_line()$
+  encode(
+    x='Year:Q',
+    y='Deaths:Q',
+    tooltip = c("Year", "Deaths")
+  )$
+  properties(
+    height=300,
+    width=600
+  )
+```
+
+
+
+
+
+{:.exercise}
+**Exercise** - Use the `color` channel to make a time series plot per Entity.
+
+{:.exercise}
+**Exercise** - Change the field types. What is the result?
+
+
+{% include custom/series_vega-in-r_next.html %}
diff --git a/data-transformations.md b/data-transformations.md
new file mode 100644
index 0000000..4a13fe0
--- /dev/null
+++ b/data-transformations.md
@@ -0,0 +1,205 @@
+---
+title: Data Transform
+keywords: vega-in-r
+sidebar: vega-in-r_sidebar
+permalink: /vega-in-r-data-transformations.html
+folder: vega-in-r
+series: vega-in-r-series
+weight: 6
+---
+
+As mentioned in the [data transformations documentation](https://altair-viz.github.io/user_guide/transform/index.html) of altair, in most cases it is suggested to perform transformations outside the chart definition, so in our case using R. Of course, data transforms inside the chart can also be useful in some cases.
+So far, we have been filtering the data in R and then using the modified data in the chart specification. Now, we use `transform_filter()` to subset the data inside the chart. [Here](https://altair-viz.github.io/user_guide/transform/filter.html#user-guide-filter-transform) is the filter transform documentation. We make the line chart we have seen in a previous section using the code below:
+
+```R
+chart_disasters = alt$Chart("https://raw.githubusercontent.com/vega/vega-datasets/master/data/disasters.csv")$
+  mark_line()$
+  encode(
+    x = 'Year:Q',
+    y = 'Deaths:Q',
+    tooltip = c("Year", "Deaths")
+  )$
+  properties(
+    height = 300,
+    width = 600
+  )$
+  transform_filter(
+    alt$FieldEqualPredicate(field = "Entity", equal = "All natural disasters")
+  )
+```
+
+
+ + +
+
+We use field predicates to assess whether a data point satisfies certain conditions. As mentioned in the [field predicates reference](https://altair-viz.github.io/user_guide/transform/filter.html#field-predicates), the `FieldEqualPredicate` evaluates whether a field is equal to a particular value. The field name is the first argument and the value to compare against is the second argument. Go through the field predicates altair documentation and the [vega-lite documentation](https://vega.github.io/vega-lite/docs/predicate.html#field-predicate) and use the `FieldOneOfPredicate` for the exercise below.
+
+
+{:.exercise}
+**Exercise** - Use the filter transform to obtain the data related to volcanic activity and earthquake and make an area chart like the one below.
+
+
+ + +
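+Other field predicates follow the same pattern as `FieldEqualPredicate`. As a minimal sketch (deliberately using a different predicate than the one asked for in the exercise), the code below chains two filters to keep only the rows for all natural disasters between two years. The year bounds 1950 and 2000 are arbitrary illustration values, and the `range` argument is assumed to behave as described in the field predicates reference.
+
+```R
+library("altair")
+
+chart_range = alt$Chart("https://raw.githubusercontent.com/vega/vega-datasets/master/data/disasters.csv")$
+  mark_line()$
+  encode(
+    x = 'Year:Q',
+    y = 'Deaths:Q'
+  )$
+  # first filter: keep a single entity
+  transform_filter(
+    alt$FieldEqualPredicate(field = "Entity", equal = "All natural disasters")
+  )$
+  # second filter: keep years inside an (arbitrary) interval
+  transform_filter(
+    alt$FieldRangePredicate(field = "Year", range = list(1950, 2000))
+  )
+```
+
+Each call to `transform_filter()` adds one more filter step, so the conditions are applied one after the other.
+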
+
+We now also use `transform_window()` to compute and plot a windowed aggregation of the deaths over all available years. [Here](https://altair-viz.github.io/user_guide/transform/window.html#user-guide-window-transform) is the window transform documentation.
+
+
+```R
+chart_disasters = alt$Chart("https://raw.githubusercontent.com/vega/vega-datasets/master/data/disasters.csv")$
+  transform_window(
+    cumulative_count='sum(Deaths)'
+  )$
+  mark_area()$
+  encode(
+    x = 'Year:O',
+    y = 'cumulative_count:Q',
+    tooltip = c("Year:Q", 'cumulative_count:Q')
+  )$transform_filter(
+    alt$FieldEqualPredicate(field = "Entity", equal = "All natural disasters")
+  )$
+  properties(
+    height = 300,
+    width = 600
+  )
+```
+
+
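+Since transformations outside the chart are often preferred, a comparable running total can also be computed in plain R before the data reach the chart. The sketch below uses a made-up four-row data frame (`toy`) rather than the disasters dataset, and only base R.
+
+```R
+# hypothetical miniature of the "All natural disasters" rows
+toy = data.frame(
+  Year = c(1900, 1901, 1902, 1903),
+  Deaths = c(100, 20, 5, 30)
+)
+
+# make sure the rows are in chronological order before accumulating
+toy = toy[order(toy$Year), ]
+
+# running total of deaths, comparable to the windowed sum(Deaths)
+toy$cumulative_count = cumsum(toy$Deaths)
+
+print(toy$cumulative_count)
+# [1] 100 120 125 155
+```
+
+The resulting `cumulative_count` column could then be encoded on the y axis in the same way as the field produced by the window transform.
+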
+
+
+
+
+{% include custom/series_vega-in-r_next.html %}
diff --git a/layered-chart.md b/layered-chart.md
new file mode 100644
index 0000000..6f8b153
--- /dev/null
+++ b/layered-chart.md
@@ -0,0 +1,8000 @@
+---
+title: Layered chart
+keywords: vega-in-r
+sidebar: vega-in-r_sidebar
+permalink: /vega-in-r-layered-chart.html
+folder: vega-in-r
+series: vega-in-r-series
+weight: 7
+---
+
+We now look at how we can combine charts in one canvas in altair R. This section on how to combine charts is based on the [View Composition article](https://vegawidget.github.io/altair/articles/view-composition.html).
+
+First, we discuss a layered chart example. Each chart can be made separately and the charts combined using the `+` operator: `chart_layer_1 + chart_layer_2`.
+Below, we make a chart that consists of a line layer and a point layer. The tooltip should be added to the top layer so that it is visible in the final chart.
+
+```R
+chart_disasters_L1 = alt$Chart("https://raw.githubusercontent.com/vega/vega-datasets/master/data/disasters.csv")$
+  mark_line()$
+  encode(
+    x = 'Year:O',
+    y = 'Deaths:Q'
+  )$
+  transform_filter(
+    alt$FieldEqualPredicate(field = "Entity", equal = "All natural disasters")
+  )$
+  properties(
+    height = 400,
+    width = 600
+  )
+
+chart_disasters_L2 = alt$Chart("https://raw.githubusercontent.com/vega/vega-datasets/master/data/disasters.csv")$
+  mark_point()$
+  encode(
+    x = 'Year:O',
+    y = 'Deaths:Q',
+    tooltip = "Deaths:Q"
+  )$
+  transform_filter(
+    alt$FieldEqualPredicate(field = "Entity", equal = "All natural disasters")
+  )$
+  properties(
+    height = 400,
+    width = 600
+  )
+
+chart_disasters = (chart_disasters_L1 + chart_disasters_L2)
+```
+
+
+
+ + +
+
+To produce the same chart we may also follow the procedure below.
+Use the R code provided above for `chart_disasters_L2` and then make the first layer using:
+
+```R
+chart_disasters_L1 = chart_disasters_L2$mark_line()
+```
+
+Then superimpose the charts as before:
+```R
+chart_disasters = (chart_disasters_L1 + chart_disasters_L2)
+```
+
+{:.exercise}
+**Exercise** - Make a layered chart, which consists of one layer for the bar mark and one for the text mark, to produce the chart below.
+
+
+
+
+
+
+
+{% include custom/series_vega-in-r_next.html %}
diff --git a/setting-things-up.md b/setting-things-up.md
new file mode 100644
index 0000000..41fcc35
--- /dev/null
+++ b/setting-things-up.md
@@ -0,0 +1,60 @@
+---
+title: Setting things up
+keywords: vega-in-r
+sidebar: vega-in-r_sidebar
+permalink: /vega-in-r-setting-things-up.html
+folder: vega-in-r
+series: vega-in-r-series
+weight: 1
+---
+
+To get this tutorial started, we first need to install Anaconda and then activate an r-reticulate environment using conda. We also need to install vega_datasets using pip and, finally, install the R packages reticulate and altair using install.packages() in Rstudio.
+Most of the steps described here are taken from [altair R installation](https://vegawidget.github.io/altair/articles/installation.html). Make sure you are using Python version 3.5 or higher to comply with the system requirements of the altair R package [altair CRAN](https://cran.r-project.org/web/packages/altair/altair.pdf).
+
+First, [install Anaconda](https://www.anaconda.com/distribution/) and, after installing Anaconda, open the Anaconda Prompt.
+Check the conda version and update conda:
+``` C
+conda -V
+conda update conda
+```
+
+Install the vega datasets that will be used in this tutorial:
+``` C
+pip install vega_datasets
+```
+
+Next, create and activate a conda environment called `r-reticulate`:
+``` C
+conda create -n r-reticulate
+conda activate r-reticulate
+```
+
+Open the Rstudio IDE and install reticulate. Then, use the conda environment `r-reticulate`:
+``` R
+install.packages("reticulate")
+reticulate::use_condaenv("r-reticulate")
+```
+
+Restart Rstudio and then install the altair package:
+``` R
+install.packages("altair")
+```
+
+In Rstudio, use the code below to install the Python packages altair and vega_datasets:
+``` R
+altair::install_altair()
+```
+
+Verify the installation using:
+``` R
+altair::check_altair()
+```
+
+If the verification reports no error, we are ready to start!
+
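+If the verification does report a problem, it can help to check which Python installation reticulate has picked up. The lines below are a small diagnostic sketch, assuming the `reticulate` package is installed as described above; the module names are the two Python packages used in this tutorial.
+
+``` R
+library("reticulate")
+
+# show the Python binary and version that reticulate is bound to
+py_config()
+
+# TRUE if the Python module can be found in that environment
+py_module_available("altair")
+py_module_available("vega_datasets")
+```
+
+If either check returns `FALSE`, the Python packages were probably installed into a different environment than the one reticulate is using.
+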
+The procedure described above only needs to be run once. Afterwards, whenever you want to use altair in Rstudio, you only need to call `library("altair")`.
+
+
+
+
+{% include custom/series_vega-in-r_next.html %}
diff --git a/simple-barchart.md b/simple-barchart.md
new file mode 100644
index 0000000..66c9167
--- /dev/null
+++ b/simple-barchart.md
@@ -0,0 +1,193 @@
+---
+title: A simple barchart
+keywords: vega-in-r
+sidebar: vega-in-r_sidebar
+permalink: /vega-in-r-simple-barchart.html
+folder: vega-in-r
+series: vega-in-r-series
+weight: 2
+---
+Here is a very simple barchart defined in altair R.
+
+
+
+
+The dataset used for this chart is:
+
+```R
+Var1 = c("a","b","c","d","e")
+Var2 = c(11, 19, 22, 8, 14)
+Var3 = c("type1","type1","type2","type1","type2")
+dataset = data.frame(Var1, Var2, Var3)
+```
+
+and below is the code to generate it:
+
+
+```R
+chart_1 = alt$Chart(dataset)$
+  mark_bar()$
+  encode(
+    x = "Var1:O",
+    y = "Var2:Q"
+  )$
+  properties(
+    height=200,
+    width=400
+  )
+```
+
+What is the syntax in altair R? It is similar to altair Python, the major difference being that the operator `$` is used to access attributes instead of `.`. There are some other differences between the Python and the R package, described together with examples in the [Field Guide to Python Issues](https://vegawidget.github.io/altair/articles/field-guide-python.html). Below are the most common elements of the chart syntax:
+- We first need to use the object `alt` to access the Altair API and create a chart using `alt$Chart`.
+- The `data` to be visualised is passed to `alt$Chart`.
+- The `mark` used is specified after `mark_`. Fixed values for properties of the marks, for instance a hard-coded size or color, can be specified here.
+- The `encode` method determines the mapping between the channels and the data fields. Only the encoding that depends on the fields is specified here, not fixed values of the marks. Here, `x` and `y` are the position channels. The field type is specified after the field name. `O` stands for ordinal and `Q` for quantitative. Other types are `N` for nominal, `T` for temporal and `G` for geojson. The `x = "Var1:O"` is the short form of `x = alt$X("Var1", type = "ordinal")`. The two forms are equivalent, but the long form is used when making more adjustments inside the encoding. We will see an example in the field transform section.
+- The height and width of the plot are specified inside `properties`.
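+
+The short and long encoding forms described above can be compared on the same toy dataset. The sketch below is an illustration only: the axis titles "Category" and "Value" are made-up values, chosen to show the kind of adjustment the long form allows.
+
+```R
+library("altair")
+
+Var1 = c("a","b","c","d","e")
+Var2 = c(11, 19, 22, 8, 14)
+dataset = data.frame(Var1, Var2)
+
+# long form: same channels and field types as "Var1:O" and "Var2:Q",
+# with an extra axis title on each channel
+chart_long = alt$Chart(dataset)$
+  mark_bar()$
+  encode(
+    x = alt$X("Var1", type = "ordinal", title = "Category"),
+    y = alt$Y("Var2", type = "quantitative", title = "Value")
+  )$
+  properties(
+    height = 200,
+    width = 400
+  )
+```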
+
+For a more detailed reference on Altair classes and functions, see the [API reference](https://altair-viz.github.io/user_guide/API.html).
+
+Now, to display the chart in Rstudio we may use `vegawidget(chart_1)` or `chart_1`. Alternatively, we can also save the chart using:
+```R
+htmlwidgets::saveWidget(vegawidget(chart_1),'chart_1.html')
+```
+and display it in the browser by opening the `chart_1.html` file.
+
+To examine the chart specification in R we can install the package listviewer using `install.packages("listviewer")` and use:
+```R
+vegawidget::vw_examine(chart_1, mode = "code")
+```
+The output is below:
+
+![screenshot_code]({{ site.baseurl }}/assets/screenshot_code.png)
+
+
+{:.exercise}
+**Exercise** - Make yourself comfortable with the basic syntax of the chart in altair R. Use the color channel for `Var3` to make the chart below. Change the height and width of the panel.
+
+
+
+
+
+
+{:.exercise}
+**Exercise** - Visualise the same data, using a point as the mark, change the color for all points to black and visualise Var3 using size. Format the axes.
+
+
+
+{% include custom/series_vega-in-r_next.html %}
diff --git a/simple-interaction.md b/simple-interaction.md
new file mode 100644
index 0000000..ce49ff0
--- /dev/null
+++ b/simple-interaction.md
@@ -0,0 +1,191 @@
+---
+title: Simple Interaction
+keywords: vega-in-r
+sidebar: vega-in-r_sidebar
+permalink: /vega-in-r-simple-interaction.html
+folder: vega-in-r
+series: vega-in-r-series
+weight: 4
+---
+
+One of the main advantages of using the altair package is that it supports the generation of interactive graphics. The code required for adding a simple interaction is relatively short.
+
+## Tooltip
+
+A tooltip can be added to the plot using the `tooltip` argument inside `encode()` (see [altair R tooltips](https://vegawidget.github.io/altair/articles/tooltips.html)). For one variable displayed in the tooltip we can use:
+
+```R
+...
+tooltip = "Variable_1:T"
+...
+```
+
+and for more than one variable, we can use the R function `list()` or `c()` as illustrated below:
+
+```R
+...
+tooltip = c("Variable_1:T", "Variable_2:T")
+...
+```
+
+Mind that if we are importing the data from a url directly in the plot specification, we may need to specify the field type. In the snippets above, "T" is a placeholder for the field type, which may be for instance `O` for ordinal, `Q` for quantitative or `N` for nominal.
+We may also use the long form `alt$Tooltip(field = "Entity", type = "nominal")` and get the same result, or modify the tooltip, for instance by specifying a title, using `alt$Tooltip(field = "Entity", type = "nominal", title = "Disaster")`.
+
+
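+The long form can also be combined with `list()` to show several customised fields in one tooltip. The sketch below is a hedged illustration on a made-up data frame (`toy`), not the disasters dataset, and assumes the `altair` R package is installed.
+
+```R
+library("altair")
+
+# hypothetical stand-in data
+toy = data.frame(
+  Entity = c("Flood", "Flood", "Drought"),
+  Year = c(2000, 2001, 2000),
+  Deaths = c(10, 40, 25)
+)
+
+chart_tooltip = alt$Chart(toy)$
+  mark_point()$
+  encode(
+    x = "Year:O",
+    y = "Deaths:Q",
+    # one long-form entry per field shown in the tooltip
+    tooltip = list(
+      alt$Tooltip(field = "Entity", type = "nominal", title = "Disaster"),
+      alt$Tooltip(field = "Deaths", type = "quantitative")
+    )
+  )
+```
+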
+
+
+{:.exercise}
+**Exercise** - Add a tooltip in the heatmap we created in the previous section, to get the graph illustrated above.
+
+
+## Zooming and Panning
+
+We illustrate two ways of making a graph zoomable and pannable. The first one is by calling the `interactive()` method, as illustrated below:
+
+```R
+chart = alt$Chart(data_source_subset)$
+  .....
+  $interactive()
+```
+
+A second option is to define the selection outside the chart code and then pass it to the `add_selection` method in the chart code.
+This second option uses an interval selection with a scale binding. For more information on the selection types supported in altair you can refer to the [altair.selection_interval reference](https://altair-viz.github.io/user_guide/generated/api/altair.selection_interval.html#altair.selection_interval).
+
+```R
+selection = alt$selection_interval(bind='scales')
+
+chart = alt$Chart(data_source_subset)$
+.....
+$add_selection(
+  selection
+  )
+```
+
+
+
+
+
+{:.exercise}
+**Exercise** - Make the time series plot of all natural disasters interactive, to get the graph illustrated above. Use both ways of making it zoomable and pannable.
+
+
+{% include custom/series_vega-in-r_next.html %}