---
title: "01-Week1.Rmd"
author: "M. Armstrong"
date: "2025-03-19"
output:
bookdown::html_book:
toc: yes
css: toc.css
---
```{r setupq1, include=FALSE}
knitr::opts_chunk$set(comment = "#>", echo = TRUE, fig.width=6)
```
# Week 1- Welcome!
Welcome to Introduction to Genomics in Natural Populations at UC Davis!
This class uses code modified from Data Carpentry's introduction to the shell tutorial, which can be found here: <https://datacarpentry.github.io/shell-genomics/01-introduction.html>. This tutorial is also modified from previous tutorials compiled by Serena Caplins: <https://baylab.github.io/MarineGenomics/>
## Main Objectives
- Take Pre-class assessment (link on syllabus)
- Introduction to genomics & shell computing (via slideshow here: [Week 1 Slides](./figs/W1_introcoding.pdf))
- Learn how to access the terminal via Farm OnDemand
- Learn how to interact with the command line interface and move around in your file system
## How to access the shell via Farm OnDemand
Instructions on how to get your terminal window running are on this PDF and we will walk through this together in class:
https://canvas.ucdavis.edu/courses/978599/files?preview=27151921
## Navigating your file system
The part of the operating system responsible for managing files and directories is called the file system. It organizes our data into files, which hold information, and directories (also called “folders” but we will call them directories in class), which hold files or other directories.
Several commands are frequently used to create, inspect, rename, and delete files and directories.
``` html
$
```
The dollar sign is a prompt, which shows us that the shell is waiting for input; your shell may use a different character as a prompt and may add information before the prompt. When typing commands, either from these lessons or from other sources, do not type the prompt, only the commands that follow it.
Let’s find out where we are by running a command called `pwd` (which stands for “print working directory”). At any moment, our current working directory is our current default directory, i.e., the directory that the computer assumes we want to run commands in, unless we explicitly specify something else. Here, the computer’s response is /home/yourusername, which is your home directory, the top-level directory of your own space on our cloud system. Below I show what it looks like from my end with my username, madarm11:
``` html
$ pwd
```
``` html
/home/madarm11
```
Let’s look at how our file system is organized. We can see what files and subdirectories are in this directory by running `ls`, which stands for “list”. If you have used Farm before, you may see other things in addition to the default directories (which is why mine shows extra entries below).
``` html
$ ls
```
``` html
Desktop    Mambaforge-Linux-x86_64.sh  Public     get-pip.py  miniconda.sh
Documents  Music                       Templates  mambaforge  ondemand
Downloads  Pictures                    Videos     miniconda
```
`ls` prints the names of the files and directories in the current directory in alphabetical order, arranged neatly into columns.
First we will need to navigate to the directory that we will be working in for this class. The command to change locations in our file system is `cd`, followed by the name of the directory we want to move into. `cd` stands for “change directory”.
Our class directory is /group/rbaygrp/eve198-genomics. ALL work for this class will take place inside the eve198-genomics directory. To navigate into eve198-genomics, which is housed in the rbaygrp directory inside the group directory, we can use the following commands:
``` html
$ cd /group
$ cd rbaygrp
$ cd eve198-genomics
```
``` html
$ pwd
```
``` html
/group/rbaygrp/eve198-genomics
```
Use `ls` to see what is inside eve198-genomics:
``` html
$ ls
```
We have a special command to tell the computer to move us back, or up, one directory level. If we want to navigate out of eve198-genomics we can use the following command:
``` html
$ cd ..
```
The two periods after `cd` take us up one directory level. Type `ls` to see what other directories are in the rbaygrp directory.
What happens when you type in just `cd`?
``` html
$ cd
```
We are back to where we started!
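To practice these moves without touching the class directory, the steps above can be replayed in a throwaway directory tree. This is a minimal sketch: `mktemp -d` creates an empty temporary directory and `mkdir -p` builds the nested practice directories (both are additions, not part of this lesson), and the layout under `scratch` is only a stand-in for the real /group tree. The `$` prompt is left out so the lines can be run as a script.

```bash
# Make a temporary practice area (a stand-in for the real /group tree).
scratch=$(mktemp -d)
mkdir -p "$scratch/group/rbaygrp/eve198-genomics"

# Walk down one directory at a time, as in the lesson.
cd "$scratch/group"
cd rbaygrp
cd eve198-genomics
deepest=$(pwd)
echo "After three cd commands: $deepest"

# Two periods mean "the directory above this one".
cd ..
parent=$(pwd)
echo "After cd ..: $parent"

# Leave the practice area and delete it.
cd /
rm -rf "$scratch"
```

The first line printed should end in `group/rbaygrp/eve198-genomics` and the second in `group/rbaygrp`; the part before that depends on where your system keeps temporary directories.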
## Shortcut: Tab Completion
Typing out file or directory names can waste a lot of time and it’s easy to make typing mistakes. Instead we can use tab complete as a shortcut. When you start typing out the name of a directory or file, then hit the Tab key, the shell will try to fill in the rest of the directory or file name.
Since we are already in our home directory, type `cd /gr` and then hit Tab:
``` html
$ cd /gr
```
The shell will fill in the rest of the directory name for `group`.
Now continue into the rbaygrp directory and then the eve198-genomics directory by using tab to fill in the rest of the names:
``` html
$ cd rbayg
$ cd ev
```
Using tab complete can be very helpful. However, it will only autocomplete a file or directory name if you’ve typed enough characters to provide a unique identifier for the file or directory you are trying to access.
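Tab completion itself is interactive, so it cannot be shown in a script, but the idea of a unique identifier can be imitated by counting how many names start with what you have typed so far. The sketch below is an illustration only: the directory `rbaylab` is hypothetical, and the `*` wildcard, `wc -l` (count lines), and `tr` are stand-ins that are not part of this lesson.

```bash
# Practice area with two directories that share the prefix "rbay".
scratch=$(mktemp -d)
mkdir "$scratch/rbaygrp" "$scratch/rbaylab" "$scratch/eve198-genomics"
cd "$scratch"

# "rbay" fits two names, so Tab could not choose between them.
ambiguous=$(ls -d rbay* | wc -l | tr -d ' ')
# "rbayg" fits only one name, so Tab could finish it.
unique=$(ls -d rbayg* | wc -l | tr -d ' ')
echo "Names starting with rbay: $ambiguous"
echo "Names starting with rbayg: $unique"

# Leave the practice area and delete it.
cd /
rm -rf "$scratch"
```

When more than one name fits, pressing Tab a second time in bash usually lists the candidates, so you can see how many more characters you need to type.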
## Creating our own directories
We will now each make our own directory in the eve198-genomics directory so that everyone can run code individually.
To make a new directory, type the command `mkdir` followed by the name of the directory, in this case 'mlarmstrong'. You can use your UC Davis email name or your first initial and last name; just make sure it is specific to you.
``` html
$ mkdir mlarmstrong
```
Check that it's there with ls. As other people make directories those will pop up too!
``` html
$ ls
```
``` html
mlarmstrong week1
```
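In a shared class directory it helps to know what happens when a name is already taken. As a small sketch, the lines below create a directory in a temporary practice area and then try to create it a second time. The name `jdoe` is hypothetical, and `mktemp -d` and the `if` test are additions that are not covered in this lesson.

```bash
# Temporary practice area standing in for the class directory.
scratch=$(mktemp -d)
cd "$scratch"

myname="jdoe"            # hypothetical; use a name specific to you
mkdir "$myname"
listing=$(ls)
echo "Contents after mkdir: $listing"

# mkdir will not replace a directory that already exists; it reports an error instead.
if mkdir "$myname" 2>/dev/null; then
    second="created"
else
    second="already exists"
fi
echo "Second attempt: $second"

# Leave the practice area and delete it.
cd /
rm -rf "$scratch"
```

Because the second `mkdir` fails rather than overwriting anything, picking a name that is already in use will not damage a classmate's directory.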
## Full vs Relative Paths
The `cd` command takes an argument which is a directory name. Directories can be specified using either a relative path or a full absolute path. The directories on the computer are arranged into a hierarchy. The full path tells you where a directory is in that hierarchy.
Navigate to the home directory, then enter the `pwd` command.
``` html
$ cd
$ pwd
```
You should see /home/ followed by your username:
``` html
/home/madarm11
```
This is the full name of your home directory. This tells me that I am in a directory called `madarm11`; yours will be named with your username. This directory sits inside a directory called `home`, which sits inside the very top directory in the hierarchy. The very top of the hierarchy is a directory called `/`, which is usually referred to as the `root` directory. So, to summarize: `madarm11` is a directory in `home`, which is a directory in `/`. More on `root` and `home` in the next section.
Now enter the following command:
``` html
$ cd /group/rbaygrp/eve198-genomics
```
This jumps several levels to the class directory. Now go back to the home directory.
``` html
$ cd
```
But you can't navigate to the class directory using this command:
``` html
$ cd group/rbaygrp/eve198-genomics
```
Why not? The first command uses an absolute path, giving the full address starting from the root directory. The second uses a relative path, giving only the address from the current working directory, and there is no `group` directory inside our home directory. A full path always starts with a /. A relative path does not.
A relative path is like getting directions from someone on the street. They tell you to “go right at the stop sign, and then turn left on Main Street”. That works great if you’re standing there together, but not so well if you’re trying to tell someone how to get there from another country. A full path is like GPS coordinates. It tells you exactly where something is no matter where you are right now. We need a full path here because the `group` directory is not inside our home directory; it sits directly under the root directory, so we reach it with `cd /group`.
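The difference can be tried out safely in a temporary practice tree. This is a sketch under stated assumptions: the `home/student` directory and the layout under `scratch` are hypothetical stand-ins for the real file system, and `mktemp -d`, `mkdir -p`, and the `if` tests are additions not covered in this lesson.

```bash
# Practice tree with a pretend home directory and a pretend class directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/home/student" "$scratch/group/rbaygrp/eve198-genomics"

# Start in the pretend home directory.
cd "$scratch/home/student"

# Relative path: looked up inside the current directory, where no "group" exists.
if cd group/rbaygrp/eve198-genomics 2>/dev/null; then
    relative="worked"
else
    relative="failed"
fi

# Full path: starts with /, so it works no matter where we are.
if cd "$scratch/group/rbaygrp/eve198-genomics"; then
    absolute="worked"
else
    absolute="failed"
fi

# The same relative path works once we start from the directory that holds "group".
cd "$scratch"
if cd group/rbaygrp/eve198-genomics; then
    from_parent="worked"
else
    from_parent="failed"
fi

echo "relative from home: $relative"
echo "full path from home: $absolute"
echo "relative from the directory above group: $from_parent"

# Leave the practice area and delete it.
cd /
rm -rf "$scratch"
```

The relative path is not wrong in itself; it only fails when the working directory is not the place the directions start from.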
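Now navigate inside of your individual directory and run the command `ls`. Because we are back in our home directory, use the full path to get there:
``` html
$ cd /group/rbaygrp/eve198-genomics/mlarmstrong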
$ ls
```
It should be empty because we just created it and haven't put anything in it yet. Next class we will learn to work with files!
Let's look at some of the options for the `ls` command using the `man` command (note this will print out several lines of text). `man` (short for manual) displays detailed documentation (also referred to as a man page or man file) for `bash` commands. It is a powerful resource for exploring bash commands and understanding their usage and flags. Some manual files are very long. You can scroll through the file using your keyboard’s down arrow or use the Space key to go forward one page and the `b` key to go backwards one page. When you are done reading, hit `q` to quit.
```html
$ man ls
```
The `-a` option is short for all; the manual says that it causes `ls` to “not ignore entries starting with .” Files and directories whose names start with a . are hidden from a plain `ls`, so this is the option we want.
```html
$ ls -a
```
You'll see more entries now that the hidden ones are shown, including `.` (the current directory) and `..` (the directory above it).
In most commands the flags can be combined together in no particular order to obtain the desired results/output.
```html
$ ls -Fa
$ ls -laF
```
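To see what each flag changes, the flags can be compared side by side in a temporary practice directory. This is a minimal sketch: the file names are hypothetical, and `mktemp -d` and `touch` (which creates empty files) are additions that are not covered until later.

```bash
# Practice directory with one subdirectory, one regular file, and one hidden file.
scratch=$(mktemp -d)
cd "$scratch"
mkdir data
touch notes.txt .hidden_settings

plain=$(ls)        # the hidden file is left out
all=$(ls -a)       # the hidden file, ., and .. are included
marked=$(ls -F)    # directories are marked with a trailing /
order1=$(ls -aF)   # two flags combined
order2=$(ls -Fa)   # the same two flags in the other order

echo "ls:";    echo "$plain"
echo "ls -a:"; echo "$all"
echo "ls -F:"; echo "$marked"

# Leave the practice area and delete it.
cd /
rm -rf "$scratch"
```

Here `order1` and `order2` hold the same listing, which is what combining flags in no particular order means in practice.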
## Examining the contents of other directories
By default, the `ls` command lists the contents of the working directory (i.e. the directory you are in). You can always find the directory you are in using the `pwd` command. However, you can also give `ls` the names of other directories to view. Navigate to the class directory if you are not already there.
```html
$ cd /group/rbaygrp/eve198-genomics/
```
Then enter the command:
```html
$ ls 'yourdirectory'
data
```
This will list the contents of your directory without you needing to navigate there.
If you want to move around multiple directories without necessarily looking in each one, you can chain these together like so:
```html
$ cd ../../
```
This moves you up two directory levels. If you just want to look in that directory but not move there, you can change the command to:
```html
$ ls ../../
```
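The difference between looking and moving can be checked by recording `pwd` before and after each command. The sketch below uses a temporary practice tree; the directory name `jdoe` is hypothetical, and `mktemp -d`, `mkdir -p`, and `basename` (which prints only the last part of a path) are additions not covered in this lesson.

```bash
# Practice tree: a pretend class directory holding one student directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/eve198-genomics/jdoe/data"
cd "$scratch/eve198-genomics"

# Looking: ls with a directory name lists it without moving us.
before=$(pwd)
contents=$(ls jdoe)
after=$(pwd)
echo "Inside jdoe: $contents"

# From two levels down, ls ../../ looks two levels up without moving.
cd jdoe/data
two_up=$(ls ../../)
still_here=$(basename "$(pwd)")
echo "Two levels up contains: $two_up (we are still in $still_here)"

# Moving: cd ../../ actually takes us there.
cd ../../
landed=$(basename "$(pwd)")
echo "After cd ../../ we are in: $landed"

# Leave the practice area and delete it.
cd /
rm -rf "$scratch"
```

Only the `cd` lines change the working directory; every `ls` line leaves it where it was.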
We now know how to move around our file system using the command line. This gives us an advantage over interacting with the file system through a GUI as it allows us to work on a remote server, carry out the same set of operations on a large number of files quickly, and opens up many opportunities for using bioinformatic software that is only available in command line versions.
## Group Work Activity- Treasure Hunt!
Let's practice moving around in directories. Form a small group with nearby classmates but make sure everyone practices navigating on their own computer.
Start in your home directory. Then navigate to the week1_activity directory in the eve198-genomics directory. Move around directories until you find a directory named with the scientific name of the wolverine (you can google the name)! Once you find it, copy the full path to that directory and screenshot the contents of the directory. Submit both the full path and the screenshot on Canvas under the 'Assignments' tab for 'Week 1: Treasure Hunt'.
## Key Points
- Most commands take options (flags) which begin with a -.
- The shell gives you the ability to work more efficiently by using keyboard commands rather than a GUI.
- Useful commands for navigating your file system include: ls, pwd, and cd.
- Tab completion can reduce errors from mistyping and make work more efficient in the shell.
- Relative paths specify a location starting from the current location, while absolute paths specify a location from the root of the file system.