CS_Notes/index_old.html at master · TPB-Notes/CS_Notes · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Computer Science Notes</title>
<link rel="stylesheet" href="https://stackedit.io/res-min/themes/base.css" />
<script type="text/javascript" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_HTML"></script>
</head>
<body><div class="container"><h1 id="computer-science-notes">Computer Science Notes</h1>


<h2 id="data-structures">Data structures</h2>


<h3 id="data-structure-and-abstract-data-types">Data structure and abstract data types</h3>

<p><strong>Abstract data type</strong> A conceptual model of how data can be stored and operations can be carried out on the data</p>

<p><strong>Data structure</strong> A common format for storing large volumes of related data, which is an implementation of an abstract data type</p>

<p><strong>Array</strong> A set of related data items stored under a single identifier</p>


<h3 id="files">Files</h3>

<p>Two common portable formats are text files and binary files. <br>
A text file is one that contains lines of text, or other human readable characters, where each line is usually referred to as a record. <br>
A text file may also contain a header, which explains the content and structure of the file, and an end of file (EOF) marker so that programs reading the file know when to stop. <br>
Common text file formats are .txt for raw text and .csv for structured data</p>

<p><strong>Text file</strong> A file that contains human-readable characters <br>
<strong>Binary file</strong> A file containing data as a series of binary digits  <br>
<strong>Record</strong> One line of a text file <br>
<strong>Field</strong> An item of data</p>


<h3 id="static-and-dynamic-data-structures">Static and dynamic data structures</h3>

<p><strong>Static</strong> A static data structure stores a set amount of data that is usually defined by the programmer. This is done by allocating a set of memory to the data structure. <br>
Accessing individual elements of data within a static structure is very quick as their position in memory is fixed. However, the data structure will take up memory even if it is not being used.</p>

<p><strong>Dynamic</strong> Dynamic data structures can use more or less memory as needed through the use of a heap. Unused blocks of memory are placed on the heap, which are then usable within a program. <br>
A dynamic data structure is able to take more memory off the heap if it is needed and also pu blocks of unused memory back onto the heap if it is not needed. <br>
This is a more efficient use of resources and a more flexible solution as elements can be added and removed much more easily.  <br>
Stacks, queues, and binary trees are examples are of dynamic structures.</p>

<p><strong>Queue</strong> A data structure where the first item added is the first to be removed <br>
<strong>Stack</strong> A data structure where the last item added is the first item removed <br>
<em>* Static data structure*</em> A method of storing data where the amount of data stored is fixed <br>
<strong>Dynamic data structure</strong> A method of storing data where the amount of data stored will vary as the program is run <br>
<strong>Heap</strong> A pool of unused memory that can be allocated to a dynamic data structure</p>


<h3 id="queues-and-stacks">Queues and stacks</h3>

<p><strong>Pointer</strong> A data item that identifies a particular element in a data structure</p>


<h4 id="implementing-a-stack">Implementing a stack</h4>

<p>To push an item to a stack</p>

<pre><code>def push(item):
    if stackPointer &lt; stackSize:
        stackPointer += 1
        stackArray[stackPointer] = item
    else:
        error "Stack full"
</code></pre>

<p>To pop an item from the stack</p>

<pre><code>def pop():
    if stackPointer &gt; 0:
        return stackArray[stackPointer--]
    else:
        error "Stack is empty"
</code></pre>


<h4 id="uses-of-stacks">Uses of stacks</h4>

<p>Due to the LIFO nature of stacks, they are used anywhere where the last data item is required to be the first out.</p>

<p>Reversing a list <br>
  - Iterate through the list, pushing each item to the stack <br>
  - Clear the list (or create another) <br>
  - Pop each item from the stack into the list</p>


<h4 id="stack-frames">Stack frames</h4>

<p>Stacks can be used to store information about a running program. </p>

<p><strong>Stack frame</strong> A collection of data about a subroutine call <br>
<strong>Call stack</strong> A stack used to store information about active subroutines and functions within a program</p>

<p>Procedure <br>
  - The return address is placed on the stack so that when the function is finished, the program can return to its previous address <br>
  - The function arguments are pushed to the stack- <br>
  - The previous two values, and space for the functions local variables make up a stack frame <br>
  - Once the function has been completed the program returns to the return address which is on the stack</p>

<p>This is the same mechanism used for handling interrupts and exceptions in programs. Interruptions and exceptions are events where hardware or software demand the attention of the processor and cause the current program to stop. This could be an external event such as a power failure.</p>

<p><strong>Interrupt</strong> A signal sent by a device or program to the processor, requiring its attention</p>

<p>When an interrupt happens, special blocks of code called interrupt handlers and exception handlers are loaded into memory and executed. Whilst the new demand is being dealt with the details of the first program are stored on the stack. As soon as the interrupt or exception has been dealt with, the details are taken back off the stack and the first program can carry on wherever it left off.</p>


<h4 id="nesting-and-recursion">Nesting and recursion</h4>

<p><strong>Nesting</strong> The process of putting one statement inside another</p>

<p><strong>Recursion</strong> The process of a subroutine calling itself</p>


<h4 id="queues">Queues</h4>

<p><strong>Queue</strong> A FIFO structure where data leaves in the order that it arrives</p>

<p>A common use of queues is when a peripheral is sending data to the CPU. If the CPU is not capable of dealing with the data immediately the data may be stored temporarily in a queue. <br>
Data being sent from the CPU may also be stored in a queue, for example multiple documents being sent to a printer are stored in a print queue. <br>
Queues which are used in this way are also called buffers.</p>

<p>A queue has two pointers, the front and rear pointers.</p>


<h4 id="linear-circular-and-priority-queues">Linear, circular, and priority queues</h4>

<p><strong>Linear queue</strong> A FIFO structure which is organised as a line, such as in a list</p>

<p><strong>Circular queue</strong> A FIFO structure implemented as a ring where the front and rear pointers can wrap around from the end to the start of the array</p>

<p><strong>Priority queue</strong> A priority queue adds a further element to the queue which is the priority of each item. Where two items have the same priority, they will be handled according to their position in the queue </p>

<p>An example of the use of a priority queue would be the print queue for a network printer, where it might be possible for some users to prioritise their own documents over those of others.</p>


<h4 id="implementing-a-linear-queue">Implementing a linear queue</h4>

<p>A queue is typically made up of a number of data items of the same type. Therefore a common implementation is to use an array.</p>

<p>When the queue is implemented we need to know the following <br>
  - The name of the array <br>
  - The maximum size of the queue <br>
  - Whether the queue is full or empty <br>
  - Where the front of the queue is <br>
  - Where the rear of the queue is</p>

<p>Take the queue below as an example</p>

<table>
<thead>
<tr>
  <th>FP</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>RP</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
</tr>
</thead>
<tbody><tr>
  <td>A</td>
  <td>B</td>
  <td>C</td>
  <td>D</td>
  <td>E</td>
  <td>F</td>
  <td>-</td>
  <td>-</td>
  <td>-</td>
  <td>-</td>
</tr>
</tbody></table>


<p>When the first element of the queue is removed it will look like this</p>

<table>
<thead>
<tr>
  <th>-</th>
  <th>FP</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>RP</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
</tr>
</thead>
<tbody><tr>
  <td>-</td>
  <td>B</td>
  <td>C</td>
  <td>D</td>
  <td>E</td>
  <td>F</td>
  <td>-</td>
  <td>-</td>
  <td>-</td>
  <td>-</td>
</tr>
</tbody></table>


<p>The front pointer has moved to position 1.</p>

<p>Any item added to the queue is added to the rear. <br>
Once an item has been added, the queue will look like this</p>

<table>
<thead>
<tr>
  <th>-</th>
  <th>FP</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>RP</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
</tr>
</thead>
<tbody><tr>
  <td>-</td>
  <td>B</td>
  <td>C</td>
  <td>D</td>
  <td>E</td>
  <td>F</td>
  <td>G</td>
  <td>-</td>
  <td>-</td>
  <td>-</td>
</tr>
</tbody></table>


<p>The rear pointer has moved to the right.</p>

<p>If items are continually added the rear pointer will reach the end of the array and there will be no more available space to add new elements, despite there being space at earlier locations in the array. <br>
The simplest method to address this problem is to keep the front pointer at index 0, and to move elements forward in the array. <br>
However this is slow. As such, a circular queue is more common.</p>


<h4 id="implementing-a-circular-queue">Implementing a circular queue</h4>

<p>A circular queue works in a similar way to that described above, except that the front and rear pointers move when an item is added or removed, making more efficient use of memory.</p>

<p>Take the queue below as an example</p>

<table>
<thead>
<tr>
  <th>FP</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>RP</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
</tr>
</thead>
<tbody><tr>
  <td>A</td>
  <td>B</td>
  <td>C</td>
  <td>D</td>
  <td>E</td>
  <td>F</td>
  <td>-</td>
  <td>-</td>
  <td>-</td>
  <td>-</td>
</tr>
</tbody></table>


<p>If two items are removed, the queue will look like this</p>

<p>Take the queue below as an example</p>

<table>
<thead>
<tr>
  <th>-</th>
  <th>-</th>
  <th>FP</th>
  <th>-</th>
  <th>-</th>
  <th>RP</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
</tr>
</thead>
<tbody><tr>
  <td>-</td>
  <td>-</td>
  <td>C</td>
  <td>D</td>
  <td>E</td>
  <td>F</td>
  <td>-</td>
  <td>-</td>
  <td>-</td>
  <td>-</td>
</tr>
</tbody></table>


<p>Four new items are added to the queue</p>

<table>
<thead>
<tr>
  <th>-</th>
  <th>-</th>
  <th>FP</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>RP</th>
</tr>
</thead>
<tbody><tr>
  <td>-</td>
  <td>-</td>
  <td>C</td>
  <td>D</td>
  <td>E</td>
  <td>F</td>
  <td>G</td>
  <td>H</td>
  <td>I</td>
  <td>J</td>
</tr>
</tbody></table>


<p>The rear pointer is now at the end of the array.</p>

<p>When another item is added, the pointer will wrap around as so</p>

<table>
<thead>
<tr>
  <th>RP</th>
  <th>-</th>
  <th>FP</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
  <th>-</th>
</tr>
</thead>
<tbody><tr>
  <td>K</td>
  <td>-</td>
  <td>C</td>
  <td>D</td>
  <td>E</td>
  <td>F</td>
  <td>G</td>
  <td>H</td>
  <td>I</td>
  <td>J</td>
</tr>
</tbody></table>


<p>Procedure for adding to a circular queue</p>

<pre><code>def add(item):
    rearPointer += 1
    if rearPointer == array.length: rearPointer = 0
    array[rearPointer] = item
</code></pre>

<p>Procedure for removing an item from a circular queue</p>

<pre><code>def remove():
    data = array[frontPointer]
    frontPointer += 1
    if frontPointer ==  array.length: frontPointer = 0
    return data
</code></pre>


<h4 id="implementing-a-priority-queue">Implementing a priority queue</h4>

<p>A priority queue can be implemented using an array by assigning a vale to each element to indicate priority.</p>

<p>There are two possible solutions when using an array.</p>

<p>One option is to use a standard queue, and when items are removed, each element is checked for its priority to identify the next item to be removed.</p>

<p>An alternative is to maintain the queue in priority order, which means that when a new item is added, it is put into the correct position in the queue. Removing items can then be done in the usual way.</p>


<h3 id="graphs-and-trees">Graphs and trees</h3>

<p><strong>Graph</strong> A mathematical structure that models the relationship between pairs of objects</p>

<p><strong>Graph theory</strong> The underlying mathematical principles behind the use of graphs</p>

<p><strong>Arc</strong> A join or relationship between two nodes. Also known as an edge</p>

<p><strong>Vertex</strong> An object in the graph. Also known as a node</p>

<p><strong>Weighted graph</strong> A graph that has a data value labelled on each edge</p>

<p>A graph could be used to model the relationship between two places and how they are connected via transport links. <br>
A weighted graph could be created by adding values to each arc, in this case travel times.</p>

<p><strong>Undirected graph</strong> A graph where the relationship between vertices is two-way</p>

<p><strong>Directed graph</strong> A graph where the relationship between vertices is one-way</p>


<h4 id="use-of-graphs">Use of graphs</h4>

<ul>
<li>Human networks- In a social network humans may be represented as vertices, and their relationships represented by the arcs between them</li>
<li>Transport networks- All transport works on the basis of a departure point, arrival point, and route. The departure and arrival points form the vertices and the routes form the edges. There are several applications for graph theory, including calculating quickest routes, planning timetables, scheduling and organising staff.</li>
<li>The Internet and web: It is possible to ‘map’ the Internet or the World Wide Web using graph theory. In the case of the Internet, each connected device is a vertex with the physical connection forming the edge.</li>
<li>Computer science- Latency is a key factor in communication networks. Graph theory can be used to calculate the quickest path to send data around a microprocessor where each vertex is a component and the edges are the buses that carry the data.</li>
<li>Medical research- Understanding how diseases spread is critical to their prevention. For example, if studying the spread of a flu virus, each case of flu could be a node, or more likely each location where there has been an outbreak would be a vertex. The edges would be the distance between locations. A weighted graph could be used to analyse the extent of outbreaks in particular locations and how much that then spreads between vertices.</li>
<li>Project management- Any kind of large scale projects can be modelled using a graph. This might be an engineering, construction, or IT project. In this case the vertices would be each of the actions needed to complete the project and the edges would be the relationships and dependencies which exist between the tasks.</li>
<li>Game theory- This is used in wars and conflicts to try to understand the causes of conflicts and predict the likely actions that people might take for different strategies.</li>
</ul>

<p><strong>Latency</strong> The time delay that occurs when transmitting data between devices</p>


<h4 id="adjacency-list">Adjacency list</h4>

<p>A graph is an example of an abstract data type. We need to represent it in a way that can be stored and manipulated by the computer.</p>

<p><strong>Adjacency list</strong> A data structure that stores a list of nodes with their adjacent nodes</p>

<p>In an adjacency list the value of each vertex is stored, along with the vertices that they are next to.</p>

<p>In an undirected graph all adjacencies are shown as this is a two way relationship. <br>
In a directed graph only one vertex in each pairing with have the adjacent vertex, as the relationship is one-way.</p>

<p>A weighted graph stores the value of each edge after each adjacent vertex.</p>


<h4 id="adjacency-matrix">Adjacency matrix</h4>

<p>An adjacency matrix uses a two dimensional array or grid populated with 1s and 0s.</p>

<p><strong>Adjacency matrix</strong> A data structure set up as a two dimensional array or grid that shows whether there is an edge between each pair of nodes</p>

<p>For an undirected graph the grid is populated with 1s and 0s. <br>
For a directed graph, there are 1s where there is a one-way relationship between two vertices and 0 where there isn’t. <br>
Following the same process as the undirected graph, the weight of the arc is used rather than a 1, and infinity is used rather than 0 for non-existent edges.</p>


<h4 id="comparison-of-adjacency-list-and-adjacency-matrix">Comparison of adjacency list and adjacency matrix</h4>

<table>
<thead>
<tr>
  <th>Adjacency list</th>
  <th>Adjacency matrix</th>
</tr>
</thead>
<tbody><tr>
  <td>Only stores data where there is an edge, so requires less memory</td>
  <td>Stores a value for each combination of node, and therefore uses more memory</td>
</tr>
<tr>
  <td>The list has to be parsed to identify whether particular adjacencies exist, increasing the time taken to process the data</td>
  <td>Adjacencies can be identified more quickly as every combination is already stored, effectively forming a look up table</td>
</tr>
<tr>
  <td>Where there are not many edges, forming a sparse graph, this method would be more suitable</td>
  <td>Where there are many edges, forming a dense graph, this method would be more suitable</td>
</tr>
</tbody></table>


<h4 id="trees">Trees</h4>

<p>A tree is an abstract data structure that is very similar to a graph in that it has nodes and edges. It is called a tree because it is visualised as a hierarchical structure with branches. Trees can have a root, with all other nodes branching away from the root.</p>

<p><strong>Tree</strong> A data structure similar to a graph, with no loops</p>

<p><strong>Root</strong> The starting node in a rooted tree structure from which all other nodes branch off</p>

<p><strong>Child</strong> A node in a tree that has nodes above it in the heirarchy</p>

<p><strong>Leaf</strong> A node that does not have any other nodes beneath it</p>

<p>Trees have a number of uses <br>
* Can be used to store data that has an inherent heirarchical structure. For example storing directories <br>
* Are dynamic, meaning that it is easy to add and delete nodes <br>
* Are easy to search and use standard traversal algorithms <br>
* Can be used to process the syntax of statements in natural and programming languages</p>


<h4 id="binary-search-trees">Binary search trees</h4>

<p>A common implementation of a tree is a binary search tree. <br>
This a directed and rooted tree, which can have no more than two branches off each nde and is commonly used to store data that are input in a random order. The nature of the structure means that data are automatically sorted as they are entered and that it can be ‘traversed’ in order to search for and extract data from it.</p>

<p>The first item of data to be stored is the root node. Any subsequent data is dealt with by the following routine</p>

<ul>
<li>If the value of the new data item is less than the value in the current node then branch left, otherwise branch right</li>
<li>Keep repeating this process until you come to an ‘empty’ branch. Place the new value at the end of this branch</li>
</ul>

<p><strong>Binary tree</strong> A tree where each node can only have up to two child nodes attached to it</p>

<p>A possible implementation of a binary tree is to use three arrays. The first stores the data itself, the second stores which node the left branch from a node, and the right copes with branches to the right.</p>

<p>Procedure for adding to a non-empty tree</p>

<pre><code>def add(item):
    nodeCount = nodes.length
    while nodes[nodeCount] != null:
        nodeCount++
    nodes[nodeCount] = item

   currentNode = 1
   while nodes[currentNode] != null:
        if item &lt; nodes[currentNode]:
            if left[currentNode] == 0:
                left[currentNode] = nodeCount
            currentNode = left[currentNode]
        else:
            if right[currentNode] = 0:
                right[currentNode] = nodeCount
            currentNode = right[currentNode]
</code></pre>


<h3 id="hash-tables-and-dictionaries">Hash tables and dictionaries</h3>


<h4 id="hash-tables">Hash tables</h4>

<p><strong>Hash table</strong> A data structure that stores key/value pairs based on an index calculated from an algorithm</p>

<p>A hash table is a data structure with two parts. A hash table has a table (or array) of data and a key, which is used to identify the location of the data within the table. <br>
A hashing algorithm is carried out on they key, which then cats as an index to the specific location of that data item within the array</p>

<p><strong>Hashing algorithm</strong> An algorithm which creates a unique index from given items of key data</p>


<h4 id="use-of-hashing-algorithms">Use of hashing algorithms</h4>

<p>Hashing algorithms are used in the following areas</p>

<ul>
<li>Databases- To create indices for databases</li>
<li>Memory addressing- Used to generate memoery addresses where data will be stored</li>
<li>Operating systems- Some operating systems use has tables to store and locate the executable files of its programs and utilities</li>
<li>Encryption- Hashing algorithms can be used in encryption, provided that they cannot be reverse engineered</li>
<li>Checksums- A value calculated from a set of data to be transmitted, allowing the transmitted data to be checked against the original data</li>
<li>Programming- Compilers may use hash tables to index keywords and identifiers</li>
</ul>


<h4 id="hashing-algorithms">Hashing algorithms</h4>

<ul>
<li>A numeric value needs to be generated from the data in order to perform the calculation. </li>
<li>Unique indices must be generated</li>
<li>The indices need to be uniformly distributed</li>
<li>The algorithm must be capable of producing a large enough range of indices to encompass the entire dataset</li>
</ul>

<p><strong>Collision</strong> When the algorithm produces the same index for two or more different keys</p>

<p><strong>Clustering</strong> When a hashing algorithm produces indices which are not randomly distributed</p>

<p><strong>Load factor</strong> The ratio of how many indices are available to how many there are in total</p>


<h4 id="collisions">Collisions</h4>

<p>When a collision occurs there must be a method of assigning a unique index to the key.</p>

<p><strong>Index</strong> A location where values will be stored, calculated from the key</p>

<p><strong>Chaining</strong> A technique for generating a unique index when there is a collision by adding the key/value to a list stored at the same index</p>

<p><strong>Rehashing</strong> The process of running the hashing algorithm again, or another algorithm, when a collision occurs</p>


<h4 id="dictionaries">Dictionaries</h4>

<p><strong>Dictionary</strong> A data structure that maps keys to data</p>

<p><strong>Associative array</strong> A two-dimensional structure containing key/value pairs of data</p>


<h4 id="vectors">Vectors</h4>

<p>Vectors can be implemented as values in a list</p>


<h4 id="representing-vectors-as-a-function">Representing vectors as a function</h4>

<p>A function is a construct which maps an input to an output</p>

<p>If F is the function to create a vector, S is the complete set of values which F can be applied to and R is the complete set of potential outputs of the function, then</p>


<p><script type="math/tex; mode=display" id="MathJax-Element-1137">F:S \to R</script></p>


<h4 id="representing-vectors-as-arrows">Representing vectors as arrows</h4>

<p><strong>Magnitude</strong> One of the two components of a vector, referring to its size</p>

<p><strong>Direction</strong> One of the two components of a vector</p>

<p><strong>Components</strong> The values within a vector</p>


<h2 id="the-internet">The Internet</h2>


<h3 id="the-internet-and-the-world-wide-web">The Internet and the world wide web</h3>

<p><strong>Internet</strong> A global network of networks <br>
<strong>Uniform Resource Locator (URL)</strong> A method of identifying the location of resources on the Internet</p>

<p>A URL contains a protocol, a server being addressed, an organisation name, the organisation type, a country code, and the name of the file to be accessed. <br>
The protocol can be http, https, or ftp <br>
The domain name, which is the location of the resource on the Internet <br>
The filename to locate the specific file needed. If the file is located within a subdirectory on the server, a pathname will give be given instead of just the filename</p>

<p><strong>Domain name</strong> The recognisable name of a domain on the Internet</p>

<p>Required suffixes</p>

<ul>
<li>.com Commercial organisation</li>
<li>.gov Indicates a governmental organization</li>
<li>.ac Academic institution</li>
<li>.sch School</li>
<li>.org An organisation other than a commercial business</li>
<li>.net A company providing Internet services</li>
</ul>

<p>When WWW is typed, the domain name is known as a fully qualified domain name, and is completely unambiguous as it can only relate to one host. However, it is generally note necessary to specify the host.</p>

<p><strong>Internet protocol address</strong> A unique number that identifies devices on a network</p>

<p>An IP address is a dotted quad number that identifies every computer that sends or receives data on a network and on the Internet. <br>
The protocol used to transmit data (TCP/IP) can only work with numbers. Therefore, every domain name is mapped to a number. this number is the real Internet address and identifies the computer that is transmitting or receiving data. It is called an IP address because it uses the Internet Protocol.</p>

<p>A domain name is sometimes described as a proxy for the IP address which means that the user types in a domain name which is transferred to a domain name server (DNS) which then translates the name into the IP address. An analogy could be the ability to store numbers on a mobile phone. The user selects a name from the list which is then looked at to find and dial the actual telephone number.</p>

<p>Some Ip addresses are classed as private or non-routable addresses. Typically these are the IP addresses used by devices on a private network perhaps in a home, school, or business. The IP address is needed in order to route data around the network, however it does not need to be made public as as that device is not directly connected to the Internet. It is hidden behind a router or firewall. Not-routable addresses only have to be unique within the LAN and therefore do not need to be allocated on a global basis.</p>

<p>When connection to the internet is required, the device will be connected to a router or proxy server in order to connect. in this case the IP address of that router or server needs to be a public or routable address in which case it becomes a unique address that is registered under the domain name system.</p>


<h3 id="ports">Ports</h3>

<p><strong>Port</strong> Used to identify a particular process or application on a network <br>
<strong>POP3</strong> A protocol (set of rules) for receiving emails</p>

<p>Port addresses are often used to run processes for common networking tasks and many have been assigned port numbers that are in widespread use. For example, port 25 is used for the SMTP application that checks for incoming email on an emails server. Port 110 is used for the POP3 application that fetches email from the email server.</p>

<p>There are around 250 well known ports, which are used to launch various processes, many of which are applications related to other protocols, such as FTP, DHCP, and SSH.</p>

<ul>
<li>21 FTP</li>
<li>22 SSH</li>
<li>23 Telnet</li>
<li>25 SMTP</li>
<li>53 DNS</li>
<li>80 HTTP</li>
<li>110 POP3</li>
<li>143 IMAP</li>
</ul>

<p>When a client sends a request to the server using a well-known port, the server needs to respond back to a client port and not the well known port on the client side. For example, if a server receives a request on port 80, it does not send it back to port 80 on the client. Therefore as part of the client request, a source port must also be sent so tat the server knows which port to send it back to.</p>

<p><strong>Secure Shell protocol (SSH)</strong> A protocol for remote access to computers</p>


<h3 id="network-address-translation">Network address translation</h3>

<p>The system that is used to match the private IP addresses with the public ones is called NAT. This has two main advantages. One is that a unique IP address is not needed for every single device on a network, only on the router or server that is physically connected. This means that only the public IP address needs to be registered with the DNS system. The second advantage is that there is an increased level of security as the private IP address is not being broadcast over the Internet, making that device more secure from unauthorised access.</p>

<p><strong>Domain name server (DNS)</strong> A system of connected domain name servers that provide the IP address of every website on the Internet</p>

<p>The router will track connections and maintain listings of the mappings between private IP addresses and port numbers and the corresponding public address. It does this by adding entries to a translation table which acts as a port lookup between the internal IP address and the external IP address.</p>

<p>The following is a common way that NAT can work when a workstation on the internal network wants to load data from a server on the Internet.</p>

<ul>
<li>The workstation on the internal network sends a packet to the server on the Internet to request some data, including its own internal IP address and port number in the packet so that the server knows where to return the data to</li>
<li>The router replaces the internal IP address and the port number in the packet with its external IP address and a port number that it generates. This port number will be unique to this communication, within a certain time frame.</li>
<li>The router stores the mapping information from the internal IP address and port number to external port number in the translation table</li>
<li>Data sent back from the server will be received by the router which will look up the port number in the translation table to identify which machine on the the internal network sent the request to the server</li>
<li>The router’s IP address in the packet will be replaced with the originating workstation’s IP address and port number read from the translation table</li>
<li>The reply packet can then be sent to the originating workstation</li>
<li>Where the translation table does not contain a match to a port number in a packet received from a computer on the Internet, the packet is dropped because it is not a packet sent in response to a request from within the network, and may be a hacking attempt</li>
<li>The internal IP address and port are never made public on the external network</li>
</ul>


<h3 id="port-forwarding">Port forwarding</h3>

<p>Port forwarding is commonly used when a server inside a private network, with a non-routable IP address, is used to provide services to clients on the Internet. As the server as a non-routable IP address, it cannot be accessed directly from the Internet. Therefore the client on the Internet must use the public IP address of the router that connects the private network with the server on it to initiate a connection. <br>
This router can be programmed so that requests sent to it on a particular port number are forwarded to a device with a specific IP address within the network. This is port forwarding.</p>

<p><strong>Port forwarding</strong> A method of routing data through additional ports</p>


<h3 id="sockets">Sockets</h3>

<p>A network socket is an endpoint of a communication flow across a computer network. Sockets are created in software, not hardware. A TCP/IP socket is made up from the combination of an IP address and a port number. When a computer needs to communicate with a server it will send a request to the server using the server’s IP address and port number for that type of request.</p>

<p><strong>Socket</strong> An endpoint of a communication flow across a computer network</p>

<p>Sockets can be created at any time to enable a network connection to be established to or from a computer. </p>


<h3 id="subnet-masking">Subnet masking</h3>

<p>IP addresses are split into a network identifier and a host identifier. The network could may be a local network, or it could be a remote network and the device could be a computer, printer, or router. Network Ids can also be written with zeros in the parts of the IP address that would be used to identify the host.</p>

<p><strong>Subnet masking</strong> A method of dividing a network into multiple smaller networks</p>

<p>Addresses are split up this way to make networks easier to manage and and to make it more efficient when routing data. For example, in a LAn split across two building, the administrator may find it sueful to allocate IP addresses according to which building a particular computer is in. Where a network is separated in this way, each part is known as a subnet. <br>
Data sent to a particular computer will only travel around the parts of the network that it needs to, making the network more efficient.</p>

<p>When a computer on a network sends data to another computer, it needs to identify whether it is on the same subnet as the other computer. If it is, it can send data to it directly. If it is not, it will send data to the relevant router or gateway, which will in turn send the data to the correct subnet and computer.</p>

<p><strong>Gateway</strong> A node on a network that acts as a connection point to another network with different protocols</p>

<p>In an organisation, a gateway may be used to connect two different companies together. For a home user, a gateway may be used by their ISP to provide access to the Internet.</p>

<p>The gateway carries out all of the protocol conversion required to enable the two networks to work together.</p>

<p>To identify whether the destination computer is on the same subnet, the sending computer needs to look at the network portion of the destinatoin IP address to see if it is the same as its own. To do this, a subnet mask is used. Each device on the subnet is programmed with the same subnet mask. Within the subnet mask, a value of 1 is assigned to all the bits that are part of the network ID, and 0 to all of the parts which identify the host.  <br>
The computer will AND the full IP address of the sending computer, and AND its own full IP address, to calculate their network IDs. If they network IDs are equal, data can be sent directly between them.</p>


<h3 id="ip-address-v4-and-v6">IP address v4 and v6</h3>

<p>As IP v4 was exhausted, it was replaced by IP v6 which uses 128 bits represented as eight groups of four hex numbers separated by colons.</p>

<p>this massively increases the range of numbers available as there are more digits in the number, and hex is being used allowing for a greater range within each group of numbers.</p>


<h3 id="dynamic-host-configuration-protocol-dhcp">Dynamic host configuration protocol (DHCP)</h3>

<p>IP addresses are defined as either static or dynamic. Static IP addresses are ones that are assigned and then never change. Dynamic IP addresses are allocated every time a device connects to a network and this is perhaps the most common approach. <br>
The allocation is done automatically by an application as you log on. </p>

<p><strong>DHCP</strong> A set of rules for allocating locally unique IP addresses to devices as they connect to a network</p>

<p>The application looks for an available IP address from its poll of addresses and allocates it to a device. Where there are hundreds of users logging on and off a network all the time, this is a very efficient system as it means the administrator does not have to do it manually.</p>

<p>A dedicated DHCP server is used on the network and handles the requests by managing a pool of available</p></div></body>
</html>