<!DOCTYPE html>
<html lang="en">
<head>
<!-- Hey there! You're looking at the source code! Which is awesome! -->
<!--
( ) ( ) )
) ( ) ( (
( ) ( ) )
_____________
<_____________> ___
| |/ _ \
| | | |
| |_| |
___| |\___/
/ \___________/ \
\_____________________/
You should always sneak some kind of ASCII art into your HTML header.
Fully 0.1% of your site's visitors will appreciate it!
-->
<title>Build A The Web</title>
<meta name="description" content="We're going to learn you how to build a the web" />
<meta name="author" content="Cube Drone" /> <!-- it me -->
<!-- I wrote a whole essay in here about UTF-8. Did you read it? -->
<meta charset="UTF-8"/>
<link href="https://fonts.googleapis.com/css?family=Averia+Serif+Libre:700" rel="stylesheet">
<!-- The CSS stylesheet for the page is located at 'style.css' -->
<link rel="stylesheet preload" type="text/css" media="(min-width: 1000px)" href="style.css">
<link rel="stylesheet" type="text/css" media="(max-width: 1000px)" href="mobile.css" />
<!-- It's a happy icon for the page. :) -->
<link rel="shortcut icon" href="favicon.ico" type="image/x-icon">
<!-- This Javascript converts straight-quotes into proper directional quotes -->
<script src="js/smartquotes.js"></script>
<script>smartquotes()</script>
</head>
<body>
<!--
these DIVs are empty now, but the Javascript that produces the Table of Contents will fill
them with content.
-->
<div id="toc-bar">
</div>
<div id="toc-full">
</div>
<!-- IT BEGINS -->
<div class="main" id="content">
<h1> Build A The Web </h1>
<h2> Introduction </h2>
<!-- One of the things we'll notice in this book is that I try to address us as 'we'
instead of addressing you as 'you'. -->
<!-- It's not "you're going to build a the web", it's "we're going to build a the web" -->
<p>
This is <strong>Build A The Web</strong>, a book devoted to taking us, people with
basic programming and computer knowledge, and upgrading us into web programmers.
I am <a href="https://twitter.com/cube_drone">Cube Drone</a> and I'll be your host.
</p>
<div class='left-aside-arrow'> </div>
<aside class='left-aside aside'>
<h4>It Me</h4>
<img src="images/classam.png">
</aside>
<p>
The best kind of textbook about
the web is made out of <abbr title="HyperText Markup Language">HTML</abbr>
and filled with media and links to other content. That is what
the web is <em>all about</em>, not dusty ol' textbooks that
are no longer valid because they cover a version of the web that
went obsolete nine years ago.
</p>
<p>
This book was last updated <strong class='book_last_update'>May 1, 2020</strong>.
If that's a <em>long time ago</em>, I might have died. Avenge me!
</p>
<p>
<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a>
<br />This work is licensed under a
<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.
</p>
<div class='right-aside-arrow'> </div>
<aside class='right-aside aside'>
<h4>tl;dr</h4>
<p>
You can share it, you can remix it, but you can't profit off of it.
</p>
</aside>
<p>
This site, and all of the resources used to generate this site, including class materials
and examples, are available at
<a href="https://github.com/cube-drone/buildatheweb">https://github.com/cube-drone/buildatheweb</a>.
That's also a good place to comment, suggest updates, file bugs, and generally
improve the book if you feel that it needs some attention.
</p>
<div class='left-aside-arrow'> </div>
<aside class='left-aside aside'>
<p>
There are a few images included for comedic effect, here, that I do not own,
and thus cannot make available under the same license as the rest of the work.
</p>
<!-- tl;dr: I do not own Seinfeld or The Simpsons for some reason -->
<p>
The copyright implications of "a screenshot of someone else's work" are still
pretty legally fuzzy.
</p>
<p>
The copyright implications of "a screenshot from an episode of The Simpsons"
are even more so.
</p>
</aside>
<h3>What are we going to learn?</h3>
<p>
Becoming a web developer involves a lot of skills! This handy <a href="https://github.com/kamranahmedse/developer-roadmap">Web Developer Roadmap</a>
lays them out in a way that's pretty sensible, and we're going to try our best to carve a path through many of these skills that
leaves us ready to build some cool websites.
</p>
<div class='right-aside-arrow'> </div>
<aside class='right-aside aside'>
<h4>Halp</h4>
<p>
I'm starting this book assuming that we already know how to program a little bit!
</p>
<p>
To learn to program, I recommend this <em>other</em> open textbook:
<a href='http://openbookproject.net/thinkcs/python/english3e/'>How To Think Like a Computer Scientist</a>.
</p>
<p>
If you can't code, you can still make it as far as the chapter on JavaScript before you're going
to start to get in over your head.
</p>
<p> (the chapter on JavaScript has been cancelled) </p>
</aside>
<p>
<strong>Okay, let's get started. Are you ready?</strong>
</p>
<h2> Chapter 1: Request, Response </h2>
<p>
Our first chapter is going to focus on what happens behind the scenes when we make
a web request. What that means is a crash course in computer networking.
</p>
<div class='right-aside-arrow'> </div>
<aside class='right-aside aside'>
<h4>Crash Course</h4>
<img src="images/chapter1/crash_course.png">
<p>
For a crash course in everything else in Computer Science, there's always
<a href='https://www.youtube.com/watch?v=tpIctyqH29Q&list=PL8dPuuaLjXtNlUrzyH5r6jN9ulIgZBpdo&index=1'>Crash Course Computer Science</a>.
</p>
</aside>
<h3> Uniform Resource Locators </h3>
<p>
Let's start by looking at what happens when we crack open a web browser and
type in:
</p>
<pre class='code'>https://en.wikipedia.org/w/index.php?title=Blinkenlights&action=edit#Etymology</pre>
<p> This is a <abbr title="Uniform Resource Locator">URL</abbr>, which stands for <strong>Uniform Resource Locator</strong>.</p>
<div class='left-aside-arrow'> </div>
<aside class='left-aside aside'>
<h4>URI</h4>
<p>
Some people might also call this a
<abbr title="Uniform Resource Identifier">URI</abbr>, but if we encounter a person who does this,
I recommend that we shun them socially until they stop. </p>
</aside>
<p>
This <abbr title="Uniform Resource Locator">URL</abbr> uniquely identifies a document somewhere on someone else's
computer that we are going to request from that computer.
</p>
<div class='right-aside-arrow'> </div>
<aside class='right-aside aside'>
<h4>Abbreviations</h4>
<p>
If you hold your mouse over an abbreviation like
<abbr title="Uniform Resource Locator">URL</abbr> for a few seconds,
the definition of that term should pop up.
</p>
</aside>
<p>
This divides into protocol, domain name, path, parameters,
fragment, locus, and spindle. Memorize all of these terms now.
</p>
<div class='figure'>
<h3> Dissecting a URL </h3>
<h4>Protocol</h4>
<p class='code'>
<strong>https</strong>://en.wikipedia.org/w/index.php?title=Blinkenlights&action=edit#Etymology
</p>
<p>
The protocol describes <strong>how</strong> to connect.
</p>
<h4>Domain Name</h4>
<p class='code'>
https://<strong>en.wikipedia.org</strong>/w/index.php?title=Blinkenlights&action=edit#Etymology
</p>
<p>
The domain name describes <strong>where</strong> to connect to.
</p>
<h4>Path</h4>
<p class='code'>
https://en.wikipedia.org<strong>/w/index.php</strong>?title=Blinkenlights&action=edit#Etymology
</p>
<p>
The path describes <strong>what</strong> is being requested.
</p>
<h4>Parameters</h4>
<p class='code'>
https://en.wikipedia.org/w/index.php<strong>?title=Blinkenlights&action=edit</strong>#Etymology
</p>
<p>
The parameters describe <strong>extra arguments</strong> for the thing being requested.
</p>
<h4>Fragment</h4>
<p class='code'>
https://en.wikipedia.org/w/index.php?title=Blinkenlights&action=edit<strong>#Etymology</strong>
</p>
<p>
The fragment describes a specific <strong>part</strong> of the document that
we want to look at.
</p>
<h4>Locus & Spindle</h4>
<p class='code'>
the locus and spindle were in your <strong>heart</strong> the entire time
</p>
</div>
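<p>
We don't have to do this dissection by hand. As a minimal sketch, assuming we have Python handy,
the standard library's <code>urllib.parse</code> module can split our example
<abbr title="Uniform Resource Locator">URL</abbr> into the same pieces. (Python calls the protocol
the "scheme" and the parameters the "query".)
</p>

```python
from urllib.parse import urlsplit, parse_qs

url = "https://en.wikipedia.org/w/index.php?title=Blinkenlights&action=edit#Etymology"
parts = urlsplit(url)

print(parts.scheme)    # protocol:    https
print(parts.hostname)  # domain name: en.wikipedia.org
print(parts.path)      # path:        /w/index.php
print(parts.query)     # parameters:  title=Blinkenlights&action=edit
print(parts.fragment)  # fragment:    Etymology

# The parameters are themselves a list of key=value pairs.
params = parse_qs(parts.query)
print(params)          # {'title': ['Blinkenlights'], 'action': ['edit']}
```

<p>
Sadly, there is no attribute for the locus or the spindle.
</p>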
<p>This is just a bunch of words, describing where we think our document is. How do we actually
<strong>get</strong> that document?</p>
<h3> Transmission Control Protocol / Internet Protocol </h3>
<p> Communicating with a far away computer is a process fraught with interesting problems. Problems like: </p>
<ul>
<li>How do we make sure that messages reliably arrive?</li>
<li>How do we make sure that messages arrive in the right order?</li>
<li>How does our computer communicate with our router?</li>
<li>How does our router communicate with our modem?</li>
<li>How do we get messages to travel across a thin strip of copper, or fiberoptic cable, at all?</li>
</ul>
<p>
Most of these problems are quite complicated — and the lower level we get, the more likely it is
that we'll need to consult an electrical engineer to explain signal processing theory to us.
Believe me, that is the <em>last</em> thing that we want.
</p>
<div class='right-aside-arrow'> </div>
<aside class='right-aside aside'>
<img src="images/chapter1/math.gif">
</aside>
<p>
Fortunately, smart people have already solved most of these problems for us.
The solutions to these problems stack up on top of one another: at the bottom, electrical engineers
figuring out how to send messages across wires, at the top, math PhDs figuring out how to make sure
that messages arrive reliably in a fixed order.
</p>
<p>
We have two protocols at the very top of the stack that define how we communicate between computers —
<abbr title="Internet Protocol">IP</abbr>,
the <strong>Internet Protocol</strong>, which defines how we send messages across the network,
and <abbr title="Transmission Control Protocol">TCP</abbr>, the <strong>Transmission Control Protocol</strong>, which makes sure that
our messages arrive completely, in the right order, and uncorrupted.
</p>
<div class='left-aside-arrow'> </div>
<aside class='left-aside aside'>
<h4>UDP</h4>
<p>
<abbr title="Transmission Control Protocol">TCP</abbr> is often presented in relation to
<abbr title="User Datagram Protocol">UDP</abbr>, the User Datagram Protocol, which is a protocol
that is <em>also</em> built on top of the Internet Protocol.
</p>
<p>
<abbr title="User Datagram Protocol">UDP</abbr> doesn't worry about
error correction the way that TCP does — it's commonly used for applications like voice where
any extra overhead could hurt the performance of the underlying application.
</p>
</aside>
<h3> Protocol & Stack</h3>
<p>
I'm going to say the word <strong>protocol</strong> a lot, and it's
probably important that I establish what that means. In Computing Science,
we learn the difference between an algorithm and a program —
an algorithm describes a specific way of solving a problem,
whereas a program is the actual code that we need to run the algorithm.
We could have five different programs, all implementing the same algorithm.
</p>
<p> <strong>A protocol is an algorithm for communication</strong>. It delineates the rules of communication. </p>
<div class='right-aside-arrow'> </div>
<aside class='right-aside aside'>
<img src="images/chapter1/c3po.gif">
<p>
<strong>C-3PO</strong> was a protocol droid, because he was programmed with all of the rules
for communicating with the various different cultures in the Galaxy Far Far Away.
</p>
<p> Canonically, C-3PO was also <em>not very good at this</em>. </p>
</aside>
<p>
In the same way that a program is an implementation of an algorithm, a
<strong>stack</strong> is an implementation of a protocol.
A protocol is an algorithm, and a stack is a program to implement a protocol.
So, in order to run
<abbr title="Transmission Control Protocol/Internet Protocol">TCP/IP</abbr>,
our computer runs the
<abbr title="Transmission Control Protocol/Internet Protocol">TCP/IP</abbr> stack,
which implements the
<abbr title="Transmission Control Protocol/Internet Protocol">TCP/IP</abbr> protocol. </p>
<div class='left-aside-arrow'> </div>
<aside class='left-aside aside'>
<p>
It has come to my attention that computer people use the word "stack" too often.
</p>
<p>
I keep trying to advance "pile" as an alternative, but it's not taking off.
</p>
</aside>
<h3> IP Address & Sockets </h3>
<p>
The abstraction presented by <abbr title="Transmission Control Protocol/Internet Protocol">TCP/IP</abbr>
is simple: every computer has an <abbr title="Internet Protocol">IP</abbr> address.
An <abbr title="Internet Protocol">IP</abbr> address looks like this: <code>192.0.2.0</code> — or, like this:
<code>2001:DB80:c501:17ef:a063:a37f:3803:5c1a</code>. These are just
identifiers that communicate a unique identity for the computer in question.
</p>
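<p>
Both of those strings are just different ways of writing down a very large number.
Here's a small sketch, assuming Python's standard <code>ipaddress</code> module, that parses the two
example addresses above and shows which version of the Internet Protocol each one belongs to.
</p>

```python
import ipaddress

# The two example addresses from the text above.
v4 = ipaddress.ip_address("192.0.2.0")
v6 = ipaddress.ip_address("2001:DB80:c501:17ef:a063:a37f:3803:5c1a")

print(v4.version, v4.max_prefixlen)  # 4 32   (an IPv4 address is 32 bits long)
print(v6.version, v6.max_prefixlen)  # 6 128  (an IPv6 address is 128 bits long)

# Underneath the dots, an address is just an integer.
print(int(v4))                       # 3221225984
```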
<p>
If we know our <abbr title="Internet Protocol">IP</abbr> address, and the
<abbr title="Internet Protocol">IP</abbr> address of the computer that we want to
communicate with, and that other computer is online, we can trust that
<abbr title="Internet Protocol">IP</abbr> will
get the message to that computer. We can use
<abbr title="Transmission Control Protocol">TCP</abbr> to open a <strong>socket</strong> to a specific
<strong>port</strong> on our target computer.
</p>
<div class='figure'>
<h3>Giant Walls of Plugs</h3>
<img src="images/ports.png" alt="A giant wall of plugs"/>
<p>
Imagine these computers like giant walls of plugs — or ports — and when we open a connection,
<abbr title="Transmission Control Protocol">TCP</abbr>
creates a two-way communication link between two ports with a socket on each end.
</p>
</div>
<p>
There are thousands of these ports — they're numbered from 1 to 65,535. In order to keep things tidy,
each different protocol that runs on top of
<abbr title="Transmission Control Protocol">TCP</abbr> usually runs on a different port.
Of course, most computers aren't communicating on all of these ports at once —
in fact, there are six ports that, on most computers, get more use than all
of the rest of them combined:
</p>
<ul>
<li><strong>25</strong>, for the Simple Mail Transfer Protocol (<abbr title="Simple Mail Transfer Protocol">SMTP</abbr>) </li>
<li><strong>53</strong>, for the Domain Name System (<abbr title="Domain Name System">DNS</abbr>) </li>
<li><strong>67</strong> and <strong>68</strong>, for the Dynamic Host Configuration Protocol (<abbr title="Dynamic Host Configuration Protocol">DHCP</abbr>) </li>
<li><strong>80</strong>, for the HyperText Transfer Protocol (<abbr title="HyperText Transfer Protocol">HTTP</abbr>)</li>
<li><strong>443</strong>, for the HyperText Transfer Protocol over Transport Layer Security (<abbr title="HyperText Transfer Protocol over Transport Layer Security">HTTPS</abbr>)</li>
</ul>
<p> We're going to cover all of these protocols in detail at one point or another. <small><em>editor's note: no, we're not</em></small></p>
<p> For every port that we could contact on a remote computer, there's a program on the remote
computer that's running, listening, and potentially responding to our requests. </p>
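<p>
To make "a program listening on a port" a little less abstract, here's a rough sketch in Python.
It assumes nothing beyond the standard <code>socket</code> and <code>threading</code> modules, and it
never leaves our own machine: both the listening program and the connecting program use the loopback address
<code>127.0.0.1</code>. The shouting-back-in-capitals behaviour is made up for the example; it isn't any real protocol.
</p>

```python
import socket
import threading

def serve_once(server):
    """Accept a single connection, read one message, and reply in upper case."""
    conn, _addr = server.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

# Listen on the loopback address. Port 0 asks the operating system
# to pick any free port for us.
server = socket.create_server(("127.0.0.1", 0))
port = server.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()

# The "client" opens a socket to that port and exchanges one message.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello, server")
    reply = client.recv(1024)

worker.join()
server.close()
print(reply)
```

<p>
A real server does the same dance, except that it listens on a well-known port, stays up
all of the time, and handles more than one connection before giving up.
</p>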
<h4> Server vs. Client </h4>
<p>
A computer that stays connected to the internet all of the time and responds to these
requests is called a server, and server programming is half of the battle of web programming.
The second half is client programming, which describes the parts
of the transaction that occur on the customer's side of things.
</p>
<div class='right-aside-arrow'> </div>
<aside class='right-aside small-aside aside'>
<p>
The third half of the battle is <strong>lasers</strong>.
</p>
</aside>
<h3> DHCP </h3>
<p>
In order to communicate with a remote server, then, we need three things —
our <abbr title="Internet Protocol">IP</abbr> address, their <abbr title="Internet Protocol">IP</abbr> address,
and a port number to communicate with.
</p>
<p>
First of all, let's talk about how we got <strong>our <abbr title="Internet Protocol">IP</abbr> Address</strong>.
</p>
<p>
Well, the short answer is, our computer already knows its <abbr title="Internet Protocol">IP</abbr> address. We just ask.
</p>
<p>
How did our computer get its <abbr title="Internet Protocol">IP</abbr> address?
Well, when we connected it to the router —
either via <abbr title="Wireless Local Area Networking">WiFi</abbr>
or by plugging it in — it communicated to the router, using
<abbr title="Dynamic Host Configuration Protocol">DHCP</abbr>, the <strong>Dynamic Host Configuration Protocol</strong>, where
our computer asked the router to assign it an <abbr title="Internet Protocol">IP</abbr> Address.
</p>
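<p>
The bookkeeping on the router's side can be sketched in a few lines. This is a toy, not an implementation of
<abbr title="Dynamic Host Configuration Protocol">DHCP</abbr>: the class name <code>ToyDhcpServer</code>,
the network <code>192.168.0.0/29</code>, and the hardware addresses are all invented for illustration,
and the real protocol involves a longer back-and-forth plus leases that expire.
</p>

```python
import ipaddress

class ToyDhcpServer:
    """Hands out addresses from a private network, one per device."""

    def __init__(self, network):
        # Every usable address in the network starts out free.
        self.free = list(ipaddress.ip_network(network).hosts())
        # The router keeps the first address for itself.
        self.router_ip = self.free.pop(0)
        # Which device (by hardware address) has which IP address.
        self.leases = {}

    def request(self, hardware_address):
        # A device that asks twice gets the same answer twice.
        if hardware_address not in self.leases:
            self.leases[hardware_address] = self.free.pop(0)
        return self.leases[hardware_address]

router = ToyDhcpServer("192.168.0.0/29")
laptop = router.request("aa:aa:aa:aa:aa:01")
phone = router.request("aa:aa:aa:aa:aa:02")

print(router.router_ip)  # 192.168.0.1
print(laptop)            # 192.168.0.2
print(phone)             # 192.168.0.3
```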
<p>
Then, how did the router that gave us an
<abbr title="Internet Protocol">IP</abbr> address
get its own <abbr title="Internet Protocol">IP</abbr> address?
Well, when we connected the router to the
internet, either by plugging it in to a modem or by plugging it in to another link in
the network, it also communicated to a router, and communicated using
<abbr title="Dynamic Host Configuration Protocol">DHCP</abbr>.
It asked
<em>"What is my IP?"</em>.
</p>
<div class='left-aside-arrow'> </div>
<aside class='left-aside small-aside aside'>
<p>
How did that router get its <abbr title="Internet Protocol">IP</abbr>? Magic. It was magic.
</p>
</aside>
<h3>Network Address Translation</h3>
<p>
If we ask our computer to tell us its
<abbr title="Internet Protocol">IP</abbr> address, it'll probably report something
that starts with <code>10.0</code> or <code>192.168</code> — but, if we go to
Google and ask <em>"what is my IP?"</em>,
it'll tell us a completely different
<abbr title="Internet Protocol">IP</abbr> address.
</p>
<p>
What gives? How can our computer have more than one
<abbr title="Internet Protocol">IP</abbr> address?
</p>
<p>
As part of our deal with our <abbr title="Internet Service Providers">ISPs</abbr>,
we usually get just one <abbr title="Internet Protocol">IP</abbr> address.
Just the one. Presumably, we have more than one device in our home —
a computer, a cell phone,
a laptop, a second computer, a smart <abbr title="Television">TV</abbr>,
a toaster that connects to the internet for some reason,
a third computer, a toothbrush that connects to the internet for some reason,
things have really gone out of control lately.
</p>
<p>
All of these devices need to share the one <abbr title="Internet Protocol">IP</abbr> address, so, our router creates a little private
network, just for us, in our home. In this private network, any computer can have any
<abbr title="Internet Protocol">IP</abbr> address
that it wants. By convention, the <abbr title="Internet Protocol">IP</abbr>
addresses for use in private networks start with <code>192.168</code>
or <code>10.0</code>. Then, when we're connecting to the outside world, our router translates
our IP address in the private network into our public
<abbr title="Internet Protocol">IP</abbr> address.
</p>
<p>
This is <abbr title="Network Address Translation">NAT</abbr>,
<strong>Network Address Translation</strong>.
</p>
<div class='figure'>
<p>
The router creates a private network for all of our devices and
assigns them local <abbr title="Internet Protocol">IP</abbr> addresses.
</p>
<img src='images/nat_1.png' alt="a diagram of a private network">
<p>
When our devices make requests to the internet, the router
translates them into the public-facing <abbr title="Internet Protocol">IP</abbr> address.
</p>
<img src='images/nat_2.png' alt="a diagram of a request being made through the router">
<p>
When the internet responds, the router remembers who requested
the content and forwards the response back to that
<abbr title="Internet Protocol">IP</abbr> address.
</p>
<img src='images/nat_3.png' alt="a diagram of the router returning the request's response to the source computer">
</div>
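<p>
How does the router "remember who requested the content"? Typically with a lookup table keyed by port number.
The following is a simplified sketch under that assumption; <code>ToyNat</code>, the starting port
<code>40000</code>, and every address in it are hypothetical.
</p>

```python
class ToyNat:
    """A pretend router that shares one public IP address between devices."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.next_port = 40000
        self.table = {}  # public port -> (private ip, private port)

    def outbound(self, private_ip, private_port):
        """A device sends a request: rewrite its source address."""
        public_port = self.next_port
        self.next_port += 1
        self.table[public_port] = (private_ip, private_port)
        return (self.public_ip, public_port)

    def inbound(self, public_port):
        """A response arrives: look up which device it belongs to."""
        return self.table.get(public_port)

nat = ToyNat("203.0.113.7")
laptop_seen_as = nat.outbound("192.168.0.2", 51000)
phone_seen_as = nat.outbound("192.168.0.3", 51000)

print(laptop_seen_as)      # ('203.0.113.7', 40000)
print(phone_seen_as)       # ('203.0.113.7', 40001)
print(nat.inbound(40001))  # ('192.168.0.3', 51000)
print(nat.inbound(12345))  # None: nobody asked for this, so it goes nowhere
```

<p>
From the outside, the laptop and the phone look like the same computer using two different ports.
</p>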
<h3>Domain Name System</h3>
<p>
In order to communicate with a server, we need both our own
<abbr title="Internet Protocol">IP</abbr> address and the
<abbr title="Internet Protocol">IP</abbr> address
of the computer that we want to communicate with.
</p>
<p>
We have our own <abbr title="Internet Protocol">IP</abbr> address —
now we need to find the address of the server that we
want to talk to.
</p>
<p> Let's look at the link we're trying to access. </p>
<p class='code'>
https://<strong>en.wikipedia.org</strong>/w/index.php?title=Blinkenlights&action=edit#Etymology
</p>
<p>
There's no <abbr title="Internet Protocol">IP</abbr> address anywhere in this link. There <strong>is</strong> a domain name,
<code>en.wikipedia.org</code>
</p>
<p>
In order to find the <abbr title="Internet Protocol">IP</abbr> address for this server, we're going to have to start by consulting
a <abbr title="Domain Name System">DNS</abbr> server.
<abbr title="Domain Name System">DNS</abbr> stands for <strong>Domain Name System</strong>,
and the process of converting a domain name into an
<abbr title="Internet Protocol">IP</abbr> address is called
<strong>name resolution</strong>.
</p>
<p>
How do we know where the
<abbr title="Domain Name System">DNS</abbr> server is? When we use
<abbr title="Dynamic Host Configuration Protocol">DHCP</abbr> to connect to
<abbr title="Wireless Local Area Networking">WiFi</abbr>, it
also provides us with the
<abbr title="Internet Protocol">IP</abbr> address of the nearest
<abbr title="Domain Name System">DNS</abbr> server, which is usually being maintained
by our
<abbr title="Internet Service Provider">ISP</abbr>. Acronyms!
</p>
<div class='figure'>
<h3>Turkish Protesters</h3>
<img src="images/chapter1/turkish_dns.jpg" alt="Turkish protestors have written Google's DNS information, in spraypaint, on a building">
<p>
Google also maintains a public
<abbr title="Domain Name System">DNS</abbr>
server at the address <code>8.8.8.8</code>, which is good to know
in case our local
<abbr title="Domain Name System">DNS</abbr> server ever goes down or is
<a href="https://mic.com/articles/85987/turkish-protesters-are-spray-painting-8-8-8-8-and-8-8-4-4-on-walls-here-s-what-it-means#.x2HgXBndh">interfered with by a totalitarian government</a>.
</p>
</div>
<p>
So our computer sends a request to the
<abbr title="Domain Name System">DNS</abbr> server,
asking where to find <code>en.wikipedia.org</code>.
</p>
<p>
If the server already knows where <code>en.wikipedia.org</code> is, then it responds with the
IP address. Let's imagine, though, that the server doesn't know.
</p>
<div class='right-aside-arrow'> </div>
<aside class='right-aside small-aside aside'>
<h3>howdns.works</h3>
<a href="https://howdns.works/"><img src="images/dns_root.png" alt="an example image from howdns.works"></a>
<p> The best way to understand
<abbr title="Domain Name System">DNS</abbr> is to visit
<a href='https://howdns.works/'>howdns.works</a>,
where they illustrate the protocol with fun cartoons. </p>
<p> The second best way is by watching this <a href="https://www.youtube.com/watch?v=Rck3BALhI5c">fast-talking con-man</a>
describe it in detail.
</p>
</aside>
<p>
<abbr title="Domain Name System">DNS</abbr> Root servers are distributed all over the globe, and they keep track of
exactly one thing: the <abbr title="Internet Protocol">IP</abbr> addresses of the computers responsible for the
recordkeeping of top level domains, like
<ul>
<li><code>.com</code></li>
<li><code>.net</code></li>
<li><code>.org</code></li>
<li><code>.photo</code></li>
<li><code>.click</code></li>
<li><code>.ninja</code></li>
<li><code>.unicorn</code></li>
<li><code>.fun</code></li>
<li><code>.ooo</code></li>
<li><code>.plumbing</code></li>
<li><code>.oh my god top level domains are just getting dumber and dumber</code></li>
</ul>
</p>
<p>
So, the <abbr title="Domain Name System">DNS</abbr> server looks at the domain name we've given it —
<code>en.wikipedia.org</code> —
sends a request to the root server, and asks
<em>"which servers can I ask about .org records?"</em>
</p>
<p>
The root server will reply with a list of <abbr title="Internet Protocol">IP</abbr> addresses responsible
for .org records. These are the addresses of <strong>Top Level Domain Servers</strong>,
which are maintained by <strong>Domain Registries</strong>.
Names in those records are sold through <strong>Domain Registrars</strong>:
we can pay these people about ten <abbr title="United States">US</abbr> dollars a year to
create and maintain a record for us,
so long as nobody else has claimed that domain name already.
For a pittance, I now own <a href="http://lassam.net">http://lassam.net</a>.
</p>
<div class='left-aside-arrow'> </div>
<aside class='left-aside aside'>
<p>
Some people claim that the <strong>most important tool in web programming</strong>
is a command line or a text editor,
but I will maintain that it is a <strong>credit card</strong>.
</p>
</aside>
<p>
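The .org server, in turn, points our <abbr title="Domain Name System">DNS</abbr> server at Wikipedia's own nameservers.
Finally, the <abbr title="Domain Name System">DNS</abbr> server queries the Wikipedia nameservers, asking them
where it can find <code>en</code>.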
If we paid a registrar to put up a domain name for us,
they'll usually throw in a nameserver for free —
all we have to do is write some <abbr title="Domain Name System">DNS</abbr>
rules that tell the nameserver what
<abbr title="Internet Protocol">IP</abbr> address we want to point at.
</p>
<p>
<abbr title="Domain Name System">DNS</abbr> rules are written in a cryptic language that contains records
with names like <code>A</code>, <code>MX</code>, <code>AAAA</code>,
and <code>AAAAAAAAAAAAAAAAHH SPIDERS</code> —
I'm sorry, there was a spider next to the keyboard.
</p>
<p>
So, Wikipedia's nameservers report that en.wikipedia.org is
located at, say, <code>203.0.113.98</code>.
Finally, after that entire protracted process, we know where Wikipedia is.
</p>
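<p>
The whole chain of questions can be mocked up with a few dictionaries. To be clear, this is a toy:
the server names ending in <code>.example</code> are invented, the tables stand in for real servers scattered
around the world, and the only "network" involved is a dictionary lookup.
</p>

```python
# Toy data: every server name and address here is made up.
ROOT = {"org": "tld-server.example"}
TLD = {"tld-server.example": {"wikipedia.org": "ns.wikipedia.example"}}
AUTHORITATIVE = {"ns.wikipedia.example": {"en.wikipedia.org": "203.0.113.98"}}

CACHE = {}

def resolve(name):
    """Walk from the root, to the top level domain, to the nameserver."""
    # If we already know the answer, skip the whole protracted process.
    if name in CACHE:
        return CACHE[name]

    labels = name.split(".")
    top_level = labels[-1]          # "org"
    domain = ".".join(labels[-2:])  # "wikipedia.org"

    tld_server = ROOT[top_level]          # "which servers know about .org?"
    nameserver = TLD[tld_server][domain]  # "who keeps wikipedia.org's records?"
    address = AUTHORITATIVE[nameserver][name]  # "where is en?"

    CACHE[name] = address
    return address

address = resolve("en.wikipedia.org")
print(address)  # 203.0.113.98
```

<p>
In a real program we'd let the operating system do all of this for us, for example with Python's
<code>socket.getaddrinfo("en.wikipedia.org", 443)</code>, which needs a working internet connection.
</p>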
<div class='right-aside-arrow'> </div>
<aside class='right-aside small-aside aside'>
<p>
Because I wrote this in the <em>past</em>, that's bound
to change and has definitely changed already.
</p>
</aside>
<h3>HyperText Transfer Protocol</h3>
<p>
Our next step is to use <abbr title="Transmission Control Protocol/Internet Protocol">TCP/IP</abbr> to create a connection between our
<abbr title="Internet Protocol">IP</abbr> address,
and the <abbr title="Internet Protocol">IP</abbr> address that we just resolved from
<abbr title="Domain Name System">DNS</abbr>.
</p>
<p>
There's only one thing left that we need — a port number.
We also didn't specify a port number as part of the
<abbr title="Uniform Resource Locator">URL</abbr>,
but we did specify a protocol,
<abbr title="HyperText Transfer Protocol over Transport Layer Security">HTTPS</abbr> —
the <strong>HyperText Transfer Protocol feat. Transport Layer Security</strong> —
and when we specify a protocol without a port number,
our connection automatically goes to the default port for that protocol.
In the case of <abbr title="HyperText Transfer Protocol over Transport Layer Security">HTTPS</abbr>,
that's <strong>443</strong>.
</p>
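<p>
That "use the default unless told otherwise" rule is short enough to write down. A minimal sketch,
assuming a hypothetical helper called <code>port_for</code> and a table covering only the two
protocols we care about right now:
</p>

```python
from urllib.parse import urlsplit

# Default ports for the two protocols discussed in this chapter.
DEFAULT_PORTS = {"http": 80, "https": 443}

def port_for(url):
    """Return the port a browser would connect to for this URL."""
    parts = urlsplit(url)
    # An explicit port in the URL always wins.
    if parts.port is not None:
        return parts.port
    # Otherwise, fall back to the default for the protocol.
    return DEFAULT_PORTS[parts.scheme]

print(port_for("https://en.wikipedia.org/w/index.php"))  # 443
print(port_for("http://en.wikipedia.org/"))              # 80
print(port_for("http://localhost:8080/"))                # 8080
```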
<p>
The <strong>HyperText Transfer Protocol (<abbr title="HyperText Transfer Protocol">HTTP</abbr>)</strong> is the protocol responsible
for moving documents around. Request a document? Get a document.
The rules for that are laid out in the HyperText Transfer Protocol,
which is the protocol that powers pretty much the entire web as we know it.
</p>
<p>
Our <abbr title="Uniform Resource Locator">URL</abbr>'s
protocol is <abbr title="HyperText Transfer Protocol over Transport Layer Security">HTTPS</abbr>, though, not just
<abbr title="HyperText Transfer Protocol">HTTP</abbr>. The difference is slight
but important —
<abbr title="HyperText Transfer Protocol over Transport Layer Security">HTTPS</abbr>
is the same as
<abbr title="HyperText Transfer Protocol">HTTP</abbr>, but over a connection encrypted
with <strong>Transport Layer Security
(<abbr title="Transport Layer Security">TLS</abbr>)</strong>.
This prevents J. Random Hacker from watching every
<abbr title="HyperText Transfer Protocol">HTTP</abbr> request that goes by.
</p>
<div class='right-aside-arrow'> </div>
<aside class='right-aside aside'>
<p>
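Transport Layer Security encrypts traffic between our browser and the server,
though the server itself still sees everything in the clear. It's a close relative of
<a href="https://www.youtube.com/watch?v=jkV1KEJGKRA">End to End Encryption</a>.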
</p>
</aside>
<div class='left-aside-arrow'> </div>
<aside class='left-aside aside'>
<p>
<strong>J. Random Hacker</strong> is an arbitrary
programmer.
</p>
<blockquote cite="http://www.catb.org/jargon/html/appendixb.html">
A mythical figure like the Unknown Soldier; the archetypal hacker nerd.
<a class='citation' href="http://www.catb.org/jargon/html/appendixb.html">The Jargon File</a>
</blockquote>
</aside>
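<p>
Here's roughly what "HTTP over an encrypted connection" looks like in Python's standard
library: open an ordinary TCP connection, then wrap it in TLS. Treat this as a sketch under
assumptions; <code>open_tls_connection</code> is a made-up helper name, and it's only defined
here, not called, because calling it needs a live network connection.
</p>

```python
import socket
import ssl


def open_tls_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection to host:port, then wrap it in TLS.

    Hypothetical helper for illustration; calling it requires network access.
    """
    context = ssl.create_default_context()  # checks certificates and hostnames
    raw_socket = socket.create_connection((host, port), timeout=10)
    return context.wrap_socket(raw_socket, server_hostname=host)


# The default context refuses servers that can't prove who they are.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
print(context.check_hostname)  # True
```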
<p>
Once we've created this encrypted communication path between our computer
and the faraway server, we need to construct an
<abbr title="HyperText Transfer Protocol">HTTP</abbr> Request. It'll look
something like this:
</p>
<pre class='code'>
GET /w/index.php?title=Blinkenlights&action=edit HTTP/1.1
Host: en.wikipedia.org</pre>
<p>
This is a request to <code>GET</code> whatever's at the path and query string
of the <abbr title="Uniform Resource Locator">URL</abbr> we provided to our browser.
The fragment (<code>#Etymology</code>) never leaves the browser, so it doesn't appear in the request.
</p>
<p>
It also includes <strong>Headers</strong> with the request — sets of key
and value that communicate extra information to the server. In this case,
the only header we've included is "Host".
</p>
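<p>
To make the shape of a request concrete, here's a small Python sketch that turns a
<abbr title="Uniform Resource Locator">URL</abbr> into the text of a <code>GET</code> request.
The function name <code>build_get_request</code> is invented for this example, the
<abbr title="Uniform Resource Locator">URL</abbr> is reconstructed from the request shown above,
and a real browser sends many more headers than this.
</p>

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Build a minimal HTTP/1.1 GET request for the given URL (illustrative only)."""
    parts = urlsplit(url)
    # The request target is the path plus the query string.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # The fragment (the part after '#') is used only by the browser, so it is left out.
    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {parts.netloc}",
        "",  # a blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


request = build_get_request(
    "https://en.wikipedia.org/w/index.php?title=Blinkenlights&action=edit#Etymology"
)
print(request)
```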
<p>
The server will receive this request, and respond with an
<abbr title="HyperText Transfer Protocol">HTTP</abbr> Response
containing the sweet webpage we've been looking for this entire time.
</p>
<p>
Virtually everything in web programming happens in the space between
the <abbr title="HyperText Transfer Protocol">HTTP</abbr> request and the
<abbr title="HyperText Transfer Protocol">HTTP</abbr> response. Figuring out how to respond,
quickly,
with the right stuff is the meat and potatoes of web programming.
This bit, right here. It's all the marbles. Empires have risen
and fallen, all dependent on the simple gap of how a server converts
this <abbr title="HyperText Transfer Protocol">HTTP</abbr> request into an
<abbr title="HyperText Transfer Protocol">HTTP</abbr> response.
</p>
<p>
And then, Wikipedia responds. The full
<abbr title="HyperText Transfer Protocol">HTTP</abbr> response is several
pages long; we can look at it <a href='wikipedia_http_response'>here</a>.
In order to keep my book neat and tidy, though, I'm going to
concoct a fake response for the sake of example:
</p>
<pre class='code'>
HTTP/1.1 200 OK
Content-language: en
Content-type: text/html; charset=UTF-8
X-Clacks-Overhead: GNU Terry Pratchett

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Definitely Wikipedia</title>
</head>
<body>
<h1>This is totally Wikipedia.</h1>
<p> Hi there. I am Bob Wikipedia and you are at my website.
It’s still under construction but I am pretty sure it will be done by 1998.</p>
<img src="https://media0.giphy.com/media/K5Yn9JCXcrXr2/giphy.gif">
</body>
</html></pre>
<p>
It opens with the version of
<abbr title="HyperText Transfer Protocol">HTTP</abbr> that the server is speaking,
as well as an <strong><abbr title="HyperText Transfer Protocol">HTTP</abbr> Status Code</strong>.
So long as the <abbr title="HyperText Transfer Protocol">HTTP</abbr> Status Code is <code>200 OK</code> we're good to go.
</p>
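<p>
Status codes come in families, grouped by their first digit. As a rough sketch, Python's
standard library already knows the official phrases; the <code>describe_status</code> helper
and its <code>CATEGORIES</code> table are names made up for this example.
</p>

```python
from http import HTTPStatus

# The first digit of a status code tells us which family it belongs to.
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code: int) -> str:
    """Return the code, its standard phrase, and its family (illustrative helper)."""
    phrase = HTTPStatus(code).phrase
    return f"{code} {phrase} ({CATEGORIES[code // 100]})"


print(describe_status(200))  # 200 OK (success)
print(describe_status(404))  # 404 Not Found (client error)
```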
<div class='figure'>
<h3>A Brief And Mostly Inaccurate Reference Chart for HTTP Status Codes</h3>
<img src="images/247.png" alt="a comic containing HTTP status codes">
<p>
A more accurate look at <abbr title="HyperText Transfer Protocol">HTTP</abbr> status codes can be found
<a href="https://en.wikipedia.org/wiki/List_of_HTTP_status_codes">here</a>.
</p>
<p>
Alternatively, pictures of cats matching every
<abbr title="HyperText Transfer Protocol">HTTP</abbr> status code exist
at <a href="http://http.cat">http.cat</a>.
</p>
</div>
<p>
After the
<abbr title="HyperText Transfer Protocol">HTTP</abbr>
version and status code, there are <strong>Headers</strong>
again. These headers describe important properties of the file that's
been returned.
</p>
<pre class='code'>
Content-language: en
Content-type: text/html; charset=UTF-8
X-Clacks-Overhead: GNU Terry Pratchett</pre>
<p>
Most headers are defined in the <a href='https://tools.ietf.org/html/rfc2616'>protocol itself</a>,
but we can add
any headers that we want. By long-standing convention, custom headers start with <code>X-</code>
(a habit that RFC 6648 has since deprecated, though it lives on),
which is how we can sneak in <a href='http://www.gnuterrypratchett.com/'>the clacks</a>.
</p>
<p>
After the headers, we get into a big patch
of <abbr title="HyperText Markup Language">HTML</abbr>.
How do we know that it's
<abbr title="HyperText Markup Language">HTML</abbr>? Well, the <code>Content-type</code>
header referred to this as <code>text/html</code>, so we can be
pretty sure that we've got a big handful of
<abbr title="HyperText Markup Language">HTML</abbr>.
</p>
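<p>
Pulling a response apart into its three pieces (status line, headers, body) can be sketched
like this. It's a simplified illustration that assumes a small, well-formed, text-only response;
<code>parse_response</code> is a made-up name, and real clients handle many more edge cases.
</p>

```python
def parse_response(raw: str):
    """Split a raw HTTP response into version, status code, reason, headers, and body."""
    # A blank line separates the status line and headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


# A trimmed-down version of the fake response from this chapter.
raw = (
    "HTTP/1.1 200 OK\r\n"
    "Content-language: en\r\n"
    "Content-type: text/html; charset=UTF-8\r\n"
    "X-Clacks-Overhead: GNU Terry Pratchett\r\n"
    "\r\n"
    "<!DOCTYPE html><html><body><h1>This is totally Wikipedia.</h1></body></html>"
)

version, code, reason, headers, body = parse_response(raw)
print(code, reason)  # 200 OK
print(headers["content-type"])  # text/html; charset=UTF-8
```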
<h3> Hypertext Markup Language </h3>
<p>
I'm going to do a dramatic reading of a passage from
"In The Beginning Was The Command Line".
Despite being almost 20 years old, it's a stonkingly accurate
diatribe on computer culture and a fun historical record from
<em>the before times</em>.
</p>
<blockquote cite="http://artlung.com/smorgasborg/C_R_Y_P_T_O_N_O_M_I_C_O_N.shtml">
<p>
This crud is called <strong>HTML (HyperText Markup Language)</strong>
and it is basically
a very simple programming language instructing your web browser how
to draw a page on a screen. Anyone can learn HTML and many people do.
The important thing is that no matter what splendid multimedia
web pages they might represent, HTML files are just telegrams.
</p>
<p>
When Ronald Reagan was a radio announcer, he used to call baseball
games by reading the terse descriptions that trickled in over the
telegraph wire and were printed out on a paper tape. He would sit there,
all by himself in a padded room with a microphone, and the paper
tape would eke out of the machine and crawl over the palm of
his hand printed with cryptic abbreviations. If the count went
to three and two, Reagan would describe the scene as he saw it
in his mind's eye: "The brawny left-hander steps out of the
batter's box to wipe the sweat from his brow. The umpire steps
forward to sweep the dirt from home plate." and so on.
When the cryptogram on the paper tape announced a base hit,
he would whack the edge of the table with a pencil, creating
a little sound effect, and describe the arc of the ball as if
he could actually see it. His listeners, many of whom presumably
thought that Reagan was actually at the ballpark watching the game,
would reconstruct the scene in their minds according to his descriptions.
</p>
<p>
This is exactly how the World Wide Web works: the HTML files are
the pithy description on the paper tape,
and your Web browser is Ronald Reagan.
</p>
<a class='citation' href='http://artlung.com/smorgasborg/C_R_Y_P_T_O_N_O_M_I_C_O_N.shtml'>
Neal Stephenson, "In The Beginning Was The Command Line"
</a>
</blockquote>
<div class='right-aside-arrow'> </div>
<aside class='right-aside aside'>
<p>
As of 2004, only 5 years after the original publication,
Neal Stephenson regarded it as badly out of date.
</p>
<p>
Another author, with permission, published an
<a href="http://garote.bdmonkeys.net/commandline/index.html">annotated version</a>.
</p>
<p>
The annotations are now even more badly out of date.
</p>
</aside>
<p>
So, the <abbr title="HyperText Markup Language">HTML</abbr>
that Wikipedia has returned to us contains a description of
the content that we're looking at, and then our web browser
renders it into a webpage.
</p>
<p>
It would seem like our journey is complete here.
We've made a round trip between our device and the server, and we're done.
But not quite! Watch what happens when the browser's renderer gets to this part of the
<abbr title="HyperText Markup Language">HTML</abbr>:
</p>
<pre class='code'>
<img src="https://media0.giphy.com/media/K5Yn9JCXcrXr2/giphy.gif"></pre>
<p>
This image tag references content that exists at another
<abbr title="Uniform Resource Locator">URL</abbr>.
And so, we kick off this entire process again, from start to finish,
to get whatever it is at that new address. The trick is, though,
instead of sending us
<abbr title="HyperText Markup Language">HTML</abbr>,
this server will respond with an animated image.
Modern web pages may require dozens of requests to various images
and scripts before they're completely rendered.
</p>
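<p>
To get a feel for how a browser discovers those extra requests, here's a small sketch using
Python's built-in <abbr title="HyperText Markup Language">HTML</abbr> parser. The
<code>ResourceFinder</code> class is invented for this example, and it only looks at
<code>img</code> and <code>script</code> tags; real browsers follow many more kinds of references.
</p>

```python
from html.parser import HTMLParser


class ResourceFinder(HTMLParser):
    """Collect the URLs of images and scripts that a page refers to (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # Each of these tags points at content that needs its own request.
        if tag in ("img", "script"):
            for name, value in attrs:
                if name == "src" and value:
                    self.urls.append(value)


finder = ResourceFinder()
finder.feed(
    "<h1>This is totally Wikipedia.</h1>"
    '<img src="https://media0.giphy.com/media/K5Yn9JCXcrXr2/giphy.gif">'
)
print(finder.urls)  # ['https://media0.giphy.com/media/K5Yn9JCXcrXr2/giphy.gif']
```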
<h3> Chapter 1 Summary</h3>
<p>Every browser request–response goes a little something like this: </p>
<ul>
<li>
Our browser uses <abbr title="Domain Name System">DNS</abbr> to resolve the
<abbr title="Uniform Resource Locator">URL</abbr>'s
domain into an <abbr title="Internet Protocol">IP</abbr> address.
</li>
<li>
Our browser uses <abbr title="Transmission Control Protocol">TCP</abbr>
to create a two-way connection with the
server at that <abbr title="Internet Protocol">IP</abbr> address.
</li>
<li>
If the URL's protocol is <abbr title="HyperText Transfer Protocol over Transport Layer Security">HTTPS</abbr>,
then a
<abbr title="Transport Layer Security">TLS</abbr> connection is made to the server.
</li>
<li>
Our browser sends an <abbr title="HyperText Transfer Protocol">HTTP</abbr>