Lecture 5: The Internet, continued
(Begin with film on TCP/IP) "He's got your address"
Before we dig into what TCP/IP is and what it's all about, we have a raffle!
Raffle for students who got 75% or better on PSet1
First item: Apple Tshirt!
Second item: iPod shuffle (first generation)
Google Earth is just too cool to not revisit
You will be tasked in PSet 4 to use Google Earth for a couple of questions
Find popular tourist attractions somewhere in the world
Find places of personal interest
This is a wonderful opportunity to appreciate that you don't need training to use new software (or you shouldn't!).
Ideally, you shouldn't need certifications or training to get around an application
It's probably a fault of the application, not the user, if it's too difficult to use
Let's go right away to 1 Oxford St (where the lectures are held)
Can specify US address, foreign address, often places such as 'Eiffel Tower'
One of the controls on the right side of the screen is a plus and minus sign; this lets you zoom in and out.
You can move horizontally, vertically using the arrows in the same location.
You can also click-and-drag to move around in the world.
Above the horizontal/vertical control is a tilt bar that allows you to change the tilt of the Earth.
A lot of fun features of the program
Terrain is a fun one to turn on - where Google Earth has height information it will show the proper height when you tilt the view
You can also find food under 'dining', hotels under 'lodging', etc.
The point of this demonstration is to 'wow'
When you find a place you want to save, you click on the thumbtack icon and it places a thumbtack icon on the map
This is a "placemark"
Placemarks can be saved and submitted to friends (and to us for the Pset)
Placemarks are stored in the left column in the middle box.
In this box, create a new folder (one for each question, 3 and 4) and drag the placemarks for your problem set into these folders.
When you're all done, right-click (Windows) or click-and-hold (Mac) the folder with your placemark and click "Save As"
Save the file and submit it.
When you submit it to us we will be able to load it into our own Google Earth and be whisked away to your chosen locations.
See the lecture video for a short demonstration
"Sightseeing" placemark - a Google default
Grand Canyon shows the terrain layer quite well. (Make sure the Terrain layer is enabled)
Boston, MA. Enable the "3D Buildings" layer to show virtual models of buildings in downtown Boston. You can then tilt and see what the city looks like in a 3D perspective. Also, enable the "Transportation" layer and it will show you the T lines.
Class requests: London! Big Bend National Park!
PSet 4 is due in early November
Not necessarily because it's difficult, but because it's involved
You will be asked to go into a computer store and "purchase" a computer with "virtual" dollars
Last week we focused on the services and what you can do on the Internet
Let's take a step back: are the Internet and the World Wide Web the same thing?
The Internet is an infrastructure, a framework, a backbone on which applications and services run
When we talk about the WWW we talk about something you can do on the Internet.
Tonight is about lifting up the hood of the Internet. How it works, why it breaks.
Hopefully you will be able to use technical words in the correct order so that if you have to call Comcast you will be able to get better help faster.
Last week we talked about the smallest possible network that's interesting: a peer-to-peer network, a connection between two computers
The term is now used more generally, but it still means the same thing - a connection between two computers.
Client-server connection, where a 'host' computer serves the client machine (similar to the restaurant analogy)
What do we do if we have to add a 3rd computer to a peer-to-peer connection?
Rather than connecting them all together, we could connect them all to a central computer, or a server
The server could figure out which data needs to be sent to which computer
Similarly we can just have a central device where all computers are connected
Most network connections now use ethernet with an RJ-45 connector, which looks like a wide telephone jack (a telephone connector is RJ-11)
Most computers come with only one ethernet jack, so if you connect two computers together directly you are unable to let your network grow.
This is why we use a central location, which then has many jacks, to connect all the machines
The central device is a switch, or a router
Switch: relatively cheap and relatively dumb device that receives data from one computer, determines which computer it is meant for, and sends it on to that computer
Switches aren't completely dumb: they don't just spew all the information they receive to all connected computers
We differentiate switches from routers because nowadays most devices called a 'router' have many devices rolled into one: a switch, a router, an access point, a firewall, etc
These are relatively cheap, being about $20
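A minimal sketch of how a switch could decide where to send data. The lecture only says that a switch works out which computer the data is for; the "learn which jack each sender is on, and send everywhere only when the destination is still unknown" behavior shown here is an assumption about how learning switches typically work, and the class and names are made up for illustration.

```python
class Switch:
    """Toy model of a switch: it remembers which jack each computer is on."""

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.table = {}  # computer's address -> port number (learned over time)

    def receive(self, in_port, src, dst):
        """Return the list of ports the data is sent out on."""
        self.table[src] = in_port          # learn where the sender is plugged in
        if dst in self.table:              # known destination: one port only
            return [self.table[dst]]
        # Unknown destination (assumed behavior): every port except the incoming one.
        return [p for p in range(self.num_ports) if p != in_port]


switch = Switch(num_ports=4)
first = switch.receive(in_port=0, src="A", dst="B")   # B has not been seen yet
reply = switch.receive(in_port=2, src="B", dst="A")   # A was learned above
later = switch.receive(in_port=0, src="A", dst="B")   # B is known now
print(first, reply, later)   # [1, 2, 3] [0] [2]
```

Once both computers have sent something, data between them goes out on exactly one port, which is the "not completely dumb" part.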
Most common way to connect computers is Star topology
Several years ago, the Ring approach was common
software was written for this
wiring used to be expensive and this minimized the amount of cable required
Another method is a Bus network
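To see why a central device beats wiring every computer directly to every other one, here is a rough count of the cables each approach needs. The function names are hypothetical and the count ignores cable length.

```python
def direct_links(n):
    """Cables needed if every computer is wired directly to every other one."""
    return n * (n - 1) // 2


def star_links(n):
    """Cables needed if every computer is wired to one central device."""
    return n


for n in (2, 3, 10, 48):
    print(n, "computers:", direct_links(n), "direct cables vs", star_links(n), "in a star")
```

With direct wiring each computer would also need n - 1 jacks, which is a problem when most machines ship with only one.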
Next time you're in your office ask your IT guy to see the server room, or the wiring closet.
What you will likely see is a huge box that can connect 24, 48, maybe even 100 computers together
In section, we will actually create ethernet cables!
Not the cabling itself, but we will attach the head of a cable to a raw ethernet cable
Involves aligning 8 tiny wires in a certain pattern and attaching the connector to it
any of you who have an internet connection at home or at work use one of these cables
How do you go about connecting LANs and WANs?
ARPANET was a military project that had redundant connections between computers and is the precursor to the Internet
A number of companies such as MCI, AT&T, etc (a lot of the old Telco companies) own vast systems of cabling that can transmit gigabytes/terabytes of data across the country, trans-Atlantic and trans-Pacific
Many cables are fiber optic as opposed to the copper wiring that we find in ethernet cables
Fiber optic cable allows the passage of light (rather than electrons) so that data transmission is even faster.
Speeds 'n stuff:
Cat-5 is capable of speeds up to 100Mbps
that's 100 million bits per second!
Fiber optic capable of gigabit range
Fiber optic cable is already useful: much of Harvard's network inter-building connections use fiber optic cabling
What about renovating a residential condo?
It's probably premature to wire a residential home with fiber optic
How fast is a typical cable modem connection?
1.5Mbps - 8Mbps
compare to Cat-5's capability of 100Mbps
Tends to be more expensive
DSL modem typical speed
768kbps - few Mbps
$15-35/month or so (tends to be less than cable modem)
Even these connection numbers are misleading: download speed can be very fast, but upload (sending data) can be much slower
Maximum upload speed: "much slower!": 768kbps
Why is it ok for the typical user to have this asymmetry?
You tend to download far more than you actually upload
You're sending very small requests and getting a relatively large amount of data in return.
FiOS - Fiber Optic Service
If you have this, we are jealous!
Fiber optic technology for delivering the Internet to your home
30Mbps download, 5Mbps upload
With a central switch/router in your home, the connection on your local area network (the network between the computers in your home) is much faster than the connection to the Internet.
So, generally the bottleneck when connecting to the Internet is between your network and the service provider.
Why might you want a really fast connection locally (in your local network) even if your connection to the Internet is not as fast?
So you can transfer files to other computers on your network at a much faster rate
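A quick back-of-the-envelope comparison of the speeds mentioned so far. The 100 MB file size is an assumption picked for illustration, and the calculation ignores protocol overhead, so real transfers would take somewhat longer.

```python
def transfer_seconds(size_megabytes, speed_mbps):
    """Seconds to move a file at a given speed, ignoring overhead (idealized)."""
    bits = size_megabytes * 8 * 1_000_000   # 1 byte = 8 bits; 1 MB taken as 10^6 bytes
    return bits / (speed_mbps * 1_000_000)


FILE_MB = 100  # hypothetical file size, not from the lecture

speeds_mbps = {
    "768kbps upload": 0.768,
    "Cable download (8Mbps)": 8,
    "FiOS download (30Mbps)": 30,
    "Cat-5 local network (100Mbps)": 100,
}
for name, mbps in speeds_mbps.items():
    print(f"{name}: {transfer_seconds(FILE_MB, mbps):.0f} seconds")
```

Note the factor of 8: connection speeds are quoted in bits per second, while file sizes are usually quoted in bytes.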
You might have wireless access in your home
Technologies such as 802.11b and 802.11g make this possible
If you have a laptop or a desktop with wireless capabilities you likely have a machine capable of one or both technologies
signal strength dependent
depends on how strong a signal you have; the farther you are from the access point, or the more obstructions there are, the lower the speed
802.11b's 11Mbps is considerably slower than the 100Mbps of Cat-5, but why is this OK?
Internet connections generally aren't much faster than this anyway (look at Comcast's max download speed: 8Mbps)
also signal strength dependent
even with 1/2 speed due to obstructions, it is still enough for some of the fastest consumer Internet connections.
Wi-Fi - commonly said to stand for "wireless fidelity"
WiMAX - a nice experiment that will try to give fast wireless access to an entire municipality
Point of presence, peering point: locations on the Internet where major Internet connections flow in and out.
We already know that data doesn't necessarily have to travel from A to B in one fixed route (fundamental design of the Internet)
The quickest route isn't even necessarily in a straight line
How does data from my computer go to Google's servers or CNN's servers?
Turns out that every computer on the Internet has an address
Similar to a Postal Service address that uniquely identifies your location
Each computer has an IP (Internet Protocol) address that uniquely identifies it on the Internet
How many computers does this suggest can be on the Internet at once?
roughly 4 billion
This may seem like plenty, but it's becoming increasingly tight
Essentially, what you get when you sign up with a cable modem or DSL provider is ONE IP address. You are being assigned a number
All IP addresses come in this form: #.#.#.# where each # denotes an integer, 0 through 255, inclusive
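The "roughly 4 billion" figure falls straight out of the #.#.#.# format, as this short sketch shows. It uses Python's standard ipaddress module; the address 192.0.2.1 is a documentation-only example, not one from the lecture.

```python
import ipaddress

# Four numbers, each 0 through 255, means 256 choices per position.
total = 256 ** 4
print(total)              # 4294967296, i.e. roughly 4 billion
print(total == 2 ** 32)   # True: each number fits in 8 bits, and 4 * 8 = 32 bits

# The dotted form is just a readable spelling of one 32-bit number.
addr = ipaddress.IPv4Address("192.0.2.1")    # example address (assumption)
as_number = int(addr)
print(as_number)
print(ipaddress.IPv4Address(as_number))      # back to dotted form
```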
Any computer on the Internet speaks the language, or protocol, known as "TCP/IP"
Transmission Control Protocol/Internet Protocol
Actually two different protocols that are so often used together that we just refer to them as one
If you request a web page and you received a response:
Pages are written in HTML
Browsers communicate with web servers in HTTP
Everything on the Internet is carried over TCP/IP
Confusing, but everything on the Internet is layered:
Physical connection, then on top you have TCP/IP (gets data from A to B) that can transmit HTTP data, that will transmit webpages in HTML format
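One way to picture the layering is as envelopes inside envelopes. In this sketch the HTML page is wrapped in an HTTP response, which in turn becomes the contents of an addressed packet. The packet is a plain dictionary with made-up field names and example addresses; it is not the real TCP/IP header layout.

```python
html = "<html><body>Hello</body></html>"     # the page itself (HTML)

# HTTP layer: the web server wraps the page in an HTTP response.
http = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    f"Content-Length: {len(html)}\r\n"
    "\r\n"
    + html
)

# TCP/IP layer: the HTTP text becomes the contents of an addressed packet.
# Field names and addresses are illustrative assumptions.
packet = {
    "to": "192.0.2.10",       # hypothetical IP of the computer that asked
    "from": "192.0.2.80",     # hypothetical IP of the web server
    "payload": http.encode("ascii"),
}

# The receiving side unwraps the layers in the reverse order.
received_http = packet["payload"].decode("ascii")
headers, body = received_http.split("\r\n\r\n", 1)
print(headers.splitlines()[0])   # the HTTP status line
print(body == html)              # True: the page comes out intact
```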
Let's say I'm going to write an email.
When I hit Send, the computer realizes it needs to send the data over the Internet
The computer breaks up the data into smaller chunks
Each chunk goes inside a TCP/IP packet, which is sealed and numbered.
numbered so that we know the order to reassemble the packets
The packet then gets addressed with a "To" and a "From"
The "To" and "From" are in the forms of IP addresses. So, To: might be the IP address of the email server, and "From" is your computer's IP
The computer then sends the packets via TCP/IP (over a physical connection such as a wireless 802.11b/802.11g connection or an ethernet connection)
The packet is then sent to a switch, which passes it on to a router
A switch understands only ethernet, a router understands TCP/IP (so that the router knows which direction to send a packet)
The router knows the next best hop (not necessarily physically the closest) to send the data
Packets can be dropped if a router gets too busy
The receiving computer knows that it's missing data (because the packets are numbered), will wait a second or two, and then will send a request to retransmit the missing packet.
A sequence number (which identifies the number of the packet) is 32 bits long
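The steps above (chunk, number, address, notice what's missing, reassemble) can be sketched roughly as follows. This is a simplified model, not real TCP: the dictionary fields, the "total" count carried in every packet, and the example addresses are all assumptions made for illustration.

```python
import random


def to_packets(message, size, to_ip, from_ip):
    """Break a message into numbered, addressed chunks."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [
        {"to": to_ip, "from": from_ip, "seq": n, "total": len(chunks), "data": chunk}
        for n, chunk in enumerate(chunks)
    ]


def missing_numbers(received):
    """Sequence numbers that never arrived."""
    total = received[0]["total"]
    seen = {p["seq"] for p in received}
    return sorted(set(range(total)) - seen)


def reassemble(received):
    """Put packets back in order by sequence number and join the data."""
    return "".join(p["data"] for p in sorted(received, key=lambda p: p["seq"]))


message = "Hello from lecture 5! This email is split into packets."
sent = to_packets(message, size=10, to_ip="192.0.2.25", from_ip="192.0.2.7")

# Simulate the network: a busy router drops packet 2, the rest arrive out of order.
arrived = [p for p in sent if p["seq"] != 2]
random.shuffle(arrived)

lost = missing_numbers(arrived)
print(lost)                                        # [2]
arrived += [p for p in sent if p["seq"] in lost]   # the retransmitted packet
print(reassemble(arrived) == message)              # True
```

Because every packet carries its own number, the order of arrival doesn't matter, and a gap in the numbers is what reveals a dropped packet.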
A computer can be many servers: an HTTP server, SSH, FTP
If your computer only has one IP address, how does it distinguish between the different services?
There is a number that a packet includes that identifies the service type.
This number is called a "port"
Port 80 identifies HTTP, 25 identifies email for example
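A small sketch of how one computer with one IP address might hand arriving packets to different services by port number. Ports 80 and 25 come from the lecture; the handler functions and the dictionary-style packet are hypothetical.

```python
def handle_web(data):
    return "web server got: " + data


def handle_mail(data):
    return "mail server got: " + data


# Port number -> the service listening on it (handlers are made up).
SERVICES = {80: handle_web, 25: handle_mail}


def deliver(packet):
    """Hand an arriving packet to the service listening on its port."""
    handler = SERVICES.get(packet["port"])
    if handler is None:
        return "no service on port %d" % packet["port"]
    return handler(packet["data"])


print(deliver({"port": 80, "data": "GET /"}))
print(deliver({"port": 25, "data": "HELO"}))
print(deliver({"port": 9999, "data": "?"}))
```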
Can run a program called "traceroute"
Asks the computer to show the route between the computer and a particular server
% traceroute cnn.com
traceroute: Warning: cnn.com has multiple addresses; using 184.108.40.206
traceroute to cnn.com (220.127.116.11), 30 hops max, 38 byte packets
1 NW12-RTR-2-N42.MIT.EDU (18.104.22.168) 0.405 ms 0.358 ms 0.268 ms
2 EXTERNAL-RTR-1-BACKBONE.MIT.EDU (22.214.171.124) 13.961 ms 9.143 ms 5.619 ms
3 ge-6-23.car2.Boston1.Level3.net (126.96.36.199) 1.914 ms 4.257 ms 0.624 ms
4 ae-5-5.ebr1.NewYork1.Level3.net (188.8.131.52) 16.764 ms 5.518 ms 16.556 ms
5 ae-3.ebr1.Washington1.Level3.net (184.108.40.206) 22.406 ms * *
6 * ae-2.ebr1.Atlanta2.Level3.net (220.127.116.11) 28.836 ms *
7 ge-6-0-0-51.gar2.Atlanta1.Level3.net (18.104.22.168) 24.456 ms ge-6-0-0-53.gar2.Atlanta1.Level3.net (22.214.171.124) 24.837 ms ge-6-0-0-55.gar2.Atlanta1.Level3.net (126.96.36.199) 24.879 ms
8 pop2-atm-P0-2.atdn.net (188.8.131.52) 24.712 ms 24.878 ms 24.746 ms
9 bb2-atm-P0-1.atdn.net (184.108.40.206) 32.598 ms 32.673 ms 32.447 ms
MPLS Label=18 CoS=5 TTL=1 S=0
10 pop1-atl-P5-0.atdn.net (220.127.116.11) 32.666 ms 32.609 ms 32.657 ms
What does each row represent?
represents a hop
In this, we can see that it starts at MIT (mit.edu), is transferred to MIT's backbone, on to Boston's backbone (via Level3.net), to New York, Washington, to Atlanta.
To the right of the router's hostname is its IP address; to the right of that are the times it took each of three probe packets to reach that particular router and for the reply to come back.
So, data makes the round trip to Atlanta in about 32.666 milliseconds!
This isn't sending emails or requests, just "empty packets" to ensure that we get a response from the router to fill this data
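If you want to pull the numbers out of a row yourself, something like the following would do for the simple rows. It's a sketch using a regular expression on the first row of the output above; rows where a "*" stands in for the hostname are deliberately not handled.

```python
import re

LINE = "1 NW12-RTR-2-N42.MIT.EDU (18.104.22.168) 0.405 ms 0.358 ms 0.268 ms"


def parse_hop(line):
    """Split one simple traceroute row into hop number, host, IP, and times."""
    match = re.match(r"\s*(\d+)\s+(\S+)\s+\(([\d.]+)\)(.*)", line)
    if match is None:
        return None   # e.g. rows that start with '*' instead of a hostname
    hop, host, ip, rest = match.groups()
    times = [float(t) for t in re.findall(r"([\d.]+) ms", rest)]
    return {"hop": int(hop), "host": host, "ip": ip, "times_ms": times}


hop = parse_hop(LINE)
print(hop["hop"], hop["host"], hop["ip"])
print(hop["times_ms"])        # [0.405, 0.358, 0.268]
print(min(hop["times_ms"]))   # 0.268
```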
Let's go across an ocean now...
% traceroute cnn.co.jp
traceroute to cnn.co.jp (18.104.22.168), 30 hops max, 38 byte packets
1 NW12-RTR-2-N42.MIT.EDU (22.214.171.124) 0.395 ms 0.420 ms 0.484 ms
2 EXTERNAL-RTR-1-BACKBONE.MIT.EDU (126.96.36.199) 0.718 ms 1.234 ms 0.730 ms
3 leg-208-30-223-5-CHE.sprinthome.com (188.8.131.52) 0.589 ms 0.687 ms 0.662 ms
4 184.108.40.206 (220.127.116.11) 2.804 ms 2.807 ms 2.939 ms
5 sl-bb24-nyc-14-2.sprintlink.net (18.104.22.168) 46.617 ms 102.408 ms 7.479 ms
6 sl-bb21-nyc-6-0.sprintlink.net (22.214.171.124) 5.866 ms 5.832 ms 5.551 ms
7 sl-bb26-pen-4-0-0.sprintlink.net (126.96.36.199) 9.795 ms 9.809 ms 9.621 ms
8 sl-bb21-pen-9-0.sprintlink.net (188.8.131.52) 10.039 ms 10.061 ms 9.950 ms
9 sl-bb23-rly-0-0.sprintlink.net (184.108.40.206) 12.293 ms 12.333 ms 12.714 ms
10 sl-bb21-dc-13-0-0.sprintlink.net (220.127.116.11) 12.697 ms 12.731 ms 12.514 ms
11 sl-st21-ash-10-0.sprintlink.net (18.104.22.168) 20.400 ms 20.508 ms 20.460 ms
12 sl-splki-1-0.sprintlink.net (22.214.171.124) 13.250 ms 13.202 ms 13.116 ms
13 lax002bb00.IIJ.net (126.96.36.199) 88.318 ms 88.348 ms 88.195 ms
14 tky008bb00.IIJ.Net (188.8.131.52) 190.317 ms 190.370 ms 190.369 ms
15 tky007bb00.IIJ.Net (184.108.40.206) 190.884 ms 190.351 ms 190.543 ms
16 tky007ip01.IIJ.Net (220.127.116.11) 190.124 ms 190.390 ms 190.013 ms
17 18.104.22.168 (22.214.171.124) 190.362 ms 190.075 ms 190.237 ms
18 fs-c2-0-1-2.mfeed.net (126.96.36.199) 190.427 ms 190.480 ms 190.377 ms
19 fs-gs2-14.mfeed.net (188.8.131.52) 190.885 ms 190.821 ms 190.435 ms
20 184.108.40.206 (220.127.116.11) 191.207 ms 191.388 ms 190.899 ms
21 * 18.104.22.168 (22.214.171.124) 190.678 ms !X *
22 126.96.36.199 (188.8.131.52) 190.853 ms !X * 191.018 ms !X
You can see that our data travels like this: From MIT, to Boston's backbone, to New York City, to Pennsylvania, to Washington DC, over to Los Angeles, then directly to Tokyo!
The Los Angeles to Tokyo hop occurs at hop number 14
notice the jump in time that the server requires to respond across the Pacific ocean.
You can usually tell the location of the routers by the abbreviations in the hostname. They are often the same as airport codes, but not always (NYC = New York City, DC = Washington DC, LAX = Los Angeles, TKY = Tokyo)
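As a rough illustration of reading locations out of hostnames, here is a best-effort lookup using only the four abbreviations listed above. The function name and the matching rule (leading letters of each dash-separated piece) are assumptions; real hostnames follow no fixed convention, so expect misses.

```python
import re

# Abbreviations listed in the notes above.
CITY_CODES = {"nyc": "New York City", "dc": "Washington DC",
              "lax": "Los Angeles", "tky": "Tokyo"}


def guess_city(hostname):
    """Best-effort guess at a router's city from its hostname, else None."""
    first_label = hostname.lower().split(".")[0]
    for token in first_label.split("-"):
        letters = re.match(r"[a-z]*", token).group()   # leading letters only
        if letters in CITY_CODES:
            return CITY_CODES[letters]
    return None


for name in ("sl-bb24-nyc-14-2.sprintlink.net", "sl-bb21-dc-13-0-0.sprintlink.net",
             "lax002bb00.IIJ.net", "tky008bb00.IIJ.Net", "sl-splki-1-0.sprintlink.net"):
    print(name, "->", guess_city(name))
```

The last hostname has no recognized abbreviation, so the guess comes back as None rather than a wrong city.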
End the lecture with a video that demonstrates TCP/IP: how packets are transferred