* Lecture 5: The Internet, continued
* (Begin with film on TCP/IP) "He's got your address"
wedge Before we get into what TCP/IP is and what it's all about, we have a raffle!
* Raffle for students who got 75% or better on PSet1
* First item: Apple T-shirt!
* Second item: iPod shuffle (first generation)
wedge Google Earth is just too cool to not revisit
wedge You will be tasked in PSet 4 to use Google Earth for a couple of questions
* Find popular tourist attractions somewhere in the world
* Find places of personal interest
wedge This is a wonderful opportunity to appreciate that you don't need training to use new software (or you shouldn't!).
* Ideally, you shouldn't need certifications or training to get around an application
* It's probably the fault of the application, not the user, if it's too difficult to use
* Let's go right away to 1 Oxford St (where the lectures are held)
* Can specify US address, foreign address, often places such as 'Eiffel Tower'
* One of the controls on the right side of the screen is a plus and minus sign; this lets you zoom in and out.
* You can move horizontally and vertically using the arrows in the same location.
* You can also click-and-drag to move around in the world.
* Above the horizontal/vertical control is a tilt bar that allows you to change the tilt of the Earth.
wedge Layers
* A lot of fun features of the program
* Terrain is a fun one to turn on - where Google Earth has height information it will show the proper height when you tilt the view
* You can also find food under 'dining', hotels under 'lodging', etc.
* The point of this demonstration is to 'wow'
wedge When you find a place you want to save, you click on the thumbtack icon and it places a thumbtack icon on the map
* This is a "placemark"
* Placemarks can be saved and sent to friends (and submitted to us for the PSet)
* Placemarks are stored in the left column in the middle box.
* In this box, create a new folder (one for each question, 3 and 4) and drag the placemarks for your problem set into these folders.
* When you're all done, right-click (Windows) or click-and-hold (Mac) the folder with your placemark and click "Save As"
* Save the file and submit it.
* When you submit it to us we will be able to load it into our own Google Earth and be whisked away to your chosen locations.
* See the lecture video for a short demonstration
wedge "Sightseeing" placemark - a Google default
* Grand Canyon shows the terrain layer quite well. (Make sure the Terrain layer is enabled)
* Boston, MA. Enable the "3D Buildings" layer to show virtual models of buildings in downtown Boston. You can then tilt the view and see what the city looks like in a 3D perspective. Also, enable the "Transportation" layer and it will show you the T lines.
* Class requests: London! Big Bend National Park!
wedge PSet 4 is due in early November
* Not necessarily because it's difficult, but because it's involved
* You will be asked to go into a computer store and "purchase" a computer with "virtual" dollars
wedge The Internet
* Last week we focused on the services and what you can do on the Internet
wedge Let's take a step back: are the Internet and the World Wide Web the same thing?
* No
* The Internet is an infrastructure, a framework, a backbone on which applications and services run
* When we talk about the WWW we talk about something you can do on the Internet.
* Tonight is about lifting up the hood of the Internet. How it works, why it breaks.
* Hopefully you will be able to use technical words in the correct order so that if you have to call Comcast you will be able to get better help faster.
wedge Last week we talked about the smallest possible network that's interesting: a peer-to-peer network, a connection between two computers
* The term is now used more generally, but it still means the same thing - a connection between two computers.
* Client-server connection, where a 'host' computer serves the client machine (similar to the restaurant analogy)
wedge What do we do if we have to add a 3rd computer to a peer-to-peer connection?
wedge Rather than connecting them all together, we could connect them all to a central computer, or a server
* The server could figure out which data needs to be sent to which computer
* Similarly we can just have a central device where all computers are connected
* Most wired network connections now use ethernet with an RJ-45 connector, which looks like a wide telephone jack (a telephone connector is RJ-11)
wedge Most computers come with only one ethernet jack, so if you connect two computers together directly your network cannot grow.
* This is why we use a central location, which then has many jacks, to connect all the machines
* The central device is a switch, or a router
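To see why a central device scales better than wiring every computer to every other one, here is a minimal sketch that counts cables for both layouts. The function names are made up for illustration, and the sketch assumes one cable per connection.

```python
def full_mesh_links(n: int) -> int:
    """Cables needed if every computer connects directly to every other one."""
    return n * (n - 1) // 2


def star_links(n: int) -> int:
    """Cables needed if every computer connects once to a central device."""
    return n


for n in (2, 3, 10, 48):
    print(f"{n:>2} computers: direct={full_mesh_links(n):>4}  central={star_links(n):>2}")
```

With direct connections each computer would also need one jack per neighbor, which is a problem when most machines ship with a single ethernet jack.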
wedge Switch: a relatively cheap and relatively dumb device that receives data from one computer, determines which computer it is meant for, and sends it on to that computer
* Switches aren't completely dumb: they don't just spew all the information they receive to all connected computers
* We distinguish switches from routers because nowadays most devices sold as a 'router' have several devices rolled into one: a switch, a router, an access point, a firewall, etc.
* Switches are relatively cheap, about $20
* [Figure: Switch]

wedge The most common way to connect computers is a star topology
* [Figure: Star topology]

wedge Several years ago, the Ring approach was common
* software was written for this
* wiring used to be expensive and this minimized the amount of cable required
* Another method is a Bus network
wedge Next time you're in your office ask your IT guy to see the server room, or the wiring closet.
* What you will likely see is a huge box that can connect 24, 48, maybe even 100 computers together
wedge In section, we will actually create ethernet cables!
* We won't make the cabling itself, but we will attach a connector to the end of a raw ethernet cable
* Involves aligning 8 tiny wires in a certain pattern and attaching the connector to it
* Called crimping
* Any of you who have an Internet connection at home or at work use one of these cables
wedge Backbones
* How do you go about connecting LANs and WANs?
* ARPANET was a military project that had redundant connections between computers and is the precursor to the Internet
* A number of companies such as MCI, AT&T, etc (a lot of the old Telco companies) own vast systems of cabling that can transmit gigabytes/terabytes of data across the country, trans-Atlantic and trans-Pacific
* Many cables are fiber optic as opposed to the copper wiring that we find in ethernet cables
* Fiber optic cable carries data as pulses of light rather than electrical signals, which allows much higher data rates over long distances.
wedge Speeds 'n stuff:
wedge Cat-5 cable supports speeds of 100Mbps
* that's 100 million bits per second!
wedge Fiber optic is capable of speeds in the gigabit range
* Fiber optic cable is already useful: much of Harvard's network inter-building connections use fiber optic cabling
wedge What about renovating a residential condo?
* It's probably premature to wire a residential home with fiber optic
wedge How fast is a typical cable modem connection?
wedge 1.5Mbps - 8Mbps
* compare to Cat-5's capability of 100Mbps
wedge Tends to be more expensive
* Comcast, $40-$60/month
wedge DSL modem typical speed
wedge 768kbps - few Mbps
* $15-35/month or so (tends to be less than cable modem)
wedge Even these connection numbers are misleading: download speed can be very fast, but upload (sending data) can be much slower
* Maximum upload speed is "much slower!": 768kbps
wedge Why is it ok for the typical user to have this asymmetry?
* You tend to download far more than you actually upload
* You send very small requests and get a relatively large amount of data in return.
wedge FiOS - Fiber Optic Service
* If you have this, we are jealous!
* Fiber optic technology for delivering the Internet to your home
* 30Mbps download, 5Mbps upload
* With a central switch/router in your home, the connection on your local area network (the network between the computers in your home) is much faster than the connection to the Internet.
* So, generally the bottleneck when connecting to the Internet is between your network and the service provider.
wedge Why might you want a really fast connection locally (in your local network) even if your connection to the Internet is not as fast?
* So you can transfer files to other computers on your network at a much faster rate
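The speeds quoted so far are easier to compare as transfer times. The following is a rough sketch only: the 100 MB file size is a made-up example, a megabyte is taken as 8 million bits, and real transfers are slower because of overhead and congestion.

```python
def transfer_seconds(size_megabytes: float, speed_mbps: float) -> float:
    """Ideal time to move a file, ignoring protocol overhead and congestion."""
    size_megabits = size_megabytes * 8  # 1 byte = 8 bits
    return size_megabits / speed_mbps


FILE_MB = 100  # hypothetical file size, chosen only for illustration

# Speeds taken from the figures quoted in these notes (in Mbps).
SPEEDS_MBPS = {
    "Cat-5 local network": 100,
    "FiOS download": 30,
    "cable modem download (high end)": 8,
    "FiOS upload": 5,
    "768kbps upload": 0.768,
}

for name, mbps in SPEEDS_MBPS.items():
    print(f"{name:<34}{transfer_seconds(FILE_MB, mbps):8.1f} s")
```

The gap between the first line of output and the rest is the bottleneck described above: copying between machines at home is limited by the local network, while anything that leaves the house is limited by the link to the service provider.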
wedge You might have wireless access in your home
* Technologies such as 802.11b and 802.11g make this possible
* If you have a laptop or a desktop with wireless capabilities you likely have a machine capable of one or both technologies
wedge 802.11b: 11Mbps
* signal strength dependent
* depends on how strong a signal you have; the farther you are from the access point, or the more obstructions there are, the lower the speed
wedge 11Mbps is considerably slower than the 100Mbps of Cat-5, but why is this OK?
* Internet connections generally aren't much faster than this anyway (look at Comcast's max download speed: 8Mbps)
wedge 802.11g: 54Mbps
* also signal strength dependent
* even with 1/2 speed due to obstructions, it is still enough for some of the fastest consumer Internet connections.
wedge Marketing speak:
* Wifi - wireless fidelity
* WiMAX - a nice experiment that will try to give fast wireless access to an entire municipality
wedge
* Point of presence, peering point: locations on the Internet where major connections meet and traffic flows in and out.
wedge We already know that data doesn't necessarily have to travel from A to B along one fixed route (fundamental design of the Internet)
* The quickest route isn't even necessarily in a straight line
wedge How does data from my computer go to Google's servers or CNN's servers?
wedge Turns out that every computer on the Internet has an address
* Similar to a Postal Service address that uniquely identifies your location
* A computer has an IP (Internet Protocol) address that uniquely identifies it on the Internet
wedge IP Address
* 32-bits long
wedge How many computers does this suggest can be on the Internet at once?
* "a lot!"
* roughly 4 billion
* This may seem like plenty, but it's becoming increasingly tight
* Essentially, what you get when you sign up with a cable modem or DSL provider is ONE IP address: you are being assigned a number
* All IP addresses come in this form: #.#.#.# where each # denotes an integer, 0 through 255, inclusive
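Since an IP address is just a 32-bit number written as four integers, the "roughly 4 billion" figure and the dotted form can both be checked in a few lines. This is a small sketch; the helper names are made up for illustration, and the sample address is the one for cnn.com that appears in the traceroute output in these notes.

```python
def ip_to_int(address: str) -> int:
    """Pack a dotted address like '64.236.16.20' into one 32-bit integer."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four numbers separated by dots: {address!r}")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError(f"each number must be 0 through 255: {address!r}")
        value = (value << 8) | octet  # each number occupies 8 of the 32 bits
    return value


def int_to_ip(value: int) -> str:
    """Unpack a 32-bit integer back into dotted form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print("possible addresses:", 2 ** 32)  # 32 bits gives 4,294,967,296 combinations
print(ip_to_int("64.236.16.20"))
print(int_to_ip(ip_to_int("64.236.16.20")))
```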
wedge Any computer on the Internet speaks the language, or protocol, known as "TCP/IP"
* Transmission Control Protocol/Internet Protocol
* Actually two different protocols that are so often used together that we just refer to them as one
wedge If you request a web page and receive a response:
* Pages are written in HTML
* Browsers communicate with web servers in HTTP
* Everything on the Internet is carried over TCP/IP
wedge Confusing, but everything on the Internet is layered:
* At the bottom is the physical connection; on top of it is TCP/IP (which gets data from A to B); TCP/IP carries HTTP, and HTTP carries web pages written in HTML
wedge Let's say I'm going to write an email.
* When I hit Send, the computer realizes it needs to send the data over the Internet
* The computer breaks up the data into smaller chunks
wedge Each chunk goes inside a TCP/IP packet, which is sealed and numbered.
* numbered so that we know the order to reassemble the packets
* The packet then gets addressed with a "To" and a "From"
* The "To" and "From" are in the forms of IP addresses. So, To: might be the IP address of the email server, and "From" is your computer's IP
* The computer then sends the packets via TCP/IP (over a physical connection such as a wireless 802.11b/802.11g connection or an ethernet connection)
* The packet is then sent to a switch, which passes it on to a router
* A switch understands only ethernet, a router understands TCP/IP (so that the router knows which direction to send a packet)
* The router knows the next best hop (not necessarily physically the closest) to send the data
* Packets can be dropped if a router gets too busy
* The receiving computer knows that it's missing data (because the packets are numbered), will wait a second or two, and then will send a request to retransmit the missing packet.
* A sequence number (which identifies the number of the packet) is 32 bits long
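The steps above (chunk, number, address, lose a packet, ask again, reassemble) can be sketched as a toy simulation. This is a simplified illustration and not real TCP: it numbers whole packets as described in lecture (real TCP sequence numbers count bytes), it assumes the receiver already knows how many packets to expect, and the "From" address and function names are made up.

```python
import random


def packetize(message: bytes, src: str, dst: str, chunk_size: int = 8) -> list:
    """Break a message into numbered, addressed chunks."""
    packets = []
    for seq, start in enumerate(range(0, len(message), chunk_size)):
        packets.append({
            "from": src,
            "to": dst,
            "seq": seq,  # lets the receiver restore the original order
            "data": message[start:start + chunk_size],
        })
    return packets


def missing_sequence_numbers(received: list, total: int) -> list:
    """Find gaps in the numbering, i.e. packets that never arrived."""
    seen = {packet["seq"] for packet in received}
    return [seq for seq in range(total) if seq not in seen]


def reassemble(received: list) -> bytes:
    """Sort by sequence number and glue the chunks back together."""
    ordered = sorted(received, key=lambda packet: packet["seq"])
    return b"".join(packet["data"] for packet in ordered)


message = b"Hello from lecture 5, sent in pieces!"
packets = packetize(message, src="18.152.0.10", dst="64.236.16.20")

# Pretend a busy router dropped packet 2 and the rest arrived out of order.
arrived = [packet for packet in packets if packet["seq"] != 2]
random.shuffle(arrived)

missing = missing_sequence_numbers(arrived, total=len(packets))
arrived += [packets[seq] for seq in missing]  # stands in for the retransmission

print("retransmitted:", missing)
print(reassemble(arrived))
```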
wedge A single computer can run many servers: HTTP, SSH, FTP
* If your computer only has one IP address, how does it distinguish between the different services?
* There is a number that a packet includes that identifies the service type.
* This number is called a "port"
* For example, port 80 identifies HTTP and port 25 identifies email
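One way to picture ports is as a lookup table that the receiving computer consults for each incoming packet. The sketch below is a toy model, not how an operating system actually dispatches traffic; ports 80 and 25 come from the notes, 21 (FTP) and 22 (SSH) are the standard numbers for the other services mentioned, and the packet dictionaries are hypothetical.

```python
# Port number -> service listening on that port.
WELL_KNOWN_PORTS = {
    21: "FTP",
    22: "SSH",
    25: "email (SMTP)",
    80: "HTTP",
}


def service_for(packet: dict) -> str:
    """Pick the service a packet is meant for, based only on its port."""
    return WELL_KNOWN_PORTS.get(packet["port"], "unknown service")


# Two packets addressed to the same IP address but to different services.
incoming = [
    {"to": "64.236.16.20", "port": 80},
    {"to": "64.236.16.20", "port": 25},
]
for packet in incoming:
    print(packet["to"], "port", packet["port"], "->", service_for(packet))
```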
wedge You can run a program called "traceroute"
* Asks the computer to show the route between the computer and a particular server
* % traceroute cnn.com
traceroute: Warning: cnn.com has multiple addresses; using 64.236.16.20
traceroute to cnn.com (64.236.16.20), 30 hops max, 38 byte packets
1 NW12-RTR-2-N42.MIT.EDU (18.152.0.1) 0.405 ms 0.358 ms 0.268 ms
2 EXTERNAL-RTR-1-BACKBONE.MIT.EDU (18.168.0.18) 13.961 ms 9.143 ms 5.619 ms
3 ge-6-23.car2.Boston1.Level3.net (4.79.2.1) 1.914 ms 4.257 ms 0.624 ms
4 ae-5-5.ebr1.NewYork1.Level3.net (4.69.132.250) 16.764 ms 5.518 ms 16.556 ms
5 ae-3.ebr1.Washington1.Level3.net (4.69.132.89) 22.406 ms * *
6 * ae-2.ebr1.Atlanta2.Level3.net (4.69.132.85) 28.836 ms *
7 ge-6-0-0-51.gar2.Atlanta1.Level3.net (4.68.103.7) 24.456 ms ge-6-0-0-53.gar2.Atlanta1.Level3.net (4.68.103.71) 24.837 ms ge-6-0-0-55.gar2.Atlanta1.Level3.net (4.68.103.135) 24.879 ms
8 pop2-atm-P0-2.atdn.net (66.185.138.33) 24.712 ms 24.878 ms 24.746 ms
9 bb2-atm-P0-1.atdn.net (66.185.147.210) 32.598 ms 32.673 ms 32.447 ms
MPLS Label=18 CoS=5 TTL=1 S=0
10 pop1-atl-P5-0.atdn.net (66.185.136.35) 32.666 ms 32.609 ms 32.657 ms

wedge What does each row represent?
* represents a hop
* A router!
* In this trace, we can see that the route starts at MIT (mit.edu), moves to MIT's backbone, on to Boston's backbone (via Level3.net), then to New York, Washington, and Atlanta.
* To the right of the router's hostname is its IP address; to the right of that are three round-trip times, one per probe packet, showing how long it takes to reach that router and get a response back.
* So, it takes about 32.666 milliseconds for a packet to reach Atlanta and for the response to come back!
* This isn't sending emails or requests, just "empty packets" to ensure that we get a response from the router to fill this data
* Let's now go across an ocean...
* % traceroute cnn.co.jp
traceroute to cnn.co.jp (210.173.169.160), 30 hops max, 38 byte packets
1 NW12-RTR-2-N42.MIT.EDU (18.152.0.1) 0.395 ms 0.420 ms 0.484 ms
2 EXTERNAL-RTR-1-BACKBONE.MIT.EDU (18.168.0.18) 0.718 ms 1.234 ms 0.730 ms
3 leg-208-30-223-5-CHE.sprinthome.com (208.30.223.5) 0.589 ms 0.687 ms 0.662 ms
4 144.232.21.49 (144.232.21.49) 2.804 ms 2.807 ms 2.939 ms
5 sl-bb24-nyc-14-2.sprintlink.net (144.232.19.185) 46.617 ms 102.408 ms 7.479 ms
6 sl-bb21-nyc-6-0.sprintlink.net (144.232.13.186) 5.866 ms 5.832 ms 5.551 ms
7 sl-bb26-pen-4-0-0.sprintlink.net (144.232.20.142) 9.795 ms 9.809 ms 9.621 ms
8 sl-bb21-pen-9-0.sprintlink.net (144.232.5.245) 10.039 ms 10.061 ms 9.950 ms
9 sl-bb23-rly-0-0.sprintlink.net (144.232.20.32) 12.293 ms 12.333 ms 12.714 ms
10 sl-bb21-dc-13-0-0.sprintlink.net (144.232.9.214) 12.697 ms 12.731 ms 12.514 ms
11 sl-st21-ash-10-0.sprintlink.net (144.232.20.148) 20.400 ms 20.508 ms 20.460 ms
12 sl-splki-1-0.sprintlink.net (144.223.246.26) 13.250 ms 13.202 ms 13.116 ms
13 lax002bb00.IIJ.net (216.98.96.175) 88.318 ms 88.348 ms 88.195 ms
14 tky008bb00.IIJ.Net (216.98.96.178) 190.317 ms 190.370 ms 190.369 ms
15 tky007bb00.IIJ.Net (58.138.98.9) 190.884 ms 190.351 ms 190.543 ms
16 tky007ip01.IIJ.Net (210.130.143.14) 190.124 ms 190.390 ms 190.013 ms
17 202.232.13.74 (202.232.13.74) 190.362 ms 190.075 ms 190.237 ms
18 fs-c2-0-1-2.mfeed.net (210.173.161.115) 190.427 ms 190.480 ms 190.377 ms
19 fs-gs2-14.mfeed.net (210.173.161.86) 190.885 ms 190.821 ms 190.435 ms
20 210.173.162.220 (210.173.162.220) 191.207 ms 191.388 ms 190.899 ms
21 * 210.173.162.220 (210.173.162.220) 190.678 ms !X *
22 210.173.162.220 (210.173.162.220) 190.853 ms !X * 191.018 ms !X
* You can see that our data travels like this: From MIT, to Boston's backbone, to New York City, to Pennsylvania, to Washington DC, over to Los Angeles, then directly to Tokyo!
* The Los Angeles to Tokyo hop occurs at hop number 14
* Notice the jump in response time once the packets have to cross the Pacific Ocean.
* You can usually tell the location of a router from the abbreviations in its hostname. They are often the same as airport codes, but not always (NYC = New York City, DC = Washington DC, LAX = Los Angeles, TKY = Tokyo)
* End the lecture with a video that demonstrates TCP/IP: how packets are transferred
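As a follow-up to the traceroute output above, here is a minimal sketch that pulls the hop number, hostname, IP address, and times out of each row and finds where the response time jumps the most. The regular expressions are an assumption about the layout shown in these notes; rows that begin with `*` or that list several routers are not fully handled.

```python
import re

# Hop number, hostname, then the IP address in parentheses.
HOP_PATTERN = re.compile(r"^\s*(\d+)\s+(\S+)\s+\((\d+\.\d+\.\d+\.\d+)\)")
# Every "<number> ms" measurement on the row.
TIME_PATTERN = re.compile(r"(\d+(?:\.\d+)?) ms")


def parse_hop(line: str):
    """Return a dict describing one traceroute row, or None if it doesn't fit."""
    match = HOP_PATTERN.match(line)
    if match is None:
        return None
    hop, host, ip = match.groups()
    times = [float(t) for t in TIME_PATTERN.findall(line)]
    return {"hop": int(hop), "host": host, "ip": ip, "times_ms": times}


# Three rows copied from the cnn.co.jp trace above.
SAMPLE = """\
12 sl-splki-1-0.sprintlink.net (144.223.246.26) 13.250 ms 13.202 ms 13.116 ms
13 lax002bb00.IIJ.net (216.98.96.175) 88.318 ms 88.348 ms 88.195 ms
14 tky008bb00.IIJ.Net (216.98.96.178) 190.317 ms 190.370 ms 190.369 ms
"""

parsed = [parse_hop(line) for line in SAMPLE.splitlines()]
hops = [h for h in parsed if h and h["times_ms"]]
best = [min(h["times_ms"]) for h in hops]  # fastest of the three probes

# Compare each hop with the one before it and keep the largest increase.
jumps = [(later - earlier, h["hop"])
         for earlier, later, h in zip(best, best[1:], hops[1:])]
biggest_jump_ms, biggest_hop = max(jumps)
print(f"largest jump: {biggest_jump_ms:.1f} ms arriving at hop {biggest_hop}")
```

On these three rows the largest increase lands on hop 14, the Los Angeles to Tokyo crossing.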