Saturday, July 3, 2010

Networking

When people hear the word "networking", they mostly think of social networking like Facebook, LinkedIn, and Twitter. Honestly, I would think the same way, since these sites are such a big part of our lives now. But what I mean by networking here is TCP/IP networking. I know it is not as fun.
What is TCP, and what is IP? And what about UDP? I am sure interested readers have tried ping and traceroute at least once in their lives. Do those use UDP too, or something different, such as ICMP?
What are all these protocols?
How do you think the requests we send from our browsers get to the destination server, retrieve the data, and bring it back to us, the clients? Have you ever thought about what the infrastructure is like? As I am very curious about the organization of a computing system and about hardware abstraction via software, I am very interested in seeing what goes on in the network layers so that data can go back and forth. We do video chat and web browsing, and we stream video and audio files. So how does that work? The OS handles all those things for us, and the socket API is a big part of why that works so well. But how does it work? If you want to observe it in real time, I would recommend the Wireshark tool. It is free software, and also a little bit dangerous :) But if you know what you are looking for, you can learn a lot with this tool.
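To make the socket API a bit more concrete, here is a minimal sketch in Python of a TCP round trip over the loopback interface. The function names (echo_once, tcp_round_trip) are my own for illustration. The OS does the connection setup inside connect() and accept(); if you capture on the loopback interface in Wireshark while this runs, you should be able to see the segments it exchanges.

```python
import socket
import threading


def echo_once(server_sock):
    """Accept one connection and echo back the first chunk of data."""
    conn, _addr = server_sock.accept()  # returns once the OS has set up the connection
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def tcp_round_trip(message):
    """Send message to a local echo server over TCP and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        # Port 0 asks the OS to pick any free port on the loopback interface.
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        port = server.getsockname()[1]

        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()

        # create_connection() performs the active open (the client side).
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            reply = client.recv(1024)

        worker.join()
    return reply


if __name__ == "__main__":
    print(tcp_round_trip(b"hello"))
```

Changing SOCK_STREAM to SOCK_DGRAM (and dropping listen/accept) would give the UDP flavor of the same API, with no connection setup at all.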
I will try to explain what you are going to see in Wireshark when you follow the protocols and the information each packet carries. The protocols you will mostly see are UDP, TCP, ICMP, DNS, DHCP, and SSDP. I am sure most of you know DNS. DNS is like a name lookup in an address book. How does it work? DNS translates a name to a 32-bit IPv4 address. How your client knows which DNS server to ask is another issue. Usually the address of the DNS server is configured in the operating system, either by hand or automatically through DHCP, rather than in the browser.
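As a small illustration of the name lookup and the 32-bit address, here is a sketch using Python's standard socket module. The helper names (resolve_ipv4, ipv4_to_int) are mine, not part of any library. The lookup goes through whatever resolver your OS is configured with.

```python
import socket
import struct


def resolve_ipv4(hostname):
    """Ask the OS resolver for the IPv4 addresses of hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def ipv4_to_int(dotted):
    """Show that a dotted IPv4 address is just a 32-bit number."""
    packed = socket.inet_aton(dotted)      # 4 bytes in network byte order
    return struct.unpack("!I", packed)[0]  # one unsigned 32-bit integer


if __name__ == "__main__":
    for address in resolve_ipv4("localhost"):
        print(address, ipv4_to_int(address))
```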
DHCP is another important concept in networking. It stands for Dynamic Host Configuration Protocol. When a computer boots up, it broadcasts a request on the local network that carries its physical (MAC) address. The DHCP server on the network answers by assigning an IP address to that physical address for a limited lease time. This is what DHCP does.
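The core idea, a table that maps physical addresses to IP addresses handed out from a pool, can be sketched as a toy in Python. This is not the real protocol: actual DHCP is a Discover/Offer/Request/Ack exchange over UDP with lease timers, all of which I leave out here. The class name ToyDhcpServer and its request method are made up for this post.

```python
import ipaddress


class ToyDhcpServer:
    """In-memory stand-in for a DHCP server: maps MAC addresses to IPs."""

    def __init__(self, network):
        # hosts() skips the network address and the broadcast address.
        self._free = list(ipaddress.ip_network(network).hosts())
        self._leases = {}

    def request(self, mac):
        """Return the IP leased to mac, assigning a new one if needed."""
        mac = mac.lower()
        if mac not in self._leases:
            if not self._free:
                raise RuntimeError("address pool exhausted")
            self._leases[mac] = self._free.pop(0)
        return str(self._leases[mac])


if __name__ == "__main__":
    server = ToyDhcpServer("192.168.1.0/29")
    print(server.request("AA:BB:CC:00:00:01"))
    print(server.request("AA:BB:CC:00:00:02"))
    print(server.request("aa:bb:cc:00:00:01"))  # same machine, same address
```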
UDP and TCP are transport layer protocols; UDP is connectionless and TCP is connection-oriented. If you are familiar with the 3-way handshake that establishes a TCP connection and the 4-way teardown that closes it, Wireshark helps you understand how, for example, HTTP works. TCP is basically a reliable, connection-oriented protocol that ensures data delivery. But it is costly to the system. To understand what I mean, you need a good understanding of the TCP mechanisms: acknowledgements, flow control, and retransmission. There is a trade-off between UDP and TCP. If you care about speed more than guaranteed delivery, then you would probably prefer UDP.
Here is what happens in TCP. When a client starts to establish a connection (an active open), it sends a SYN segment to the destination server, which starts the 3-way handshake. The server, which did a passive open earlier and is waiting in the LISTEN state, moves to SYN-RECEIVED and sends a SYN,ACK segment to the client, acknowledging that it received the SYN. (Client and server each choose an initial sequence number, which they send in the SYN and SYN,ACK segments.) The client sends an ACK and can then send the first data segment to the server. Both sides are now in the ESTABLISHED state, where data exchange occurs between the client and the server.
One side has to send a FIN to end the connection. Generally the client sends the first FIN and starts the active close, while the receiving side, the server, performs the passive close. The passive-closing side sends an ACK for the FIN it received and then sends its own FIN. After receiving the ACK for that last FIN, it transitions to the CLOSED state. If you search for the TCP state machine, and especially if you read RFC 793, I think this will become clearer. The states differ slightly between the server and the client, but I can say they are almost the same.
So this is what makes TCP expensive. It makes sense that real-time video and audio applications often use UDP rather than TCP.
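The handshake and teardown described above can be sketched as a small transition table in Python. This is a simplified reading of the RFC 793 state diagram: the event names are my own, and I leave out resets, simultaneous open and close, and the CLOSING state.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("CLOSED", "active_open"): "SYN_SENT",         # client sends SYN
    ("CLOSED", "passive_open"): "LISTEN",          # server waits for clients
    ("LISTEN", "recv_syn"): "SYN_RECEIVED",        # server replies SYN,ACK
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",   # client replies ACK
    ("SYN_RECEIVED", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN_WAIT_1",        # active close: send FIN
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",     # passive close: send ACK
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",       # send the final ACK
    ("CLOSE_WAIT", "close"): "LAST_ACK",           # send our own FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",
    ("TIME_WAIT", "timeout"): "CLOSED",
}


def run(start, events):
    """Follow a list of events from a start state and return the end state."""
    state = start
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state


CLIENT_EVENTS = ["active_open", "recv_syn_ack", "close",
                 "recv_ack", "recv_fin", "timeout"]
SERVER_EVENTS = ["passive_open", "recv_syn", "recv_ack",
                 "recv_fin", "close", "recv_ack"]

if __name__ == "__main__":
    print(run("CLOSED", CLIENT_EVENTS))
    print(run("CLOSED", SERVER_EVENTS))
```

Both walks start and end in CLOSED, which shows how much bookkeeping each side does around the actual data exchange.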
I cannot imagine a real-time stream waiting for the retransmission of every lost packet; a retransmitted frame would usually arrive too late to be useful. Even though UDP is fast, the receiver still uses a buffering mechanism in order to play the video in the correct order and with the least delay. Speed is very important, and there are also the RTP/RTCP protocols: RTP puts a sequence number and a timestamp on each media packet, so the receiver can detect lost packets and restore the order. There is one more thing here. Have you ever thought about how audio and video are synchronized? Wouldn't it be weird if what we hear and what we see were not in sync? Here comes RTCP :) Its sender reports let the receiver line up the audio and video timestamps. The user does not really notice anything as long as the delays are not very long.
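The receiver-side buffering can be sketched roughly like this in Python, assuming each packet carries a sequence number the way RTP packets do. The function name reorder is mine, and the sketch ignores timing and the wraparound of real 16-bit sequence numbers.

```python
def reorder(packets):
    """Put packets back in sequence order and report the gaps.

    packets is an iterable of (sequence_number, payload) pairs in
    arrival order. Returns (payloads in order, missing sequence numbers).
    """
    by_seq = dict(packets)
    if not by_seq:
        return [], []
    first, last = min(by_seq), max(by_seq)
    ordered = [by_seq[seq] for seq in range(first, last + 1) if seq in by_seq]
    missing = [seq for seq in range(first, last + 1) if seq not in by_seq]
    return ordered, missing


if __name__ == "__main__":
    arrived = [(3, "c"), (1, "a"), (4, "d")]  # packet 2 was lost on the way
    print(reorder(arrived))
```

A real player would not ask for the missing packet again; it would skip or conceal the gap and keep playing.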

Finally, there are some applications that use ICMP or IGMP and bypass the transport layer. The application deals directly with the IP layer, and you have to be superuser in order to access the IP layer directly. This is tricky, because you can easily overwhelm a server if you know how to play with ICMP. Direct access lets you forge the IP address of the sender, so you can send requests to many computers and have all of their replies routed to the computer whose IP address you used. But rather than for bad intentions, ICMP is meant to be used for troubleshooting the network with traceroute and ping. We are engineers; we need to serve the good. Engineering ethics :)
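To see what ping actually puts on the wire, here is a sketch in Python that builds an ICMP echo request by hand, including the Internet checksum. The function names are mine. It only builds the bytes and does not send anything, because sending would need a raw socket and therefore superuser rights.

```python
import struct


def internet_checksum(data):
    """16-bit one's complement sum used by ICMP, IP, UDP, and TCP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:   # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier, sequence, payload=b"ping"):
    """Build an ICMP echo request: type 8, code 0, checksum, id, sequence."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence)
    return header + payload


if __name__ == "__main__":
    packet = build_echo_request(identifier=1, sequence=1)
    print(packet.hex())
    # Sending it would look roughly like this and requires root:
    # socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
```

A handy property for checking your work: running the checksum over a packet that already contains a correct checksum gives zero.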

Well, that is all for now. I was planning to write more technical and implementation details rather than just describing the concepts, but I will do that in my next post. How is this implemented? Good question. Please read the RFCs and explore the Linux socket API functions and structures.
They are designed to implement all this theory.
Good night. Enjoy the sunny Sunday.
