
Socket Programming

by Gorby (Monk)
on Jan 07, 2004 at 04:42 UTC ( [id://319358] : perlquestion )

Gorby has asked for the wisdom of the Perl Monks concerning the following question:

Hello Wise Monks. Happy New Year!

I'm trying to make a program using tcp/ip sockets and I have some questions:

1) What is a packet? When my program reads from the socket, does it read only one packet?

2) If the Perl application on the other end sends the words "hello" and "there" separately by using 2 separate print statements, will my program receive both words at once ("hello there") when it reads the socket?

Thanks in advance for any wisdom you can provide.


Replies are listed 'Best First'.
Re: Socket Programming
by etcshadow (Priest) on Jan 07, 2004 at 06:26 UTC
    The internet, as defined by IP (Internet Protocol), is actually a crazy, complicated, unreliable thing. The basic way that it works is by carrying bundles of data (called packets) around. Basically, a packet is a sort of digital postcard. It has a from address and a to address, and room to write in some "payload" (the information you want carried from here to there).

    Now, these packets are NOT guaranteed to arrive where they're going. They are NOT guaranteed to arrive in order if and when they get where they are going. They are not even guaranteed to contain the same data on arrival as they did upon sending. Holy crap!

    TCP is a technology built ON TOP of IP. TCP stands for Transmission Control Protocol. TCP presents an abstraction that looks like a pipe. That is, you send a stream of data in, and you get the same stream of data out. This is actually performed through a complicated mechanism called the TCP stack, which I won't explain in detail... but basically consists of: breaking your data stream up into little bunches of data, numbering those bunches of data, and putting a "checksum" on each bundle of data (so that the receiver can verify that it didn't get smudged in transit). Then each side keeps track of which numbered bundles have been sent and which have been received, and retransmits any that got lost en route, and ignores any duplicates of the same, and so on and so on. The packets that TCP passes over IP correspond to these bundles of data, and the control data that goes with them (that is, the checksum and the sequence number, as well as the "port" number, which is used to keep track of one TCP session out of possibly several going on at the same time on a particular computer), as well as some extra packets that are used to do things like initiate the conversation, shut it down, and acknowledge receipt of packets or ask for retransmits.
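
To make the numbering/checksum/duplicate idea above concrete, here is a toy model in Perl. It is only a sketch under loud assumptions: this is NOT how a real TCP stack is implemented, the checksum is a plain byte sum rather than TCP's real one, lost segments and retransmission are not modelled at all, and every name (toy_checksum, make_segments, reassemble) is made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Toy checksum (an assumption for illustration): byte sum modulo 65536.
sub toy_checksum {
    my ($data) = @_;
    my $sum = 0;
    $sum += ord($_) for split //, $data;
    return $sum % 65536;
}

# Break a stream into numbered bundles, each carrying its own checksum.
sub make_segments {
    my ($stream, $size) = @_;
    my $seq = 0;
    my @segments;
    while (length $stream) {
        my $payload = substr($stream, 0, $size, '');   # cut off the next piece
        push @segments,
            { seq => $seq++, payload => $payload, sum => toy_checksum($payload) };
    }
    return @segments;
}

# Receiver side: drop damaged bundles, ignore duplicates, restore the order.
sub reassemble {
    my (@arrived) = @_;
    my %by_seq;
    for my $seg (@arrived) {
        next if toy_checksum($seg->{payload}) != $seg->{sum};   # smudged in transit
        next if exists $by_seq{ $seg->{seq} };                  # duplicate
        $by_seq{ $seg->{seq} } = $seg->{payload};
    }
    return join '', map { $by_seq{$_} } sort { $a <=> $b } keys %by_seq;
}

my @segments = make_segments("hello there, this is a stream", 5);
my @network  = shuffle(@segments, $segments[2]);   # out of order, one duplicate
my $result   = reassemble(@network);
print "$result\n";    # prints "hello there, this is a stream"
```

However the bundles are shuffled, the receiver hands the same stream back to the program, which is the whole point of the abstraction.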

    Anyway, what this means to you, when using a TCP socket is: it behaves, to your program, in almost exactly the same manner as a pipe would. So, for example, whether the data is sent immediately through the pipe/socket or sent later (after spending some time in a buffer) is dependent on whether the socket's autoflush property is turned on. By default, Perl sockets have autoflush turned on (since version 1.18 of IO::Socket).

    I would urge you not to worry about packets when dealing with TCP... that is the beauty of it, somebody else did all of the worrying about packets, so you don't have to... you worry about a stream, which is a much easier concept to program to. The truth is that it is possible for "hello" to be split into two or even five packets (although it is most likely that it would all go in one). You haven't got much real control over that, and you shouldn't worry about it. In fact... even if the socket is auto-flushing, it's possible (but unlikely) that "hello" and "world" could end up sharing the same packet, because the TCP stack is sending data out slower than your program is sticking data in.
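
If you want to see this for yourself, something like the following should do. It is a minimal sketch, assuming a machine where a loopback TCP connection to 127.0.0.1 is allowed; the variable names are mine. Both ends live in one process purely to keep the example short.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Listen on an ephemeral loopback port (LocalPort => 0 lets the OS pick one).
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
    Listen    => 1,
) or die "listen failed: $!";

my $sender = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $listener->sockport,
    Proto    => 'tcp',
) or die "connect failed: $!";

my $receiver = $listener->accept or die "accept failed: $!";

# Two separate print statements on the sending side.
print {$sender} "hello";
print {$sender} "there";
close $sender;    # end of stream, so the reader below sees EOF

# Read until EOF. Each sysread may return any piece of the stream.
my $received = '';
my @chunks;
while (1) {
    my $n = sysread($receiver, my $buf, 1024);
    die "read failed: $!" unless defined $n;
    last if $n == 0;
    push @chunks, $buf;
    $received .= $buf;
}

print "number of reads that returned data: ", scalar(@chunks), "\n";
print "stream: $received\n";    # stream: hellothere
```

The number of reads can differ from run to run and from machine to machine; the only thing you can rely on is that the bytes come out in the order they went in, with nothing marking where one print ended and the next began.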

    What it comes down to is: don't worry about the packets... you should frame whatever conversation you are carrying out over TCP such that it is clear which side of the conversation is doing what and when. Think about having a conversation on a walkie-talkie: at any given time one of you is talking and one of you is listening... if you both try to talk at the same time, or if both of you think the other guy is the one that should talk next... it all goes to hell. This, for example, is roughly what HTTP looks like:

    you to server: hi there, please send me /index.html.  *over*
    server to you: alright, here's the page: <blah blah html, blah> *over-and-out*

    Of course, you have to be careful to treat the magic words *over* and *over-and-out* specially... much like how you have to treat the quote character specially when inside of a quoted string (so you don't confuse *meaning* a quote, from *using* a quote for the purpose of marking the beginning and end).
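
The "magic word" idea above can be sketched in Perl like this. It is one possible scheme, not a standard: I am assuming a newline plays the role of *over* and a backslash is the escape character, and the helper names (encode_message, decode_message, extract_messages) are invented for the example.

```perl
use strict;
use warnings;

# Sender side: escape the escape character first, then the end marker,
# and finish the message with a real (unescaped) newline as "*over*".
sub encode_message {
    my ($text) = @_;
    $text =~ s/\\/\\\\/g;
    $text =~ s/\n/\\n/g;
    return $text . "\n";
}

# Receiver side: strip the end marker and undo the escaping.
sub decode_message {
    my ($line) = @_;
    chomp $line;
    $line =~ s/\\(.)/$1 eq 'n' ? "\n" : $1/ge;
    return $line;
}

# Pull every complete message out of a receive buffer,
# leaving any unfinished tail in place for the next read.
sub extract_messages {
    my ($buffer_ref) = @_;
    my @messages;
    while ($$buffer_ref =~ s/\A([^\n]*\n)//) {
        push @messages, decode_message($1);
    }
    return @messages;
}

my @original = ("hello", "two\nlines", 'back\\slash');
my $buffer   = join '', map { encode_message($_) } @original;
$buffer .= "partial";    # a message whose end marker has not arrived yet

my @decoded = extract_messages(\$buffer);
print scalar(@decoded), " complete messages, leftover: '$buffer'\n";
# prints: 3 complete messages, leftover: 'partial'
```

In a real program you would append whatever each read returns to the buffer and call extract_messages after every read. Sending the message length ahead of the payload is another common way to frame messages and avoids the escaping altogether.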

    ------------ :Wq Not an editor command: Wq
Re: Socket Programming
by duff (Parson) on Jan 07, 2004 at 05:26 UTC

    1) What is a packet? When my program reads from the socket, does it read only one packet?

    A "packet" is kind of one of those made-up terms that means whatever the user wants it to mean. In the realm of TCP/IP a "packet" is a sequence of bytes that contains a sequence number, source address, destination address, and some data. As far as socket programming goes, these things are well below the level of detail you usually have to worry about. However, a particular protocol may also use the term "packet", in which case it is probably referring to an individual chunk of meaningful data that is sent over a socket.

    2) If the Perl application on the other end sends the words "hello" and "there" separately by using 2 separate print statements, will my program receive both words at once ("hello there") when it reads the socket?

    Actually it would receive "helloworld" or perhaps "h", "e", "l", "l", etc. (depending on how you're reading) if you sent "hello" then "world" in separate print statements, but yes that's how it works. Think of sockets like you would pipes; whatever you put in them on one end can be read on the other end in the same order that it was put in.
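
The "depending on how you're reading" part can be shown with a short Perl sketch. Assumptions: a Unix-like system, and a local stream socket pair standing in for a real TCP connection (it behaves the same way for this purpose); the variable names are mine.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use IO::Socket;

# A connected pair of stream sockets inside one process.
my ($writer, $reader) = IO::Socket->socketpair(AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

# Two separate print statements, then end of stream.
print {$writer} "hello";
print {$writer} "world";
close $writer;

# Ask for one byte at a time until EOF.
my @pieces;
while (read($reader, my $byte, 1)) {
    push @pieces, $byte;
}

print join(',', @pieces), "\n";    # h,e,l,l,o,w,o,r,l,d
```

Ask for 1024 bytes instead of 1 and you will most likely get "helloworld" back in a single read. Either way the order of the bytes is the same.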

    I'm sure there are better descriptions out there, but I don't have any references handy, sorry. Google for "unix network programming" though and you'll find the definitive reference.

      I highly recommend "Network Programming with Perl" by Lincoln Stein. It's one of my favourite Perl books of all time

        ...I second that recommendation, I can't recommend that book enough! It is one of my favourite Perl books as well.


Re: Socket Programming
by NetWallah (Canon) on Jan 07, 2004 at 05:59 UTC
    I'd strongly recommend Net::EasyTCP for socket programming - this is an area with many stumbling blocks, and that module gets you functional FAST.

    I could not resist adding this to the response... (Author unknown, and apologies to the good Doctor)

    If a packet hits a pocket on a socket on a port,
    And the bus is interrupted as a very last resort,
    And the address of the memory makes your floppy disk abort,
    Then the socket packet pocket has an error to report.

    If your cursor finds a menu item followed by a dash,
    And the double-clicking icons put your window in the trash,
    And your data is corrupted 'cause the index doesn't hash,
    Then your situation's hopeless, and your system's gonna crash!

    If the label on your cable on the gable at your house,
    Says the network is connected to the button on your mouse,
    But your packets want to tunnel to another protocol,
    That's repeatedly rejected by the printer down the hall.

    And your screen is all distorted by the side effects of gauss,
    So your icons in the window are as wavy as a souse,
    Then you may as well reboot and go out with a bang,
    'Cause as sure as I'm a poet, the sucker's gonna hang!

    When the copy of your floppy's getting sloppy on the disk,
    And the microcode instructions cause unnecessary RISC,
    Then you have to flash your memory and you'll want to RAM your ROM,
    Quickly turn off your computer and be sure to tell your mom!

    "When you are faced with a dilemma, might as well make dilemmanade. "
Re: Socket Programming
by Aragorn (Curate) on Jan 07, 2004 at 12:34 UTC

      Those are excellent picks, but I wouldn't hesitate to recommend reading Network Programming with Perl in order to make those even more accessible.

      Hanlon's Razor - "Never attribute to malice that which can be adequately explained by stupidity"
      You are indeed correct, Stevens is the ultimate arbiter here. But, from the tone of the original posting, I suspect that Stein will be a more appropriate introduction.

      (I remember how I felt after the first 30 or so pages of TCP/IPI-v1....)

      I Go Back to Sleep, Now.


Re: Socket Programming
by flyingmoose (Priest) on Jan 07, 2004 at 13:40 UTC
    UDP has not been mentioned yet. While TCP/IP is a protocol that is stream based (like a pipe), UDP is not. UDP can only send one packet at a time, and the packets are not guaranteed to arrive or arrive in the same order. You might use UDP in a situation in a situation where your packets are merely informational (ex: systems monitoring) and you do not need to send huge chunks of ordered data. Essentially, UDP will be faster and will not cause problems for the sending application if the recipient does not behave properly. For 95% of the applications out there, though, TCP/IP is preferable.
      > You might use UDP in a situation in a situation

      Or in a situation where you do not care about duplicate packets ;)
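
For contrast with the TCP stream, here is a minimal UDP sketch in Perl. It assumes loopback UDP on 127.0.0.1 is available, and the variable names are mine. On loopback both datagrams will almost certainly arrive, and in order, but UDP itself promises neither, so treat the output as typical rather than guaranteed.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Receiver bound to an ephemeral loopback port.
my $receiver = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'udp',
) or die "bind failed: $!";

my $sender = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $receiver->sockport,
    Proto    => 'udp',
) or die "socket failed: $!";

# Each send is one datagram; nothing gets merged or split on the way.
$sender->send("hello") or die "send failed: $!";
$sender->send("there") or die "send failed: $!";

# Each recv returns exactly one datagram (blocks until one arrives).
my @datagrams;
for (1 .. 2) {
    defined $receiver->recv(my $data, 1024) or die "recv failed: $!";
    push @datagrams, $data;
}

print "got ", scalar(@datagrams), " datagrams: @datagrams\n";
```

Unlike the TCP case, the two words come out as two separate reads, because UDP keeps the message boundaries and leaves the reliability to you.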


Re: Socket Programming
by zentara (Archbishop) on Jan 07, 2004 at 17:20 UTC
    For some real experience, why don't you set up ethereal and actually watch the packets coming and going with a simple test script. Ethereal shows all the packets, the various fields, and the data payload.
Re: Socket Programming
by osama (Scribe) on Jan 08, 2004 at 15:50 UTC
    You don't need to worry about all that... (especially packets and how they really work).

    All modern operating systems and programming languages (and libraries) allow access on a higher level. Actual access to packets is rarely needed (except for sniffing, intrusion detection, etc.).

    If you want to create a client and a server I recommend using Net::EasyTCP as another poster said.

    See my post about turning any script that uses standard input and standard output into a server using xinetd in this thread.
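
The xinetd approach works because xinetd accepts the connection itself and runs your script with the socket attached to standard input and standard output. A minimal Perl sketch of such a script follows; the tiny "protocol" (upper-case every line, stop at "quit") and the name serve are made up for illustration, and the xinetd service configuration is not shown. In-memory filehandles stand in for the client here so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical line-based service: reply with each line in upper case
# until the client sends "quit" or closes the connection.
sub serve {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        $line =~ s/\r?\n\z//;          # accept both LF and CRLF line endings
        last if $line eq 'quit';
        print {$out} uc($line), "\n";
    }
}

# Stand-ins for the client connection, so the sketch is self-contained.
open my $in,  '<', \"hello\nquit\nignored\n" or die "open failed: $!";
open my $out, '>', \my $reply               or die "open failed: $!";
serve($in, $out);
close $out;

print $reply;    # HELLO
```

Under xinetd the last part would shrink to setting $| = 1 and calling serve(\*STDIN, \*STDOUT).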
Re: Socket Programming
by ambrus (Abbot) on Jan 09, 2004 at 12:17 UTC
    Read glibc's info manual; it contains a good introduction to this topic. You can read it even if you don't use Linux (and thus don't have glibc).