http://www.perlmonks.org?node_id=88088

IkomaAndy has asked for the wisdom of the Perl Monks concerning the following question:

Well, for the first time, I'm having to really look at performance in a Perl program, and am a bit stuck.

Basically, I'm trying to capture netflow information from a Cisco router. It's quite active, spitting out 30+ UDP flow packets a second at peak times, each with 30 flows, for around 1,000 flows a second. I can capture these flows (using Tony Hariman's fdgetall as a base), but as soon as I start to unpack them and do anything useful with them, I start dropping packets. I seem to do a much better job by keeping an internal buffer of packet data and, upon reaching an arbitrary count of, say, 1,000 packets, forking (zeroing the buffer in the parent) and letting the child process the data. How is such a program typically written?
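
Here's a stripped-down sketch of what I mean (the port number, the 1,000-packet threshold and the NetFlow v5 record layout are my assumptions; process_flows() just stands in for the real work):

    # Buffer-then-fork collector sketch.  Assumes NetFlow v5 export
    # (24-byte header, 48-byte flow records); process_flows() is a
    # placeholder for the real analysis.
    use strict;
    use IO::Socket::INET;
    use POSIX ':sys_wait_h';

    my $sock = IO::Socket::INET->new(
        LocalPort => 2055,              # port the router exports to (assumption)
        Proto     => 'udp',
    ) or die "bind: $!";

    $SIG{CHLD} = sub { 1 while waitpid(-1, WNOHANG) > 0 };   # reap children

    my (@buffer, $packet);
    while (1) {
        my $from = $sock->recv($packet, 8192);
        next unless defined $from;      # recv interrupted (e.g. by SIGCHLD): retry
        push @buffer, $packet;
        next if @buffer < 1000;         # collect 1,000 packets, then hand off

        my @work = @buffer;
        @buffer  = ();                  # zero the buffer in the parent
        my $pid  = fork;
        die "fork: $!" unless defined $pid;
        next if $pid;                   # parent goes straight back to recv()

        process_flows(\@work);          # child does the slow part
        exit 0;
    }

    sub process_flows {
        my $packets = shift;
        for my $pkt (@$packets) {
            my ($version, $count) = unpack 'n n', $pkt;
            for my $i (0 .. $count - 1) {
                my ($src, $dst, $nexthop, $in_if, $out_if, $pkts, $octets)
                    = unpack 'N3 n2 N2', substr($pkt, 24 + 48 * $i, 48);
                # ... aggregate or store the flow here ...
            }
        }
    }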

Ultimately, I'd like to store the flows in a mysql database, with a daemon combining, say, flows over X hours old into 5 minute averages, etc... to keep sizes and searches reasonable.

2001-06-16 Edit by Corion : Changed title


Replies are listed 'Best First'.
(tye)Re: Perl Performance Question
by tye (Sage) on Jun 13, 2001 at 19:49 UTC

    If there is one thing you can say today, it is "disk is cheap".

    I'd just have the collector write the data to disk files with sequence numbers or dates in their names. So either set a maximum file size of 400MB (for example) and just go to the next sequence number when you hit that or go to a new file every hour or N minutes.

    Then have a separate process that extracts data from these files, summarizes it, stores it in a more permanent place, and finally deletes the file when it is sure that both it and the file writer are done with it (or have a separate step that deletes files so you can recover if you find a bug in the analysis or can purge unanalyzed files if the backlog gets really, really huge).

    I'd think that any other scheme is going to be pretty vulnerable to loss of data.
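
    A minimal sketch of the writing side, assuming sequence-numbered file names, a ~400MB cutoff and length-prefixed packet framing (all of which are just one way to do it):

        # Spool side only: append raw packets to a sequence-numbered file
        # and roll to a new one at ~400MB.  The naming scheme and the
        # length-prefixed framing are illustrative, not gospel.
        use strict;

        my $MAX_BYTES = 400 * 1024 * 1024;
        my ($fh, $written, $seq) = (undef, 0, 0);

        sub open_spool {
            close $fh if $fh;
            my $name = sprintf 'netflow.%d.%06d.spool', time, ++$seq;
            open $fh, '>', $name or die "open $name: $!";
            binmode $fh;
            $written = 0;
        }

        sub spool_packet {
            my $packet = shift;
            open_spool() if !$fh or $written >= $MAX_BYTES;
            # length-prefix each packet so the reader can re-split the stream
            print {$fh} pack('n', length $packet), $packet;
            $written += 2 + length $packet;
        }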

            - tye (but my friends call me "Tye")
      This is something I was thinking of. Hopefully, if the "processor" can't keep up with the "gatherer," it would be able to make up for lost time during non-peak times.
        I implemented this "disk queue" system and found it very slow, even using a memory-based /tmp filesystem (on Solaris 2.6). I could only process about 200 flows per second, as opposed to the ~1,000 per second being gathered. That's a pretty extreme backlog -- I don't know if I would even be able to consume it during off-peak times. I didn't really see a slowdown in gathering while the "processor" was running, though, which is something I had wondered about once disk writes were involved.
Re: Perl Performance Question
by Henri Icarus (Beadle) on Jun 13, 2001 at 19:41 UTC
    Another thing you could try is a pipe. Try writing a simple gather-script that just spits the data out to standard output. Then write an independent process-script that does an open "gather-script |" and processes the data. This second one is the one that you execute. That way you've separated the gathering and processing into two processes, but there's no coding of forking or anything, and the buffering is handled by the OS.
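
    A bare-bones sketch of that arrangement (the script name and the one-flow-per-line output are invented for illustration):

        # process-script: runs the gatherer through a pipe and consumes
        # its output; "gather-script" is assumed to print one flow per line.
        use strict;
        use warnings;

        open my $gather, '-|', './gather-script'
            or die "can't start gather-script: $!";

        while (my $line = <$gather>) {
            chomp $line;
            # ... split/aggregate one flow record here ...
        }
        close $gather or warn "gather-script exited abnormally: $?";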

    -I went outside... and then I came back in!!!!

      The problem with that is that if he is dropping packets, his machine isn't fast enough to process them in real-time which means the pipe will eventually fill up and writes to the pipe will block, thus causing him to miss packets sent while the write is blocked.
Re: Perl Performance Question
by dragonchild (Archbishop) on Jun 13, 2001 at 19:30 UTC
    One option would be to write a collector script which would then fork, for each packet received, an analyzer script which would take the packet, do stuff to it, then save stuff to a mysql DB. At that point, you can have whatever daemons you want look at that DB, regardless of how that DB is populated.

    I'm not fully conversant on how fork works, but I'm sure a number of people here are. Plus, you could just play with it. :)

      I've tried this, but I seemed to take a fairly bad performance hit with each fork. That's why, in the current approach, I collect 1,000 packets before forking, losing a few packets out of each thousand.
Re: Perl Performance Question
by mikfire (Deacon) on Jun 14, 2001 at 09:53 UTC
    Just to throw my two pennies into the mix, I am not surprised that either forking for every packet or implementing the disk queue was slow. Both forking and disk I/O have high overheads. I really think your initial solution was a pretty fair idea.

    If this is still too slow, I might suggest a three-process approach. The packet catcher will spawn a child every 1000 packets. The child will spit the packets to disk while the catcher gets back to work. A third process watches for new files to be created (maybe naming each file with the child's PID so you would know when it exited and the file was complete) and does the parsing. This would gain some speed since the third process could keep a permanent connection to the database -- DBI->connect is slow. It does, however, make this almost nightmarishly complex.
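
    A rough sketch of that third process, with the spool directory, file-name convention, record format and table layout all invented for the example:

        # Third process: watch the spool directory for files whose writer
        # (identified by the PID in the file name) has exited, load them
        # over one persistent DBI connection, then delete them.
        use strict;
        use warnings;
        use DBI;

        my $dir = '/var/spool/netflow';
        my $dbh = DBI->connect('dbi:mysql:netflow', 'user', 'pass',
                               { RaiseError => 1 });   # connect once, not per file
        my $ins = $dbh->prepare(
            'INSERT INTO flows (src, dst, octets, stamp) VALUES (?, ?, ?, ?)');

        while (1) {
            for my $file (glob "$dir/flows.*.spool") {
                my ($pid) = $file =~ /flows\.(\d+)\.spool$/;
                next unless defined $pid;
                next if kill 0, $pid;          # writer still alive: not complete yet
                open my $fh, '<', $file or next;
                while (my $line = <$fh>) {
                    chomp $line;
                    my ($src, $dst, $octets, $stamp) = split /\t/, $line;
                    $ins->execute($src, $dst, $octets, $stamp);
                }
                close $fh;
                unlink $file;                  # both sides are done with it
            }
            sleep 5;                           # poll for newly completed files
        }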

    Just brainstorming now. What if you were to use (since I just offered an answer using these) one of the IPC shared memory modules? Using some kind of ring buffer in shared memory would allow the parent and child to work asynchronously. It would also eliminate some significant overhead, as the child could hold a more or less permanent connection open to the database.
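
    The bare SysV shared-memory mechanics might look something like this; the slot layout is a simplification, and a real ring buffer would also need read/write counters and a semaphore for locking:

        # Shared-memory slots a producer and consumer could share.
        # Deliberately oversimplified: no locking, no wraparound handling.
        use strict;
        use warnings;
        use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID);

        my $SLOT_SIZE = 1500;                  # one UDP packet per slot
        my $SLOTS     = 1024;                  # ring of 1024 slots
        my $shmid = shmget(IPC_PRIVATE, $SLOT_SIZE * $SLOTS, IPC_CREAT | 0600);
        defined $shmid or die "shmget: $!";

        # producer side: drop a packet into slot $n
        sub put_packet {
            my ($n, $packet) = @_;
            shmwrite($shmid, $packet, $n * $SLOT_SIZE, $SLOT_SIZE)
                or die "shmwrite: $!";
        }

        # consumer side: pull the packet back out of slot $n
        sub get_packet {
            my ($n) = @_;
            my $buf = '';
            shmread($shmid, $buf, $n * $SLOT_SIZE, $SLOT_SIZE)
                or die "shmread: $!";
            return $buf;
        }

        END { shmctl($shmid, IPC_RMID, 0) if defined $shmid }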

    mikfire

Re: Perl Performance Question
by mattr (Curate) on Jun 14, 2001 at 11:46 UTC
    What kind of processing are you doing? Could you do it a lot faster if you do a bunch of packets at once?

    You need a large FIFO buffer filled by a capture process that is niced high enough to keep up with your I/O requirements. I should think Perl could handle it if you are not asking for anything crazy. You could probably even just dump whole blocks of packets into mysql records; they can get very large. I also glanced at PDL::IO::FastRaw, which suggests mmapping a file for faster access, but that seems like overkill.

    Then another process would come in periodically (or at a friendlier nice level) to service that buffer, doing the data reduction and analysis you need, assuming that this is necessary because of a long capture session. It sounds like right now you are getting caught up in overhead. One thing I can say is that you might save a lot of time if you can get Mysql to do the reduction on a lot of records at once and store the results in a separate table. Another thing you could look at is using study() before running a batch. You could also look for a module which does the batch processing in C.
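
    For the in-database reduction, something along these lines could work (table and column names are hypothetical; 5-minute buckets match what the original poster described):

        # Roll raw flows older than an hour up into 5-minute buckets in a
        # summary table, then drop the raw rows.  Schema is invented.
        use strict;
        use warnings;
        use DBI;

        my $dbh = DBI->connect('dbi:mysql:netflow', 'user', 'pass',
                               { RaiseError => 1 });

        # one server-side statement summarizes many rows at once
        $dbh->do(q{
            INSERT INTO flow_5min (bucket, src, dst, octets, packets)
            SELECT FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(stamp) / 300) * 300) AS bucket,
                   src, dst, SUM(octets), SUM(packets)
            FROM   flows
            WHERE  stamp < NOW() - INTERVAL 1 HOUR
            GROUP  BY bucket, src, dst
        });

        $dbh->do(q{DELETE FROM flows WHERE stamp < NOW() - INTERVAL 1 HOUR});

        $dbh->disconnect;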

Re: Perl Performance Question
by Spudnuts (Pilgrim) on Jun 13, 2001 at 23:06 UTC
    Would MRTG work instead? It's a pretty nice traffic grapher; it's also free and fairly easy to set up.
      No, but I'm using it successfully to complement this information. These flows are src/dst IP pairs, and can be used to, among other things, track data usage by IP address for billing or track security violations.
Re: Perl Performance Question
by Tuna (Friar) on Jun 14, 2001 at 03:30 UTC
    My job is to collect, process and analyze NetFlow stats for a Tier-1 provider. I accomplish this using cflowd and Perl. As a typical collection for me usually involves tens of thousands of flows per second, I can emphatically say that Perl is in no way incapable of handling the flow rate you are describing. I, too, am using netmatrix aggregation. And I'm on the cusp of fully automating flow collection from about 25,000 interfaces worldwide!!! Msg me for some more details, if you care to.
Re: Perl Performance Question
by Mungbeans (Pilgrim) on Jun 14, 2001 at 14:44 UTC
    You may be able to multithread this without using forks. I don't know what your data looks like, so this may or may not work.

    Architecture: one master co-ordinating process, plus an arbitrary number of children (depending on CPUs and OS) that do the work.

    • Packet comes in, master assigns it to a child.
    • Child reads the packet and processes it, dumping the output in a central repository.

    The key bit is that the children are always alive (you don't fork a fresh one for each packet, as that has a start-up hit) but quiescent unless they've got something to do. The communication between the master and the child processes needs to be very fast (disk I/O is probably too slow), but you could use IPC (Unix interprocess communication) between the processes, which is faster, I think.

    If you keep losing packets, then add more children. This should work well if you have multiple CPUs on Unix. You will start to get processor-bound with too many children.

    Caveat: I haven't done this, I've seen it done in Informix 4gl which is much less functional than Perl. There should be some CPAN modules which look after IPC for you.
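
    For what it's worth, a rough pre-forked pool along those lines might look like this (the worker count, socketpair as the IPC mechanism, and the hex-encoded line framing are all assumptions):

        # Pre-forked worker pool: children are started once, stay alive,
        # and receive work from the master over socketpairs.
        use strict;
        use warnings;
        use Socket;
        use IO::Handle;

        my $WORKERS = 4;                       # tune to the number of CPUs
        my @to_child;

        for my $n (1 .. $WORKERS) {
            socketpair(my $child_end, my $parent_end, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
                or die "socketpair: $!";
            my $pid = fork;
            die "fork: $!" unless defined $pid;
            if ($pid == 0) {                   # child: loop forever on its pipe
                close $parent_end;
                while (my $work = <$child_end>) {
                    chomp $work;
                    my $packet = pack 'H*', $work;
                    # ... unpack and process one packet's worth of flows ...
                }
                exit 0;
            }
            close $child_end;                  # master keeps only its own end
            $parent_end->autoflush(1);
            push @to_child, $parent_end;
        }

        # master: hand packets out round-robin; the children never re-fork
        my $next = 0;
        sub dispatch {
            my $packet = shift;
            my $fh = $to_child[ $next++ % $WORKERS ];
            print {$fh} unpack('H*', $packet), "\n";   # hex so it stays line-based
        }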