PerlMonks  

Mmap question

by zentara (Archbishop)
on Jan 30, 2005 at 23:29 UTC (#426457, perlquestion)
zentara has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'm trying to determine how mmap works with Perl and Sys::Mmap on Linux. I have a C program which, when I run it, shows its shmid in "ipcs -m". That's all well and good. So I'm trying to create the simplest Perl program to do the same thing, but I can't seem to get it to show its shmid with ipcs -m. I'm wondering whether I'm thinking about this wrong, whether I'm setting it up incorrectly, or whether there is something odd with my Linux setup. I'm using kernel 2.6.10. Anyway, here is the code; if anyone can show me how to make the shmid show up in "ipcs -m" I would be grateful.

Also, the shmid shown by Perl doesn't seem to match the format listed by ipcs... is it a decimal versus hex conversion?

Update: Corrected return value for mmap, thanks Aristotle

    #!/usr/bin/perl
    use warnings;
    use strict;
    use Sys::Mmap;

    my $mmap;

    #open( FH, '<:mmap',"$0" )
    open( FH, "< $0" ) or die "Couldn't open 'file': $!\n";

    # mmap returns the address of the mapped area, not a SysV shmid
    my $smaddress = mmap( $mmap, 0, PROT_READ, MAP_SHARED, FH )
        or die "mmap error: $!\n";

    print "$smaddress\n\n$mmap\n\nCheck ipcs -m and hit enter\n\n";
    <>;

    munmap($mmap) or die "munmap error: $!\n";
    close(FH) or die "Couldn't close 'file': $!\n";

I'm not really a human, but I play one on earth. flash japh

Re: Mmap question
by Aristotle (Chancellor) on Jan 31, 2005 at 00:26 UTC

    You misunderstood. $shmid is the wrong name, you are getting the address of the mmapped area, not a shared memory ID. mmap(2) has nothing to do with the SysV IPC facilities.

    I am not entirely certain how to use mmap(2) as an alternative to shared memory, since I've never done that, but my reading of the docs and manpage suggests that it is limited to passing down the mmap to child processes on fork. I could be wrong here though, as I said, since I've never tried to use the facility in that capacity.

    Makeshifts last the longest.

      You're quite right about being passed on fork (but it is not copied across execs if working at the C level).

      And zentara, I think you're right about the octal to decimal conversion.

      As for using mmap(2) for IPC, Stevens demonstrates a technique for mmap across related processes (you can't use mmap across unrelated processes, because the address is for that process family. If an unrelated process tries to use it, you'll get a SIGSEGV or SIGBUS core dump). He creates a mmap'd area from /dev/zero, then has co-operating parent and child increment a long integer in turn.

      Stevens' last word on mmap and shared memory is "If shared memory is required between unrelated processes, the shmXXX functions must be used".
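A rough Perl sketch of the parent/child technique described above, for anyone who wants to try it. It assumes Sys::Mmap is installed and that the platform supports MAP_ANON, which is swapped in here for Stevens' /dev/zero mapping; the sub name `shared_counter_demo` is made up for illustration, and waitpid stands in for proper synchronization.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ();
use Sys::Mmap;

# Hypothetical helper: parent and child take turns incrementing one
# native long stored at offset 0 of a shared anonymous mapping.
sub shared_counter_demo {
    my ($rounds) = @_;

    # Assumption: MAP_ANON replaces the /dev/zero mapping. Sys::Mmap still
    # expects a filehandle argument here, so STDOUT is passed as a dummy.
    my $region;
    mmap($region, 4096, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON, *STDOUT)
        or die "mmap error: $!\n";

    my $width = length pack('l!', 0);
    substr($region, 0, $width) = pack('l!', 0);   # same-length write, in place

    for my $round (1 .. $rounds) {
        my $pid = fork();
        defined $pid or die "fork failed: $!\n";
        if ($pid == 0) {
            # Child: bump the counter through the inherited mapping.
            my $n = unpack('l!', substr($region, 0, $width));
            substr($region, 0, $width) = pack('l!', $n + 1);
            POSIX::_exit(0);
        }
        waitpid($pid, 0);                         # crude turn-taking

        # Parent: sees the child's write, then takes its own turn.
        my $n = unpack('l!', substr($region, 0, $width));
        substr($region, 0, $width) = pack('l!', $n + 1);
    }

    my $final = unpack('l!', substr($region, 0, $width));
    munmap($region) or die "munmap error: $!\n";
    return $final;
}

print "counter after 3 rounds: ", shared_counter_demo(3), "\n";
```

The mapped scalar is only ever changed with same-length substr assignments; assigning a whole new value to `$region` would replace the scalar's buffer instead of writing into the mapping.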

      ...it is better to be approximately right than precisely wrong. - Warren Buffett

Re: Mmap question
by sgifford (Prior) on Jan 31, 2005 at 02:15 UTC
    You can certainly use mmap to share between unrelated processes. Create a file of the appropriate size, then mmap it from two processes; each will be able to see the other's changes.

    Here's an example. mmshare writes lines from its STDIN to an mmap'd file, and mmlisten waits for that, then prints them. You have to be a bit careful how you write to the file; writing more than one machine word isn't atomic. So these programs divide the file up into segments, and use the first byte as a segment identifier which can be read and written to atomically.

    To use them, first create an empty file to mmap:

    dd if=/dev/zero bs=1 count=8192 of=mm
    then run mmlisten mm in one window, and mmshare mm in another. As you type lines into mmshare, they should show up in the mmlisten window.
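The mmshare and mmlisten programs themselves are not reproduced in this thread, so here is only a small sketch of the idea, assuming Sys::Mmap is installed. It maps the same file twice inside one process (standing in for the two unrelated processes) and uses a hypothetical layout in which byte 0 is a "ready" flag and the payload starts at byte 1. The name `mmap_roundtrip` is invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Sys::Mmap;
use File::Temp qw(tempfile);

# Hypothetical layout: byte 0 = ready flag, bytes 1.. = payload.
sub mmap_roundtrip {
    my ($line) = @_;

    # Same effect as: dd if=/dev/zero bs=1 count=8192 of=mm
    my ($tmp, $path) = tempfile(UNLINK => 1);
    binmode $tmp;
    print {$tmp} "\0" x 8192;
    close($tmp) or die "close failed: $!\n";

    open(my $writer_fh, '+<', $path) or die "open (writer) failed: $!\n";
    open(my $reader_fh, '<',  $path) or die "open (reader) failed: $!\n";

    my ($writer, $reader);
    mmap($writer, 0, PROT_READ | PROT_WRITE, MAP_SHARED, $writer_fh)
        or die "mmap (writer) error: $!\n";
    mmap($reader, 0, PROT_READ, MAP_SHARED, $reader_fh)
        or die "mmap (reader) error: $!\n";

    # Writer side: payload first, then flip the single flag byte last.
    substr($writer, 1, length $line) = $line;
    substr($writer, 0, 1) = "\1";

    # Reader side: only trust the payload once the flag byte is set.
    my $seen;
    $seen = substr($reader, 1, length $line)
        if substr($reader, 0, 1) eq "\1";

    munmap($reader) or die "munmap (reader) error: $!\n";
    munmap($writer) or die "munmap (writer) error: $!\n";
    close($reader_fh);
    close($writer_fh);
    return $seen;
}

print "reader saw: ", mmap_roundtrip("hello via mmap"), "\n";
```

With two real processes the reader would poll the flag byte (or sleep between checks) instead of reading it once, and the writer would need some way to learn that a segment has been consumed before reusing it.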

    Update: To answer Aristotle's question, AFAIK there's no way to share an anonymous mmap region between unrelated processes.

    To answer leriksen's question, it's actually sharing the memory backed by the file, just as in some OS's all real memory is backed by swap (Solaris, IIRC). Changes are available immediately, at least as immediately as with any other shared memory scheme. It does periodically write the data out to disk, but changes are visible before it's written out.

      Right, but there's no way to share an mmap between two processes without using a file, is there?

      Update: without, not with. Way to look dumb, Aristotle.

      Makeshifts last the longest.

      Well, I guess that's true, but isn't that more a case of sharing the file, rather than the memory?

      I believe the OP wanted to share the memory allocated by mmap (sharing the address returned by mmap, which isn't possible across unrelated processes).

      Maybe that's just being pedantic - changes made to the mmap'ed region by one process are almost immediately visible by the other - if that's the level of control required, cool. shmXXX allows for finer grained control, by using data structures in memory - possible in a file but not without some serious work serialising to and from the file representation.

      ...it is better to be approximately right than precisely wrong. - Warren Buffett

        I don't understand this:
        shmXXX allows for finer grained control, by using data structures in memory - possible in a file but not without some serious work serialising to and from the file representation.
        How does a SysV shared memory segment support data structures differently than mmap? I've always used them pretty much interchangeably...
Re: Mmap question
by zentara (Archbishop) on Jan 31, 2005 at 13:09 UTC
    Thanks for all the input and informative examples! Leriksen is right, in that I was aiming at sharing the mmap memory between different unrelated processes. But my original question seems to be answered by Aristotle... there are different meanings to "mmap", and apparently Perl doesn't do SysV IPC mmap yet.

    I guess I thought I could use it like a private "ramdisk".


    I'm not really a human, but I play one on earth. flash japh
      Whoever said that mmap() has nothing to do with SysV shm was absolutely right. Now you're confusing the topic and yourself further by throwing around the word "ramdisk". If you want a ramdisk, create a ramdisk with Linux and mount it (man mount) wherever you like, and then use the normal file I/O operations (open, close, print, readline, and so on). Perl has SysV IPC primitives built-in if your operating system supports them (and in your case, it does). perldoc -f msgctl, perldoc -f msgget, perldoc -f msgrcv, perldoc -f msgsnd. These are all named after the kernel calls of the same name, which means you can type man msgctl, man msgget, and so on, to read what Linux's man pages have to say about these functions that Perl gives you access to.

      In exactly the same way, mmap() is a function name, and typing man mmap will pull up the documentation on Linux's mmap() that Perl just happens to give you access to by way of the Sys::Mmap module. So now that you know we're really talking about kernel function names, you won't misuse the names - the names mean very specific things (those functions!).
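For reference, the msg* built-ins named above are the SysV message queue calls. A minimal sketch of a round trip through a private queue, using only core Perl and IPC::SysV; the sub name `queue_roundtrip` and the 256-byte receive limit are arbitrary choices for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID);

# Hypothetical helper: send one message to a private queue and read it back.
sub queue_roundtrip {
    my ($type, $text) = @_;

    my $qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    defined $qid or die "msgget failed: $!\n";

    # A message is a native long type followed by the message body.
    msgsnd($qid, pack('l! a*', $type, $text), 0)
        or die "msgsnd failed: $!\n";

    my $raw;
    msgrcv($qid, $raw, 256, 0, 0)      # type 0 = first message of any type
        or die "msgrcv failed: $!\n";
    my ($got_type, $got_text) = unpack('l! a*', $raw);

    # Queues outlive the process unless removed explicitly.
    msgctl($qid, IPC_RMID, 0) or die "msgctl(IPC_RMID) failed: $!\n";
    return ($got_type, $got_text);
}

my ($type, $text) = queue_roundtrip(1, "hello queue");
print "type $type: $text\n";
```

While a queue exists it is listed by `ipcs -q`, in the same way shared memory segments are listed by `ipcs -m`.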

      The SysV IPC shared memory calls (the shm* functions) give you access to very small amounts of shared memory - usually only a few kilobytes. By contrast, on a 64 bit system, you could mmap in petabytes of data, and on a 32 bit system, you could mmap in gigs.

      Using mmap() to do IPC (inter process communication) is a rotten idea. It's impossible to check for a lock and then lock it if it's available in a single operation except using special instructions in the CPU, so without writing XS, you can't do locking operations on data in mmap'd areas. This means that any program that attempts to use mmap'd areas for IPC is going to have race conditions that cause that program to lock up or lose data sooner or later.

      On the other hand, SysV IPC (the shm* and sem* functions and system calls) has built-in semaphore operations to synchronize access to data by using the CPU's special locking primitives. You could combine this with mmap() to coordinate access to large areas of memory between two processes, but this still sucks. This is only necessary if you absolutely can't make a daemon out of your program or you're trying to wire together a Perl program and a program written in another language (such as C) and you don't want to use something like CORBA (ugh). As ugly as distributed object systems like CORBA are, they're better than mmap+SysVIPC because they were designed for this purpose.
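To make the semaphore idea concrete, here is a small sketch using the core IPC::Semaphore module as a lock around a critical section. The helper name `with_lock` is invented for the example, and a private single-semaphore set is assumed; cooperating processes would instead agree on a shared key.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT SEM_UNDO);
use IPC::Semaphore;

# Hypothetical helper: run $critical while holding semaphore 0 of $sem.
sub with_lock {
    my ($sem, $critical) = @_;

    # Decrement (blocks while the value is 0); SEM_UNDO releases on exit.
    $sem->op(0, -1, SEM_UNDO) or die "semop (lock) failed: $!\n";
    my $result = eval { $critical->() };
    my $error  = $@;

    # Increment again so the next waiter can proceed.
    $sem->op(0, 1, SEM_UNDO) or die "semop (unlock) failed: $!\n";
    die $error if $error;
    return $result;
}

my $sem = IPC::Semaphore->new(IPC_PRIVATE, 1, 0600 | IPC_CREAT)
    or die "semget failed: $!\n";
$sem->setval(0, 1) or die "setval failed: $!\n";   # 1 = unlocked

my $held     = with_lock($sem, sub { $sem->getval(0) });
my $released = $sem->getval(0);
$sem->remove;

print "value while held: $held, after release: $released\n";
```

The decrement is the "check and take" step done as one kernel operation, which is the property the post says plain reads and writes to an mmap'd area cannot give you from Perl.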

      If you're sharing data between two forked Perl programs (created with the fork() function/system call), use Coro instead. Coro lets you share data in plain old Perl variables and it lets you switch back and forth between subroutines in a sort of cooperative multithreading (among many other cool things). Threads would be an option but they're too difficult for a novice to use in Perl and they have some serious rough edges right now. Coro also makes race conditions much easier to avoid and it makes the common cases of multithreading much easier to do.

      I used Sys::Mmap to do a multiplayer game of Conway's Game of Life (no, not Damian Conway) as a Perl Mongers presentation on Sys::Mmap. This is a good example use of it. Individual CGI applications modified the state of an image contained in the file. The RAM and disc used to store the file is *exactly* the same RAM that each other instance of the CGI uses and the server daemon uses. Every minute or so, the server would do an iteration on the game of life. CGI clients would flip individual bits (in response to clicks on the life board as an image map). In this case, a race condition exists between the time you click something and the client displays the board to you (someone else may have modified the board in the interim), but since the board might change after it's displayed to you, this is of little consequence.

      Using Sys::Mmap to perform read-only operations and take a snapshot of changing data is also useful. If any more complex data structures than a large bit field needed to be shared, the server would have to be implemented using Coro and HTTP::Server, or with POE, or with threads, and all of the concurrent tasks (the server and the connections from each client) would have to run in the same process.

      Sys::Mmap is useful where you have raw binary data (no Perl data structures or references) and you want to be able to edit this memory and have it saved to disc as you go. Sys::Mmap is nice for very large files. It's easy to mmap in a multi-gig file on a system with a few hundred megs of RAM. The operation is instant because the data isn't read into memory until that block is actually used. If you tried to slurp up a multi-gig file into a plain Perl scalar on the same system, you'd be waiting a long time. mmap doesn't create copies of the file in memory like reading the file does - instead, it makes the memory an *alias* to the file. These are the places where Sys::Mmap is actually useful.

      By the way, if you want to see the multiplayer Life game, search Google for "site:perldesignpatterns.com conway's game of life multiplayer". I think that's where I left it.
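As an illustration of the read-only, large-file use described above, here is a short sketch that counts newlines in a file through a mapping instead of slurping it. It assumes Sys::Mmap is installed and reuses the call shape from the original question; `count_lines_mapped` is a made-up name.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Sys::Mmap;

# Hypothetical helper: count newlines without reading the file into a scalar.
sub count_lines_mapped {
    my ($path) = @_;

    open(my $fh, '<', $path) or die "Couldn't open '$path': $!\n";

    # Length 0 asks Sys::Mmap to map the whole file.
    my $data;
    mmap($data, 0, PROT_READ, MAP_SHARED, $fh)
        or die "mmap error: $!\n";

    # tr/// with an empty replacement only counts; it does not modify $data.
    my $lines = ($data =~ tr/\n//);

    munmap($data) or die "munmap error: $!\n";
    close($fh)    or die "Couldn't close '$path': $!\n";
    return $lines;
}

# Count the lines of this script itself when it is run from a file.
print count_lines_mapped($0), "\n" if -f $0 && -s _;
```

Pages are only faulted in as the scan touches them, so the memory cost stays bounded by what the kernel chooses to keep cached rather than by the file size.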

      Sys::Mmap uses file handles (anonymous or real) to share memory. It's also possible to coordinate access to a real (named) file without it, but each application would have to flush its writes a lot (fflush in C; in Perl, $| or IO::Handle's flush method) and this would create havoc with the disc as data would be repeatedly sent to disk only to be read right back in by another process. Honeywell computers in the early 1970s used to make people do this for IPC and it sucked. With mmap, you never have to wonder whether your data is current and you don't have to beat the snot out of the disc. There are a few clever applications for it and a few traditional ones (dynamic libraries are implemented using mmap, and mmap is used by X Windows to get ahold of video display memory), but by and large, you don't want mmap, you want pipes or some form of multiprogramming (Coro, threads, POE, Event, Stem, etc).
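For comparison, the pipe route mentioned above needs nothing beyond core Perl. A minimal sketch in which a forked child sends lines to its parent; the sub name `lines_from_child` is invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: fork a child that writes @lines into a pipe,
# and return what the parent reads from the other end.
sub lines_from_child {
    my (@lines) = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!\n";

    my $pid = fork();
    defined $pid or die "fork failed: $!\n";

    if ($pid == 0) {
        # Child: keep only the write end.
        close($reader);
        print {$writer} "$_\n" for @lines;
        close($writer);
        POSIX::_exit(0);
    }

    # Parent: keep only the read end; EOF arrives when the child closes.
    close($writer);
    chomp(my @received = <$reader>);
    close($reader);
    waitpid($pid, 0);
    return @received;
}

print "$_\n" for lines_from_child("first", "second");
```

The kernel handles buffering and blocking here, so there is no flag byte or lock to get wrong.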

      So let me summarize: don't use Sys::Mmap as you're obviously trying to do IPC, it doesn't work for IPC, and if you made it to, you'd be reinventing a big, ugly, hairy wheel that's best avoided anyway.

      Oh, by the way, if you didn't recognize my name, I'm the Sys::Mmap maintainer, so when I say "don't use Sys::Mmap", you know I'm not biased ;)

      -scott
        By the way, it's an assumption that you're trying to create IPC between two different programs (as opposed to using Sys::Mmap for what it's good for). I know it's annoying to ask for help and have people make assumptions, and the assumptions *are* sometimes wrong, but they're *usually* right. In this case, it's extremely common for people to try to figure out how to create shared variables between threads/tasks/processes/programs. Since there is no clean, easy way to do this that suits all situations, there is no readily forthcoming documentation or short FAQ entries. If you're interested in Coro, and I hope you are, the (brand spankin' new!) (shameless plug alert) book Perl 6 Now: The Core Ideas Illustrated with Perl 5 has a few chapters on it and a chapter on threading in Perl. It deals with some of the ideas I just mentioned here - creating a server process that speaks HTTP (or whatever) and handles requests in parallel, and shares data structures between threads, and wiring together user interfaces with networking modules and so on. mmap() isn't introduced, but that's for a reason ;)

        -scott

        Nice++.

        One question: there is nothing in the docs to say this doesn't run on Win32, but since you're around(?)--does it?


        Examine what is said, not who speaks.
        Silence betokens consent.
        Love the truth but pardon error.
        Wow, thanks for the "big picture".

        The reason I am toying with it is that the Advanced Linux Programming guide says that shared memory segments are the fastest way to communicate between 2 processes. So I wanted to know if Perl could set up "shared memory segments". Of course I confused things by jumping to the conclusion that mmap meant memory mapping, and that memory-mapping files is part of that. I now see the difference.

        I would (from my limited knowledge) differ from you on your statement

        The SysV IPC shared memory calls (the shm* functions) give you access to very small amounts of shared memory - usually only a few kilobytes. By contrast, on a 64 bit system, you could mmap in petabytes of data, and on a 32 bit system, you could mmap in gigs.

        According to my C experiments, each shared memory segment is limited to what's returned from getpagesize(), and on my system it is 4k. But there doesn't seem to be anything stopping one from creating and attaching to multiple shared segments, and so increasing the working size. Of course you are then required to keep track of the segments yourself.

        Now I notice mozilla uses shared memory segments, as do a few other apps, so it must have some speed benefits over other forms of IPC. Mozilla is using 393kb, of shared memory, on my system.

        Now that my confusion over mmap vs. shared memory is cleared up, the original question still stands....

        Can Perl create or attach to a shared memory segment, as is done in C? If I tried it from Inline::C, would Perl interfere with its workings?
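As a partial answer to the question above: Perl's built-in shmget, shmwrite, shmread and shmctl wrap the same SysV calls the C program uses. A minimal sketch with a private 4096-byte segment; the sub name `shm_roundtrip` and the segment size are arbitrary choices for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID);

# Hypothetical helper: create a segment, write to it, read it back, remove it.
sub shm_roundtrip {
    my ($message) = @_;
    my $size = 4096;

    my $id = shmget(IPC_PRIVATE, $size, IPC_CREAT | 0600);
    defined $id or die "shmget failed: $!\n";

    shmwrite($id, $message, 0, length $message)
        or die "shmwrite failed: $!\n";

    my $buffer;
    shmread($id, $buffer, 0, length $message)
        or die "shmread failed: $!\n";

    # Segments persist after the process exits unless removed.
    shmctl($id, IPC_RMID, 0) or die "shmctl(IPC_RMID) failed: $!\n";
    return ($id, $buffer);
}

my ($id, $text) = shm_roundtrip("hello from Perl");

# ipcs -m prints the key in hex and the shmid in decimal.
printf "key 0x%08x, shmid %d, read back: %s\n", IPC_PRIVATE, $id, $text;
```

To see the segment in `ipcs -m`, pause (for example with `<STDIN>`) before the shmctl call. Two unrelated programs would pass the same non-private key to shmget rather than IPC_PRIVATE.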


        I'm not really a human, but I play one on earth. flash japh
        Wow, great summary! One comment. You say:
        Using mmap() to do IPC (inter process communication) is a rotten idea. It's impossible to check for a lock and then lock it if it's available in a single operation except using special instructions in the CPU, so without writing XS, you can't do locking operations on data in mmap'd areas. This means that any program that attempts to use mmap'd areas for IPC is going to have race conditions that cause that program to lock up or lose data sooner or later.

        For file-backed mmap, it seems like fcntl range-locking would do the trick, although of course it requires a syscall and so would take longer than a CPU instruction. Is there some reason I haven't thought of that this won't work, or is otherwise a horrible idea?
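A sketch of the locking discipline being asked about, with two substitutions stated plainly: it uses whole-file flock rather than fcntl range locks (the struct flock layout that fcntl needs is platform-specific), and it reads and writes the file with sysread/syswrite rather than through a mapping, so it runs on core Perl alone. The sub name `locked_increment` and the 4-byte counter are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical helper: increment a 4-byte big-endian counter at offset 0
# of $path while holding an exclusive advisory lock on the file.
sub locked_increment {
    my ($path) = @_;

    open(my $fh, '+<', $path) or die "open failed: $!\n";
    binmode $fh;
    flock($fh, LOCK_EX) or die "flock failed: $!\n";   # blocks until granted

    # Read-modify-write; accesses to a mapped region would go here instead.
    sysseek($fh, 0, SEEK_SET) or die "sysseek failed: $!\n";
    my $got = sysread($fh, my $raw, 4);
    (defined($got) && $got == 4) or die "short read\n";
    my $count = unpack('N', $raw) + 1;

    sysseek($fh, 0, SEEK_SET) or die "sysseek failed: $!\n";
    my $wrote = syswrite($fh, pack('N', $count));
    (defined($wrote) && $wrote == 4) or die "short write\n";

    flock($fh, LOCK_UN) or die "unlock failed: $!\n";
    close($fh) or die "close failed: $!\n";
    return $count;
}

# Set up a counter file holding zero, then bump it three times.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} pack('N', 0);
close($tmp) or die "close failed: $!\n";

my $last;
$last = locked_increment($path) for 1 .. 3;
print "counter is now $last\n";
```

Advisory locks only protect against processes that also take the lock, and each lock and unlock is a system call, which is the cost the question already anticipates.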

Re: Mmap question
by zentara (Archbishop) on Jan 31, 2005 at 14:15 UTC
    In case any of you are interested, the C code I was talking about is in Chapter 5 of Advanced Linux Programming, available at alp downloads

    I'm not really a human, but I play one on earth. flash japh
