M alias MUMPS:

All jokes you could invent about naming a programming language after a viral disease can't even come close to how sick MUMPS is.

On the other hand, I see several features in MUMPS that were re-invented decades later. MUMPS has trees that can be used as multi-level associative arrays, and unlike Perl's hashes, the associative arrays are even sorted. MUMPS has regular expressions, string eval, exception handling, locking, local, post-conditions, automatic conversion from string to number and back, and some other features that you would not expect from a language designed to run on a PDP-7. The most important feature is "globals": structured variables available in all programs, stored on disk. MUMPS fans call that feature a database. Modern languages would perhaps call them persistent super-globals. Don't confuse them with ordinary global variables that are kept separately for each session in memory.

I see MUMPS as a wild mix of Perl, DBM files, home computer BASIC, a macro assembler, and a big heap of punch cards. (Actually, MUMPS does not use punch cards, but "modern" VT420 terminals or VT420 emulations.)

The code below runs on MSM (Micronetics Standard M) Version 4.4.1. I think it could run on other MUMPS implementations with minor changes.

Note that this example uses quite modern constructs (for MUMPS), like user-defined functions (returning a value) instead of procedures (returning no value, but possibly modifying global variables), arguments for functions (instead of global variables), and private variables (instead of, as you may have guessed, global variables). I just can't get used to stuffing everything into global variables. And my co-workers can't get used to using local variables. "We use global variables since three decades, and we never had problems with them." (Except for those "rare" cases once a month when global variables were accidentally overwritten.)

The code also is quite verbose for a MUMPS program. Most code I see at work has each and every line stuffed to the maximum allowed (because those ancient PDP machines executed code faster when it was stuffed into a single line), variable names tend to be as short as possible (why use all eight significant characters when you can use just two or three?), and comments are used as a poor replacement for SVN. I think my co-workers could write functionally equivalent code with half of the lines.

ROSETTA ;Rosetta PGA-TRAM Example; [ 08/04/2011 4:39 PM ]
        S TESTDATA="XLII,LXIX,mi"
        F I=1:1 S R=$P(TESTDATA,",",I) Q:R=""  D
        .W R,": ",$$ROM2DEC(R),!
        Q
REDUCE(CALLBACK,LIST) ;
        N (CALLBACK,LIST)
        I $O(LIST(""))="" Q 0
        I $O(LIST(1))="" Q LIST(1)
        S A=LIST(1)
        F I=2:1:$O(LIST(""),-1) D
        .S B=LIST(I)
        .S @("A=$$"_CALLBACK_"(A,B)")
        Q A
HELPER(A,B) ;
        N (A,B)
        Q A+B-(A#B*2)
ROM2DEC(X) ;
        N (X)
        S RTOA("M")=1000,RTOA("D")=500,RTOA("C")=100,RTOA("L")=50
        S RTOA("X")=10,RTOA("V")=5,RTOA("I")=1
        S X=$TR(X,"abcdefghijklmnopqrstuvwxyz","ABCDEFGHIJKLMNOPQRSTUVWXYZ")
        F I=1:1:$L(X) S LIST(I)=RTOA($E(X,I))
        Q $$REDUCE("HELPER",.LIST)

Explaining every aspect would take days, so I have to omit many details.

  • MUMPS is generally interpreted. MSM uses some tricks (pre-compiling) to speed up the interpreter, but that is completely transparent to the programmer.
  • Commands and built-in functions can be abbreviated to just one or two letters, and they typically are. MUMPS code written only with unabbreviated names exists only in educational books.
  • White space is relevant, because the parser had to be simple. Each and every command is followed by a single space separating the command token from the arguments token. The arguments are followed by one or more space characters. So, a command without arguments is followed by two spaces, not just one.
  • A line starts with an optional numeric or alphanumeric label, followed by whitespace (typically a TAB), followed by zero or more commands, optionally followed by a comment starting with a semicolon.
  • Blocks are not available. The scope of FOR, IF, and ELSE is limited to a single line. Recent versions of MUMPS allow pseudo-blocks, indented with dots, but they are actually anonymous subroutines called by an argument-less DO command. The important difference is the behaviour of the QUIT command. QUIT in the FOR line aborts the for loop, QUIT in the pseudo-block just leaves the anonymous subroutine and thus starts the next iteration.
  • All expressions are evaluated strictly from left to right, so WRITE 1+2*3 writes 9, not 7. If you want a different evaluation order, you have to use brackets: WRITE 1+(2*3) writes 7, as expected.
  • While MUMPS has associative arrays, it lacks regular arrays. You can either use a character-delimited string (limited to 255 or 511 characters) and the $PIECE function (think of it as split when used on the RHS and as a combination of split and join when used on the LHS), or you can use an associative array with numeric keys and the $ORDER function (keys, each).
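The strict left-to-right rule from the list above can be modelled outside MUMPS. The following is only a sketch in Python, not how any MUMPS interpreter is actually implemented. The function name eval_left_to_right is made up for this illustration, and it handles only non-negative integers, the operators + - * # and parentheses.

```python
import re


def eval_left_to_right(expr: str) -> int:
    """Evaluate an integer expression strictly left to right (no precedence)."""
    tokens = re.findall(r"\d+|[-+*#()]", expr)
    pos = 0

    def operand() -> int:
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = expression()
            pos += 1  # skip the closing ")"
            return value
        return int(tok)

    def expression() -> int:
        nonlocal pos
        value = operand()
        # Apply each operator to the running value as soon as it is read.
        while pos < len(tokens) and tokens[pos] != ")":
            op = tokens[pos]
            pos += 1
            rhs = operand()
            if op == "+":
                value += rhs
            elif op == "-":
                value -= rhs
            elif op == "*":
                value *= rhs
            else:  # "#" is the modulo operator
                value %= rhs
        return value

    return expression()


print(eval_left_to_right("1+2*3"))    # 9
print(eval_left_to_right("1+(2*3)"))  # 7
```

The only way to change the order is to wrap a sub-expression in parentheses, which the sketch treats as a single operand.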

The program ROSETTA starts with the main routine. The first line contains the program name as a label and a comment, but no executable code (by convention). Note that the timestamp is automatically updated by the editor (poor man's SVN). The second line assigns the three test cases to the string TESTDATA. The third line is a for loop, starting with I=1, incrementing I by 1 in every iteration, with no upper limit. Inside the loop, R is set to the I-th piece of TESTDATA using the $PIECE function (pieces are separated with commas). Still inside the loop, the loop is aborted using the QUIT command if R is empty (because no more pieces are available); otherwise the anonymous subroutine starting (and ending) in the next line is executed for each iteration. The fourth line writes the value of R, a colon and a space, the result of the ROM2DEC function invoked with the argument R, and a newline. The fifth line ends the program.
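For readers who do not speak MUMPS, here is a rough Python model of that main loop. It is a sketch under assumptions: piece and main_loop are names invented for this illustration, piece only mimics the three-argument read form of $PIECE, and the conversion function is passed in as a parameter instead of being looked up as $$ROM2DEC.

```python
def piece(s, delim, n):
    """Return the n-th delim-separated piece of s (1-based), or "" if there is none."""
    parts = s.split(delim)
    return parts[n - 1] if 1 <= n <= len(parts) else ""


def main_loop(testdata, convert):
    """Collect "R: value" lines until the first empty piece, like the FOR line."""
    lines = []
    i = 1
    while True:                  # F I=1:1 has no upper limit
        r = piece(testdata, ",", i)
        if r == "":              # Q:R="" leaves the loop
            break
        lines.append(f"{r}: {convert(r)}")
        i += 1
    return lines


# str.upper is only a stand-in for the real conversion function.
print(main_loop("XLII,LXIX,mi", str.upper))
```

Note that the loop ends at the first empty piece, so an empty entry in the middle of the test data would also stop it early.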

The REDUCE label defines a function or procedure with arguments. Arguments are handled similarly to JavaScript in that each argument becomes a variable. All arguments are passed by value, unless the caller explicitly passes a variable by reference (prefixing its name with a dot, see last line). To access trees, you have to pass them by reference, else you see only the value of the tree's root element. I prefer not to have MUMPS code after a function name, hence the empty comment. The next line calls the NEW command to make all variables invisible, except those in brackets (and except the on-disk globals, see top of this posting). Returning from the function will destroy this new set of variables, and the old variables will be visible again. As I don't need any other variables, I just keep the arguments. Line three checks for the first key of LIST using the $ORDER function. If it returns an empty string (end-of-list), the list must be empty and REDUCE returns 0. (Note that MUMPS has a kind of undef, but it is mostly a special case and causes errors. You cannot simply return an undefined value in MUMPS.) The next IF command checks for the key following the key 1 in LIST. If end-of-list occurred, REDUCE returns the first list element. (MUMPS has no conventions for the index of the first array element, but 1 is common, so I chose a 1-based array.) The next lines are quite boring, except for two constructs: $ORDER(LIST(""),-1) is a "new" feature, it returns the last key of LIST ($ORDER(LIST("")) returns the first key). And @() is the indirection operator, the little brother of string eval. The line S @("A=$$"_CALLBACK_"(A,B)") is expanded at runtime to SET A=$$callback_value(A,B). Inside the ROSETTA program, CALLBACK always contains "HELPER", so the line is expanded to SET A=$$HELPER(A,B).
Note that due to the minimalistic parser and strict left-to-right evaluation, the entire argument must be given using the indirection operator. S A=@("$$"_CALLBACK_"(A,B)") does not work.
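The control flow of REDUCE can be sketched in Python as follows. This is an approximation, not a translation: order only mimics $ORDER for trees with positive integer keys, the CALLBACKS table stands in for indirection (Python has no equivalent of @() short of eval), and the names order, CALLBACKS, reduce_by_name and the "ADD" callback are invented for this illustration.

```python
def order(tree, key="", direction=1):
    """Mimic $ORDER for integer keys: next (or previous) key, "" at end-of-list."""
    keys = sorted(tree, reverse=direction < 0)
    if key == "":
        return keys[0] if keys else ""
    for k in keys:
        if (k > key) if direction > 0 else (k < key):
            return k
    return ""


# Stand-in for indirection: the callback is found by its name at runtime.
CALLBACKS = {"ADD": lambda a, b: a + b}


def reduce_by_name(callback, lst):
    if order(lst) == "":        # I $O(LIST(""))="" Q 0
        return 0
    if order(lst, 1) == "":     # I $O(LIST(1))="" Q LIST(1)
        return lst[1]
    a = lst[1]
    for i in range(2, order(lst, "", -1) + 1):   # F I=2:1:$O(LIST(""),-1)
        a = CALLBACKS[callback](a, lst[i])
    return a


print(reduce_by_name("ADD", {1: 1, 2: 2, 3: 3}))  # 6
```

Like the MUMPS original, the sketch assumes a dense, 1-based list. A gap in the keys would raise a KeyError here.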

The HELPER label defines a second function. It is the equivalent of the code block passed to List::Util::reduce(). Again, NEW is used to generate a new, temporary set of variables (so that the variables inside REDUCE are not overwritten), and as in REDUCE, QUIT is used to return the value of an expression. Note that brackets are required due to the left-to-right evaluation order. # is the modulo operator.
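Read left to right, A+B-(A#B*2) means (A+B) - ((A#B)*2). A small Python sketch makes the grouping explicit and folds the digit values of the three test cases with functools.reduce. The digit lists are written out by hand here, and helper is simply a Python rendering of the MUMPS expression.

```python
from functools import reduce


def helper(a, b):
    # MUMPS: A+B-(A#B*2), evaluated left to right
    return (a + b) - ((a % b) * 2)


# Digit values of XLII, LXIX and MI, written out by hand.
print(reduce(helper, [10, 50, 1, 1]))    # 42
print(reduce(helper, [50, 10, 1, 10]))   # 69
print(reduce(helper, [1000, 1]))         # 1001
```

When the running total a is smaller than b, a % b is just a, so the result is b - a (X before L gives 40). In LXIX the total 61 meets the final X, 61 % 10 is 1, and the I that was already added is taken back twice, giving 69.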

The last function is ROM2DEC, with the usual prolog to generate a new, clean set of variables. RTOA is used as an associative array. Note that you have to assign each element separately, there is no shortcut notation. The longest line is the MUMPS equivalent to $x=~tr/a-z/A-Z/;, poor man's uc(). The for loop in the next line starts with I=1, increments by 1, and terminates automatically after reaching $LENGTH(X). Inside the loop, a list representing the value of each roman digit in X is built. $EXTRACT(X,I) is equivalent to substr($X,$I-1,1), because MUMPS counts characters from 1. The for loop replaces the Perl fragment map { $rtoa{$_} } split//. ROM2DEC ends with a tail call to REDUCE, calculating the value of X from the digit values in LIST.
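The part of ROM2DEC that builds LIST can be modelled in Python like this. It is a sketch only: extract and digit_values are names invented for this illustration, extract mimics just the two-argument form of $EXTRACT, and the result is a dict with 1-based integer keys to resemble the MUMPS array.

```python
RTOA = {"M": 1000, "D": 500, "C": 100, "L": 50, "X": 10, "V": 5, "I": 1}
LOWER = "abcdefghijklmnopqrstuvwxyz"
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def extract(s, i):
    """1-based single-character access, like $EXTRACT(X,I); "" when out of range."""
    return s[i - 1:i]


def digit_values(x):
    """Upper-case x character by character and map each roman digit to its value."""
    x = x.translate(str.maketrans(LOWER, UPPER))   # like $TR(X,lower,upper)
    return {i: RTOA[extract(x, i)] for i in range(1, len(x) + 1)}


print(digit_values("mi"))   # {1: 1000, 2: 1}
```

A character that is not a roman digit raises a KeyError in this sketch.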

And this is how you run ROSETTA from the programmer prompt:

>D ^ROSETTA
XLII: 42
LXIX: 69
mi: 1001
>

You can also call the various functions inside the program:


Yes, MUMPS programs may have more than one entry point. There is no difference between a program and a library. You get used to that, like you get used to many other ugly tricks. Most notably that variables survive a program exit and are available for the next program.


Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

In reply to Re: Rosetta PGA-TRAM by afoken
in thread Rosetta PGA-TRAM by eyepopslikeamosquito
