PerlMonks  

snakes and ladders

by Logicus
on Aug 25, 2011 at 03:10 UTC ( #922257=perlmeditation )

This is going to be a long post, so if you don't have time or are not interested, please move along. Anyone making any personal references, trollish remarks, or throwing ad-hominems will be ignored.

The problem

About 5 years ago I got into Perl programming when I reverse engineered "formmail.pl", which I found on Matt's Script Archive. At that time I was completely ignorant of the world of Perl, but recognised the syntax enough from prior experience with C to start coding.

Over the following months, I found places like tizag.com/perlT which gave me enough information and enough answers to my questions to build a declarative programming environment which I called aXML. (the name is taken but I have yet to come up with a better one)

Over many successive iterations, and many many months of hard work, I built a system which allowed for the embedding of declarative tags into HTML, which fired off subroutines, SSI style. I overcame many design obstacles, one at a time by sheer brute effort and many many iterations.

Eventually I settled upon a syntax for the declarative tags which allowed for attribute parameters, and process order control, and was flexible enough to implement complex systems in, building them up from stacking the tags together using the ruleset.

The system worked. All code logic was separate from the display layer, and I was able to build viable applications from it which were stable and reliable.

There was just one problem... Efficiency. Whilst from an end developer's view it is lightning quick to throw a template together with a few declarations which have already been coded to create something new, the system I built was just not fast enough.

I became concerned, perhaps a little overly so, with rendering time. I was sure there had to be a faster way to achieve the same results, and to use the same markup files unaltered to drive the thing.

It was about this time that I first discovered PerlMonks, and the wider Perl community. Being new and by nature quite a nervous sort of person, I put up a node where I tried to explain what I had been up to, and the problem I was trying to solve, but I worded it in a way that was not very well thought out.

Some people were very kind in their response, especially the user by the name of "grandfather", whilst others were, let's say, less than friendly.

The problem was I didn't know how to describe the problem properly, I didn't know what "declarative" programming was, or anything much about the regular methods used as I had come into the world of Perl from a rather odd angle and was not a classical Perl programmer at all.

Coupled with that I also tend to take the words of strangers as being correct and knowledgeable, a personal bias which I am slowly overcoming as I discover more and I am therefore better able to separate the crap-talkers from those with real insight. Man it came as a shock to me just how many people actually talk out of their arses as if they are all knowing... but that's people I guess.

Anyway I digress.

So the system I lovingly built at lengthy effort and have used in many small production sites, was blown out as worthless. But it isn't worthless, it's just inefficient from a processor standpoint. It's very efficient from an end programmer standpoint as complex declarations are simple to build and just work, thus I can put together complex systems like a social network or a forum very quickly and easily.

So I went off on my own again, learnt more things, put together the huge and scattered puzzle that is Perl, and learnt just how much of a gap in efficiency my system has to others. I figured there must be a way of doing it more efficiently, so I started working on trying to figure out a compiler.

Now I know _nothing_ about compilers. I've got quite good with complex regexes, and the old adage comes to mind, if your only tool is a regex, everything looks like a string... or something like that.

I spent a long time playing around with different ways of trying to get it to work, but none of my attempts ever resulted in a system which had the same flexibility as the original. Something always got broken, there were always caveats which I just couldn't live with. After many many months of working at it I finally conceded to myself that I needed help, that if I kept on trying to figure it out by myself I could be at it for another 10 years...

The one small light at the end of my dismal tunnel was that since I had started the project, the availability of processing power had inevitably increased. Most of the pages I was serving were being rendered in less than 0.1 seconds so I figured that eventually the efficiency issue would just disappear by itself.

I wasn't happy with that notion, I am 100% certain that the system can be refactored to be at least 30x faster than it is. I just don't know how.

So after 4 years of going round in circles, and constantly coming up against the same old problems, I decided the time had come to "eat humble pie" and return to perlmonks to ask for advice.

Apparently some of the best Perl hackers in the world had a look at it, but only a tiny minority of responses contained any useful information. The rest consisted of a varying set of insults, scorn, put downs, and suggestions that the whole thing was a gigantic waste of time, and that I ought to "know my place" as a novice and do as I am told.

Something sparked me off, perhaps indignation at people's attitudes, the vague and aloof way they brushed off my problem without really looking into it; perhaps it was the way that they accused me of being delusional or of being incredibly egotistical.... no... I just have a system of declarative programming that I really like and need to make more efficient. That is all.

So I fought back, exchanged rudeness for rudeness, told people where to get the f*** off. Because you see, having had the experience of working within the system I built, I know it is both useful and powerful. It's just not efficient enough, and that is the only problem with it, and I don't want to change _ANYTHING_ else about it.

Then after several weeks of arguing, and upsetting bystanders who then also joined the fray and had the fun of telling me what a deviant I am that needs to eat humble pie and do as I'm told by my superiors and betters, I broke down. Literally completely broke down.

So now here I am at square one, back where I started. I either have to wait for Moore's law to solve my dilemma, completely start from scratch and learn something I don't want to learn just because we are still using stone-age computer systems, or find an efficient solution to the compiler problem.

I'm not giving up. It is not heresy and I will not recant. One attitude that has always stood me well in the troubles of my life has been my indomitable will to keep on moving forward, despite any setbacks and problems.

Help would be very much appreciated.

P.S. I put "readmore" tags around the whole post but they don't seem to be working...

Re: snakes and ladders
by chromatic (Archbishop) on Aug 25, 2011 at 03:44 UTC
    Help would be very much appreciated.

    Two things will help you more than anything else.

    The first is a decent understanding of algorithmic complexity, colloquially called "big O notation". The value of this is being able to analyze a piece of code and have a sense of how well or poorly it will scale. This is technically a science, but there's an art to it, and the basic rule is "How much work does it take to do things this way?" (The corollary is "How much data do I expect to process?" If that's small, a big big O doesn't matter very much.)

    The second is how to combine an efficient tokenizer with a finite state machine. As I've mentioned before, this is an important concept covered in SICP and HOP. In short, you want to process your input document once, probably character by character (and you can make that more efficient if you want), to build an intermediary data structure which represents your document. You can do this even if you have parts of the code you can only evaluate fully after you've processed previous parts of the document (it's how Perl's eval works, after all).

    I won't promise you that this will make your code beloved to other hackers (what I've seen doesn't fit my needs from a human factors perspective, but I admit I don't have the experience with it you do), but I can promise you that this is the technique favored by compiler writers as reasonably straightforward but efficient and effective. What you're doing is, essentially, writing a compiler.
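The single-pass tokenizer plus state machine described above can be sketched in a few lines of Perl. This is only an illustration under simplifying assumptions, not the aXML parser: it recognises just `<name>` and `</>` (no attributes, no `( )` or `[ ]` tag forms, so literal markup such as `<br/>` would be misread as a tag), and the token hash layout (`type`, `value`) and the name `tokenize` are made up for the example.

```perl
use strict;
use warnings;

# Single-pass tokenizer driven by a two-state machine (TEXT and TAG).
# Assumption: only <name> ... </> tags, no attributes, no ( ) or [ ] forms.
sub tokenize {
    my ($src) = @_;
    my @tokens;
    my $state = 'TEXT';
    my $buf   = '';

    for my $ch ( split //, $src ) {
        if ( $state eq 'TEXT' ) {
            if ( $ch eq '<' ) {
                # Leaving text: flush whatever text was collected so far.
                push @tokens, { type => 'data', value => $buf } if length $buf;
                $buf   = '';
                $state = 'TAG';
            }
            else {
                $buf .= $ch;
            }
        }
        else {    # state TAG
            if ( $ch eq '>' ) {
                # "</>" closes the innermost tag; anything else opens one.
                push @tokens,
                    $buf eq '/'
                    ? { type => 'close' }
                    : { type => 'open', value => $buf };
                $buf   = '';
                $state = 'TEXT';
            }
            else {
                $buf .= $ch;
            }
        }
    }
    die "unterminated tag\n" if $state eq 'TAG';
    push @tokens, { type => 'data', value => $buf } if length $buf;
    return \@tokens;
}

my $tokens = tokenize('<use><qd>action</></>');
print join( ' ',
    map { $_->{type} . ( defined $_->{value} ? "($_->{value})" : '' ) } @$tokens ),
    "\n";
```

Each character is looked at exactly once, which is the property that matters here; reading the input in larger chunks would be an optimisation on top of the same idea.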

      >Big O

      Yup, I get that, that's why the system will spit out "hello world" in like 0.03 seconds, but take 0.22 to produce the 400 line output for the forum sections page for instance. More data = more computing = slower result.

      >Efficient tokenizer with a finite state machine.

      So I have to treat it almost like a de-compression algorithm? Take the input letter by letter and inflate it up into something which can then be stored and processed quickly... I think I can see how that works.

      <use><qd>action</></> becomes something like;

      [TOKEN name="use"]
      [TOKEN name="qd"]
      [TOKEN value="action"]
      [ENDT]
      [ENDT]

      then process line by line? I can feel a stack is going to be needed. I think I need to have a play with getting results from the expected token format (written by hand) so I can be sure how the system will work post-tokenisation. (then write the tokeniser)

      >beloved to other hackers

      I'm not the sort of person who craves or needs adoration or respect of peers. I'm actually quite a reserved and quiet person that tends to try and blend in where possible, (believe it or not) so I won't be worrying about that very much... besides, this system isn't for high level hackers, it's for dummies like me to make work/life easier, and measured by that standard it works very well!

        So I just did a more complex example by hand, now I need to figure out a way to iterate across such a complex data structure :/

        # <html lang="<qd>lang</>">
        #   (sql mode="mask" table="users")
        #     <query>
        #       SELECT * FROM users
        #     </>
        #     <mask>
        #       Profile : [link action="profile"
        #                       username="<d>username</>"
        #                 ]<d>username</>[/link]
        #       <br/>
        #     </>
        #   (/)
        # </>

        #TODO: translate above, into this :

        # Nested token lists are array refs so they stay intact inside the attr hash.
        @_ = (
            { type => '<', value => 'html',
              attr => {
                  lang => [
                      { type => '<',    value => 'qd' },
                      { type => 'data', value => 'lang' },
                      { type => '>' },
                  ],
              } },
            { type => '(', value => 'sql',
              attr => { mode => 'mask', table => 'users' } },
            { type => '<',    value => 'query' },
            { type => 'data', value => 'SELECT * FROM users' },
            { type => '>' },
            { type => '<',    value => 'mask' },
            { type => 'data', value => 'Profile : ' },
            { type => '[', value => 'link',
              attr => {
                  action   => 'profile',
                  username => [
                      { type => '<',    value => 'd' },
                      { type => 'data', value => 'username' },
                      { type => '>' },
                  ],
              } },
            { type => '<',    value => 'd' },
            { type => 'data', value => 'username' },
            { type => '>' },
            { type => ']' },
            { type => 'data', value => '<br/>' },
            { type => '>' },
            { type => ')' },
            { type => '>' },
        );

        What's going to make life fun is the recursive nature of the thing, each line may have attrs which then may have multiple lines to describe them.

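One way to iterate over a structure like this is a small recursive walker that descends into any attribute whose value is itself a list of tokens. The sketch below assumes that nested attribute values are stored as array references and plain values as strings; the sample data, the `walk` name and the callback signature are all hypothetical.

```perl
use strict;
use warnings;

# Small hand-written sample in the shape discussed above (hypothetical):
# an attribute value is either a plain string or an array ref of tokens.
my @tokens = (
    { type => '[', value => 'link',
      attr => {
          action   => 'profile',
          username => [
              { type => '<',    value => 'd' },
              { type => 'data', value => 'username' },
              { type => '>' },
          ],
      } },
    { type => 'data', value => 'view' },
    { type => ']' },
);

# Visit every token depth-first, calling $cb->($token, $depth) for each one.
sub walk {
    my ( $toks, $cb, $depth ) = @_;
    $depth //= 0;
    for my $tok (@$toks) {
        $cb->( $tok, $depth );
        my $attr = $tok->{attr} or next;
        for my $name ( sort keys %$attr ) {
            # Only attribute values that are token lists need another level.
            walk( $attr->{$name}, $cb, $depth + 1 )
                if ref $attr->{$name} eq 'ARRAY';
        }
    }
}

my @seen;
walk( \@tokens, sub {
    my ( $tok, $depth ) = @_;
    push @seen, ( '  ' x $depth ) . $tok->{type};
} );
print "$_\n" for @seen;
```

The recursion depth follows the nesting of the attributes, so attributes inside attributes need no extra code.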
        ... then process line by line?

        As a proper data structure, not plain text. A data structure has, well, structure. You need that structure to identify which types of tokens you have and what they mean.

        I tend to use objects for this, but an ad hoc hash will serve as well for your experiments.
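As a rough illustration of the "ad hoc hash" approach, and of where the stack comes in, a flat token list can be folded into a tree of hashes. The token format (`open`, `close`, `data`) and the node layout (`name`, `children`) are assumptions made for this sketch, not anything the aXML code is known to use.

```perl
use strict;
use warnings;

# Hand-written token list for <use><qd>action</></> (hypothetical format).
my @tokens = (
    { type => 'open', value => 'use' },
    { type => 'open', value => 'qd' },
    { type => 'data', value => 'action' },
    { type => 'close' },
    { type => 'close' },
);

# Fold the flat list into a tree; the stack holds the currently open nodes.
sub build_tree {
    my (@toks) = @_;
    my $root  = { name => 'ROOT', children => [] };
    my @stack = ($root);
    for my $tok (@toks) {
        if ( $tok->{type} eq 'open' ) {
            my $node = { name => $tok->{value}, children => [] };
            push @{ $stack[-1]{children} }, $node;    # attach to current parent
            push @stack, $node;                       # and descend into it
        }
        elsif ( $tok->{type} eq 'close' ) {
            die "unbalanced close\n" if @stack == 1;
            pop @stack;                               # back to the parent
        }
        else {
            push @{ $stack[-1]{children} }, $tok->{value};    # plain text leaf
        }
    }
    die "unclosed tag: $stack[-1]{name}\n" if @stack > 1;
    return $root;
}

my $tree = build_tree(@tokens);
print $tree->{children}[0]{children}[0]{name}, "\n";    # innermost tag name
```

With a tree in hand, evaluating the innermost tags first is a matter of recursing into `children` before running the handler for the node itself.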

        >Big O

        Yup, I get that, that's why the system will spit out "hello world" in like 0.03 seconds, but take 0.22 to produce the 400 line output for the forum sections page for instance. More data = more computing = slower result.

        I'm afraid you're not getting it at all.

        Complexity is not about "there's more input, so it will take longer", it's an indication of how it will scale. Does it scale linearly, quadratically, logarithmically, etc.
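To make the difference between "takes longer" and "scales badly" concrete, the sketch below counts characters examined by two made-up strategies for stripping `<t>...</>` tags. The counters are a crude cost model (the rescanning version charges the number of characters skipped before each match), and neither function is claimed to be how any real template engine works.

```perl
use strict;
use warnings;

# Strategy A: restart the search from the beginning after every replacement.
sub rescan_visits {
    my ($n) = @_;
    my $doc    = '<t>x</>' x $n;
    my $visits = 0;
    while (1) {
        my $pos = index( $doc, '<t>' );    # always starts at offset 0
        last if $pos < 0;
        $visits += $pos + 1;               # characters passed to reach the match
        substr( $doc, $pos, 7, 'x' );      # replace "<t>x</>" with its content
    }
    return $visits;
}

# Strategy B: look at every character exactly once.
sub single_pass_visits {
    my ($n) = @_;
    my $doc = '<t>x</>' x $n;
    my ( $visits, $out, $in_tag ) = ( 0, '', 0 );
    for my $ch ( split //, $doc ) {
        $visits++;
        if    ( $ch eq '<' ) { $in_tag = 1 }
        elsif ( $ch eq '>' ) { $in_tag = 0 }
        elsif ( !$in_tag )   { $out .= $ch }
    }
    return $visits;
}

for my $n ( 10, 100, 1000 ) {
    printf "%5d tags: rescan %7d, single pass %5d\n",
        $n, rescan_visits($n), single_pass_visits($n);
}
```

Under this model the rescanning version does n(n+1)/2 units of work and the single pass does 7n, so for 10 tags the "worse" algorithm is actually cheaper, while for 1000 tags it does roughly seventy times more work. That crossover is the point of the corollary about expected data size.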

Re: snakes and ladders
by SuicideJunkie (Priest) on Aug 25, 2011 at 13:19 UTC
    Coupled with that I also tend to take the words of strangers as being correct and knowledgeable, a personal bias which I am slowly overcoming as I discover more and I am therefore better able to separate the crap-talkers from those with real insight. Man it came as a shock to me just how many people actually talk out of their arses as if they are all knowing... but that's people I guess.

    In case you haven't heard of it, see Sturgeon's Law: 90% of everything is crap. Including 90% of what you and I write.

    The trick to not looking like a fool is to realize when what you've written is crap as you hit the preview button, rather than just as you hit the create button like so many new members :)

    I suspect the trick to looking wise is to realize the 4 in 5 cases* when what you've edited is still crap, and then deciding not to post it after all.

    (*) 0.9 * 0.9 = 0.81, or roughly 4 out of 5

      In case you haven't heard of it, see Sturgeon's Law: 90% of everything is crap.

      When I say that about CPAN, there's always a large mob of Perlmonks ready to lynch me.

        No synonym of platitude is "reproducible and statistically accepted evidence".

        Put another way, the barriers to uploading with PAUSE are sufficiently high that the awful on CPAN looks less than 50% to me.

      Yes, and I have noticed that statues always look wiser than real men. At home it's my salt shaker that hits the jackpot, never says a word, the wisest one.

Re: snakes and ladders
by armstd (Friar) on Aug 25, 2011 at 15:34 UTC

    "The definition of insanity is trying the same thing over and over and expecting different results." Not giving up...doesn't mean just try the same thing again.

    Eating humble pie... means acknowledging that people were trying to help you with constructive criticism, putting time and effort into writing something to help you the best way they knew how. It means admitting that by ignoring their attempts to help, maybe you were disrespecting their experience and effort. So maybe that in turn earned some unfriendly responses, as is the nature of things.

    It means looking back and really considering what was said from others' perspectives, which I really don't see here. That's where "egotistical" comes from. Refusing to acknowledge others' perspectives might be different than yours. Perhaps, just perhaps, "you got as good as you gave" instead of just "you gave as good as you got". Maybe, just maybe, others took your comments, and your ignoring their attempts to help, as disrespectful and were insulted by you.

    Being humble and knowing your place is not about doing what you're told and blindly accepting the criticism of others as correct. It's about giving due consideration as to whether it might be correct.

    When they say "I learned this crap in CS101 and it worked for me!", then instead of considering that as an insult to your degree-less status, you could consider that "I should find a CS101 class and get the benefit they're suggesting they did and it will help me". Lots of schools offer night classes, part-time, non-matriculated, web-delivered... seriously. Opportunities to learn what you need to know are out there. The same books used in those classes are available in local libraries, bookstores, Safari, etc.

    Perlmonks is not going to teach you everything you need to solve your issues. That's impossible. At best we can only hope to give an example or two and hope to pique your interest enough that you go off and learn what you really need in a more constructive environment, like a classroom or a library. There's no silver bullet here.

    Sure, some of the feedback you got was less constructive than other feedback. That will always be the case. If you can't be humble enough to just ignore it, and focus on getting what you really came for, then you're the one that loses in the end.

    In all of this post, I don't see you taking any responsibility at all for possibly misunderstanding others' intentions before. Or insulting and disrespecting others' knowledge and experience from having done such things before. You only seem to apologize for being new, for not knowing what you need to know, for not kissing more arse, for being indignant at others' insults. For being a hapless victim, and not an instigator. That is egotistical, that is not humble.

    So, my best advice, the one thing that will help you the most here, more than any technical advice, is "learn to work with others." You will always get mixed responses, and if you cannot deal with criticism, advice you don't like, or even destructive responses, then you really shouldn't participate in any public forum. You will get all of the above. Always. Online or off. Suck it up, be humble, and deal with it. Focus on what you came for, and you might just get that too if you're lucky.

    Having written this, I must be insane. I know it's all been said before.

    --Dave

      Do you always expect the same result when talking with someone?

      Alfred Korzybski's manifest on General Semantics is precisely entitled "Science and Sanity".

        You make an excellent point.

        --Dave

Re: snakes and ladders
by emilbarton (Scribe) on Aug 25, 2011 at 18:03 UTC

    A troll is not something one can decide to become, it is a social function, and if you suffer from embodying this function, you should become aware of its natural necessity: without troll there would be no mob and thence no fun.

    So rejoice: it's not your fault. Accept: Troll pride!

    On the level of programming, I think that you could perhaps maintain a page somewhere on the web, where people could find an example of what your code does so anyone could understand your lament without having to scan the PerlMonks site for older messages (with dead links). But they told you already...

    Finally I can only insist on keeping indulgent for all the professional programmers who manifest their anger or their superiority at your imaginativeness: you might well be a misunderstood genius. Personally I have found a new information paradigm that would change the face of the world... Unfortunately I'm not sure, it could simply be working, without being too stupid nor too useful, nor too clever either, and it's really badly written.

    Indulgence is required also because of the benefits you can draw from the situation: don't forget that social welfare is often forbidden to your employed contradictors: think social! In many countries, trollness can ensure a monthly rent that would at least, render life feasible. You see: there's always light! Do as I say, and you'll come to appreciate your condition: Troll solidarity!

      Imho, one shouldn't have to be a genius to receive common courtesy... anyhoo...

      If you want I'll give you the complete sourcecode to my forum system, it's too slow for a big audience (unless you happen to have access to big iron...) but I'm hoping to fix that issue with this compiler.

      I'm also hoping to keep 100% compatibility with the aXML system declaration and presentation markup docs, and as an ambition I'm hoping to compile them into a well optimised PSGI app.

      I would be interested in seeing your heretical code, I'm not bothered at all by how good the implementation is, or how fast it is as long as it works correctly since faster & neater solutions can come later if the idea behind it is good and worth developing.

      peace.

        Thank you for your answer. My code is not yet to be shown to anyone (will it ever be?) but I'll work on it, I'll try to improve functions commonly expected in the field (data management) and one day, as soon as I feel confident enough, I'll ask for evaluation.

        I may seem a little sleepy, but I did not understand yet the ins and outs of your work. Is your system comparable to CMS, as Drupal or Moodle? What is your intention? Are there new kinds of web interactions that you wouldn't like to disclose yet (making thus dialog difficult)? Sorry if I'm asking you to repeat (in few words) things already said but it's not always clear information that one digs out from old threads.

        You could probably get a huge performance gain without changing any code—if it’s presently well behaved and scoped—just by going persistent with running it. This is mostly how everyone does it and why understanding prior art is so important and will save you much more time than it costs to learn. Template::Toolkit and Catalyst are both fairly slow for example. Catalyst is all but useless run as CGI and TT2 compiles and caches itself to improve performance and only resorts to recompiling from the original templates when they have changed. Even the check interval for changes is tunable.

        Run any one of your CGIs like so and visit it at port 8080 on whatever host–

        starman --listen :8080 -MPlack::App::WrapCGI \
          -e 'Plack::App::WrapCGI->new( script => "your.cgi" )'

        IMNSHO, one should not claim to be a genius even if he happens to be (and sorry I am pretty sure you are not) and if he/she does, he/she should expect his/her geniality to be put to a test. You came here preaching your uber-genial miracle without even being able to explain what the heck it is. What did you expect?

        Jenda
        Enoch was right!
        Enjoy the last years of Rome.

Re: snakes and ladders
by jdrago999 (Pilgrim) on Aug 25, 2011 at 22:23 UTC

    ...or I have to find an efficient solution to the compiler problem.

    If parsing the document is slow, then save the results. After you have parsed the document, store the "compiled" version on disk. Next time a request for the same document comes in, look and see if it has changed since the last time you "compiled" it. If it's newer, then recompile (of course) but if nothing has changed then use the pre-compiled version. You should get a good performance boost.

    ...but only if compilation is actually your performance bottleneck. You could take a look at Devel::NYTProf - a code profiler for perl - and it will help you see what parts of your code are taking longer than other parts. It takes some tinkering to master, but unless you use it (or another good profiler like the venerable Devel::DProf) you might only be guessing about the source of performance issues.
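The compile-once-and-reuse idea above can be sketched with core modules only. Everything specific here is an assumption for illustration: the `load_compiled` name, the `.cache` suffix, and the `$compile` callback, which stands in for whatever the real parser would be. The check uses file modification times, which have one-second granularity on many filesystems, so an edit made in the same second as the cache write could be missed.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Return the compiled form of $src_file, recompiling only when the source
# is newer than the cached copy. $compile is a caller-supplied code ref
# (hypothetical stand-in for the real parser); it must return a reference.
sub load_compiled {
    my ( $src_file, $compile ) = @_;
    my $cache_file = "$src_file.cache";    # assumed naming scheme

    my $src_mtime = ( stat $src_file )[9];
    defined $src_mtime or die "cannot stat $src_file: $!\n";
    my $cache_mtime = ( stat $cache_file )[9];    # undef if no cache yet

    if ( defined $cache_mtime && $cache_mtime >= $src_mtime ) {
        return retrieve($cache_file);    # cache hit: skip parsing entirely
    }

    open my $fh, '<', $src_file or die "cannot open $src_file: $!\n";
    my $text = do { local $/; <$fh> };
    close $fh;

    my $compiled = $compile->($text);
    store( $compiled, $cache_file );
    return $compiled;
}

# Demo: three loads of an unchanged file trigger a single compile.
my $dir = tempdir( CLEANUP => 1 );
my $src = "$dir/page.axml";
open my $out, '>', $src or die "cannot write $src: $!\n";
print {$out} '<use>x</>';
close $out;

my $calls   = 0;
my $compile = sub { $calls++; return { length => length $_[0] } };
load_compiled( $src, $compile ) for 1 .. 3;
print "compiled $calls time(s)\n";
```

In a persistent process the result could also be kept in memory between requests, which avoids even the `retrieve` call; whether either step is worth it depends on what the profiler says.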

      Yah man, the difficulty I'm finding is in writing the compiler. As Larry says, programs that write programs are happy programs and I like that concept a lot. It really doesn't matter how long the compiler takes to run as it will only be run once per source code update.

      I think I've also found a much faster way to parse the documents which I'm also working on.

      I would like to end up with an aXML > PSGI compiler such that even the noobiest of nooby noobs who ever did any noobying around a Perl script can use it to write simple declarative apps and enjoy immense performance out of the box. It might be a while before that happens, and something else might come along in the meantime but yeh that's where this seems to be going atm.
