http://www.perlmonks.org?node_id=1015537


in reply to Re^3: Is there a module for object-oriented substring handling/substitution?
in thread Is there a module for object-oriented substring handling/substitution?

I seem to have caused some confusion with my previous post (sorry about that):

When I wrote about playing nice with page edits by humans, I did not mean actual interactivity while the script is running.
The script performs a self-contained, non-interactive operation: Pull the page source into a string, update specific values, submit the updated page source back to the server, exit.
The human editing happens in between multiple such runs of the script, on the wiki itself.
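
As a rough illustration of the update step in such a run (fetching and submitting the page are left out), here is a minimal Python sketch. The `| key = value` line format, the function name `update_values`, and the sample page are assumptions made up for this example; the thread does not say what the real values look like.

```python
import re

# Hypothetical value format: a line such as "| count = 41".
# The prefix (pipe, key, "=", spacing) is captured so it can be written
# back unchanged; only the value after it is replaced.
VALUE_LINE = re.compile(
    r"(?P<prefix>^\|[ \t]*(?P<key>\w+)[ \t]*=[ \t]*)(?P<value>[^\n]*)$",
    re.MULTILINE,
)

def update_values(page_source: str, updates: dict[str, str]) -> str:
    """Replace the values named in `updates`; leave every other character alone."""
    def replace(match: re.Match) -> str:
        key = match.group("key")
        if key in updates:
            return match.group("prefix") + updates[key]
        return match.group(0)  # not one of ours: keep the line as written

    return VALUE_LINE.sub(replace, page_source)

# One run of the script: text edited by humans between runs is preserved.
page = (
    "Intro text, ''edited by a human''.\n"
    "{{Infobox\n"
    "| name   = Example\n"
    "| count  = 41\n"
    "}}\n"
)
new_page = update_values(page, {"count": "42"})
```

Only the line for `count` changes; the human-written intro and the spacing of the other lines pass through untouched.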

Re^5: Is there a module for object-oriented substring handling/substitution?
by LanX (Saint) on Jan 26, 2013 at 22:08 UTC
    > I did not mean actual interactivity while the script is running.

    So my first statement still holds: you need to parse the content into a tree, manipulate some nodes, and export it again as markup.

    Wiki-syntax is nested, e.g. a table entry can be bold or a link!

    No "object-oriented substrings" needed!

    Cheers Rolf

      But then I could not guarantee that the output is exactly the same as the input outside of the specific values being updated.

      DOM-tree<-->wiki-syntax is a one-to-many relationship, for example each of the following is a valid wiki-syntax representation of a table row containing 3 cells with the contents A, B, and C:

      |-
      | A || B || C

      |-
      | A
      | B
      | C

      <tr> <td>A</td> <td>B</td> <td>C</td> </tr>

      Also, I doubt that writing a full-blown parser and serializer would result in less work and less code than a regex-based solution supported by linked substrings.
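
To make the one-to-many point concrete, here is a small Python sketch. It assumes MediaWiki-style table markup with the row marker on its own line; `row_cells` and `serialize` are hypothetical helpers written for this example, not part of any library. All three spellings of the row reduce to the same list of cells, so a serializer that sees only the cells has to pick one spelling and cannot reproduce the other two.

```python
import re

def row_cells(markup: str) -> list[str]:
    """Return the cell contents of one table row, ignoring how it was written."""
    if "<td" in markup:
        found = re.findall(r"<td[^>]*>(.*?)</td>", markup, re.DOTALL)
        return [cell.strip() for cell in found]
    cells = []
    for line in markup.splitlines():
        line = line.strip()
        if not line.startswith("|") or line.startswith("|-"):
            continue  # skip the row marker and anything that is not a cell line
        cells.extend(part.strip() for part in line[1:].split("||"))
    return cells

def serialize(cells: list[str]) -> str:
    """Write a row back out in one fixed spelling (all cells on one line)."""
    return "|-\n| " + " || ".join(cells)

same_line = "|-\n| A || B || C"
one_per_line = "|-\n| A\n| B\n| C"
html_row = "<tr> <td>A</td> <td>B</td> <td>C</td> </tr>"

# All three inputs carry the same information...
parsed = [row_cells(m) for m in (same_line, one_per_line, html_row)]
# ...but only one of them survives a parse/serialize round trip unchanged.
round_trips = [serialize(row_cells(m)) == m for m in (same_line, one_per_line, html_row)]
```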

        > But then I could not guarantee that the output is exactly the same as the input outside of the specific values being updated.

        :) Sure you could. While the HTML DOM doesn't guarantee the same representation, there is no reason your DOM couldn't. Hey, if PPI can do it, you can too :) Whitespace can be significant if you make it so.
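
One way to read this suggestion is sketched below in Python: keep delimiters and whitespace as tokens of their own, so that joining the tokens reproduces the input exactly, and change only the text token of the cell being updated. The token kinds and the names `tokenize`, `set_cell`, and `serialize` are assumptions for this example; this is not how PPI itself is implemented, only the same round-trip idea applied to a single table row.

```python
import re

# Every character of the input falls into exactly one token kind, so
# joining the token values reproduces the input character for character.
TOKEN = re.compile(r"(?P<delim>\|\||\|-|\|)|(?P<space>\s+)|(?P<text>[^|\s]+)")

def tokenize(markup):
    return [(m.lastgroup, m.group(0)) for m in TOKEN.finditer(markup)]

def serialize(tokens):
    return "".join(value for _, value in tokens)

def set_cell(tokens, index, new_text):
    """Replace the content of cell `index` (0-based), keeping the delimiters
    and the whitespace around the old content exactly as they were."""
    out, cell, i = [], -1, 0
    while i < len(tokens):
        kind, value = tokens[i]
        out.append((kind, value))
        i += 1
        if kind == "delim" and value != "|-":   # "|" or "||" opens a cell
            cell += 1
            if cell == index:
                j = i
                while j < len(tokens) and tokens[j][0] != "delim":
                    j += 1
                body = tokens[i:j]
                lead = body[:1] if body and body[0][0] == "space" else []
                rest = body[len(lead):]
                trail = rest[-1:] if rest and rest[-1][0] == "space" else []
                out += lead + [("text", new_text)] + trail
                i = j                            # skip the old cell content
    return out

source = "|-\n| A  ||B || C\n"                   # deliberately uneven spacing
tokens = tokenize(source)
unchanged = serialize(tokens)
updated = serialize(set_cell(tokens, 1, "Bee"))
```

The uneven spacing in `source` survives the update because it lives in tokens that `set_cell` never touches.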