PerlMonks  

Re^4: Everything2 github repository and being of value to perlmonks

by JayBonci (Curate)
on Mar 07, 2012 at 21:07 UTC ( [id://958365]=note )


in reply to Re^3: Everything2 github repository and being of value to perlmonks
in thread Everything2 github repository and being of value to perlmonks

It's pretty likely that I could (gently) scrape the site with displaytype=xml, transform the XML, and get most of the data needed to bootstrap a perlmonks clone in the same way as we do with E2. That would allow people who are so inclined to work locally or apply proper source control to the software on the site.

My concern is whether those in charge of PM would object to me doing so in an open-source fashion. It'd be great if this sister site could benefit from my tools and later performance improvements. I might also have to adjust the E2 core libraries a bit, but I'd imagine they're 90% similar if you exclude the Everything::Experience bit. Is there a particular contact in the administration who would be most appropriate to direct this kind of proposal to?



    --jaybonci

Re^5: Everything2 github repository and being of value to perlmonks (security of obscurity)
by tye (Sage) on Mar 07, 2012 at 22:37 UTC

    I don't mind the idea of scraping out the code. Such a scraper should be careful to wait at least as long between requests as the last request took, so that it automatically slows down if the site becomes bogged down.
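
The throttling rule described above can be sketched roughly as follows. This is only an illustration, not anything either site runs: the function names, the `node_id` query parameter, and the `min_delay` floor are assumptions added for the example; only `displaytype=xml` comes from the thread.

```python
import time
import urllib.request
from urllib.parse import urlencode


def node_xml_url(base_url, node_id):
    """Build a URL asking for a node's XML view (node_id parameter is an assumption)."""
    return base_url + "?" + urlencode({"node_id": node_id, "displaytype": "xml"})


def http_get(url, timeout=30):
    """Fetch a URL and return the raw response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def polite_fetch_all(urls, fetch=http_get, sleep=time.sleep,
                     clock=time.monotonic, min_delay=1.0):
    """Fetch each URL in turn, pausing between requests.

    The pause is at least as long as the previous request took, so a
    slow (overloaded) server automatically gets a slower scraper.
    min_delay is a hypothetical floor so fast responses still get a gap.
    """
    results = []
    for i, url in enumerate(urls):
        start = clock()
        body = fetch(url)
        elapsed = clock() - start
        results.append(body)
        if i < len(urls) - 1:  # no need to wait after the final request
            sleep(max(elapsed, min_delay))
    return results
```

The `fetch`, `sleep`, and `clock` parameters exist only so the pacing logic can be exercised without touching the network; a real run would leave them at their defaults.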

    I have been personally against making the source code too widely available because the security design is far from stellar and we have had real instances of people getting access to the source code and then using such information to construct attacks against the site.

    The counter argument would be that "surely, making the code widely available would greatly increase the speed with which security problems can be noticed and addressed". Unfortunately, my experience is that giving somebody access to the PerlMonks code has a roughly zero percent chance of them contributing anything to said code.

    Surely, some of the reason for such poor historical return on providing access is due to the quirky (at least!) manner in which the code can be viewed and the significant impediments to contribution. And certainly some of those would/might be addressed by the proposed new method of dissemination.

    But I think there would still be significant impediments to effectively understanding the code and I don't yet see any clear route to this providing significant improvements to effective contribution.

    So my personal assessment is that the likely result would be increased risk to the site.

    However, there has been no effective progress on, for example, creating a "tinkers" group so I find it hard to justify blocking a potential improvement in maintainability given the pronounced stall in the status quo.

    I'd welcome other opinions, particularly on my security concerns... especially from people who actually have a good clue about the security risks of PerlMonks (rare as such people probably are).

    But I think things have dragged on long enough that I would not block such a scheme. I'll just stand by my prediction (which I hope will be proven wrong) on the down side and resign myself to "I told you so" if it comes to that.

    Doing the work to troll the logs for missed exceptions and then actually implementing the "white list" (to replace the "black list") before such a release would make me feel much better about it.

    - tye        

      Tye,

      What I'd like to propose is that we are "in this together". I'm now the owner of E2, so I share real financial liability for any security holes that ecore code in general might possess, plus we are going to share the same classes of problems. I'm actively developing engine improvements, including security, and I'm hoping that you can benefit from my work. On my end, I could use the extra hands in making things work and reviewing changes that go in; my site is writers, not coders.

      My proposal for the path forward looks like this: I'm about to sign up at github for a recurring private repository setup because I don't want to be in the business of providing and maintaining my source control infrastructure, and I need at least one private repository for my configuration information. I need more than one contributor for my primary team, so it's only a $5/mo jump to a 10 collaborator plan with 20 repositories.

      We can work out a trusted team from your group to go through things in the private repos and really prep them (with the ecore tools), and design a path forward to get to a place where you feel comfortable that the code is secure. Ideally we can find a way to merge the two engine bases again and move forward from there.

      I'm hoping to reduce barriers to contribution by pre-packaging the development environment with Vagrant, hopefully sharing the same Chef recipes as production, only pared down.

      Lastly, by trolling the logs, do you mean checking the everything.errlog (or its equivalent), and making sure that errors are squashed?

      Let me know how you feel about it, either here or over email.


          --jaybonci

        That all sounds good.

        Yes, I meant everything.errlog, but I was talking about trolling for something more specific. Long, long ago I implemented a whitelist of DB columns that can be automatically modified, because I find the blacklist approach hopelessly prone to security problems. I didn't actually switch to the whitelist code, but I did make it log whenever something was set via that mechanism. That way I could later use the log to find things that should be whitelisted (or be set by specific code instead), so that switching to the whitelist would not break some important but infrequently used feature.

        - tye        
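
A minimal sketch of the log-first migration tye describes, assuming a simple dict of column updates. The column names, logger name, and `enforce_whitelist` flag are all hypothetical, and this version logs only columns missing from the whitelist rather than every automatic set, which is a simplification of what the post describes.

```python
import logging

log = logging.getLogger("column_audit")  # hypothetical logger name

# Hypothetical column names, for illustration only.
ALLOWED_COLUMNS = {"title", "doctext"}      # the new whitelist
BLOCKED_COLUMNS = {"passwd", "experience"}  # the old blacklist


def filter_updates(updates, enforce_whitelist=False):
    """Return the column updates that may be applied automatically.

    In audit mode (the default) the old blacklist still decides, but any
    column missing from the whitelist is logged so it can be reviewed.
    With enforce_whitelist=True only whitelisted columns pass.
    """
    accepted = {}
    for column, value in updates.items():
        if column in BLOCKED_COLUMNS:
            continue  # rejected under both schemes
        if column not in ALLOWED_COLUMNS:
            log.warning("column %r set automatically but not whitelisted", column)
            if enforce_whitelist:
                continue
        accepted[column] = value
    return accepted
```

Running in audit mode for a while and reviewing the warnings shows which columns need to be added to the whitelist (or handled by specific code) before flipping `enforce_whitelist` on.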

      ... a roughly zero percent chance ...

      Very roughly indeed. ;-)
