PerlMonks  

Vote-based porn detection

by Sixtease (Friar)
on Dec 30, 2009 at 09:15 UTC

Sixtease has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I run a site where I would like to add ads from Google AdSense. They require the site to not contain pornography. However, I host contests in creating images and I want to let the contestants send whatever they please.

So I thought I'd just let people flag their own and others' creations as porn or clean and, based on the flags, determine whether a user-submitted picture is porn or not. If it is, and the viewing user has chosen to see such content, Google ads would not be shown.

Firstly, do you think this is a good way to go?
Secondly, do you know of any such system already programmed, so I wouldn't need to reinvent the wheel?

This is exactly how I plan to do it:

Input

  1. An admin can hard-set the status of a submission to porn / clean.
  2. The author can flag the submission as porn / clean.
  3. The owner of the contest can flag a submission as porn / clean.
  4. A regular user can flag a submission as porn / clean.

Submission states
A submission can be:

  1. Clean
  2. Undecided
  3. Porn

Resolution

  • By default (with no flags), a submission is clean.
  • Flag from an admin overrides everything.
  • Porn flag from submission author or contest owner sets the status to porn.
  • Porn flags from regular users only set the status to undecided.
  • Clean flags from regular users together with porn flags from contest owner or submission author set the status to undecided.
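
The resolution rules above can be sketched roughly as follows. This is only a minimal illustration: the flag structure (a hash with `role` and `verdict` keys) and the name `resolve_status` are assumptions, and cases the rules leave open (several admin flags, an author and owner who disagree) are settled arbitrarily and marked in the comments.

```perl
use strict;
use warnings;

# Hypothetical data shape: each flag is { role => ..., verdict => 'porn' | 'clean' },
# where role is one of 'admin', 'author', 'owner', 'user'.
sub resolve_status {
    my @flags = @_;

    # A flag from an admin overrides everything.
    # Assumption: if several admin flags exist, the most recent (last) one wins.
    my @admin = grep { $_->{role} eq 'admin' } @flags;
    return $admin[-1]{verdict} if @admin;

    # grep in scalar context returns the number of matching flags.
    my $privileged_porn = grep {
        $_->{verdict} eq 'porn' && $_->{role} =~ /^(?:author|owner)$/
    } @flags;
    my $user_porn  = grep { $_->{role} eq 'user' && $_->{verdict} eq 'porn'  } @flags;
    my $user_clean = grep { $_->{role} eq 'user' && $_->{verdict} eq 'clean' } @flags;

    # A porn flag from the author or contest owner sets the status to porn,
    # unless regular users contest it with clean flags.
    # Assumption: one privileged porn flag is enough, even if the other
    # privileged party flagged the submission as clean.
    if ($privileged_porn) {
        return $user_clean ? 'undecided' : 'porn';
    }

    # Porn flags from regular users alone only reach undecided.
    return 'undecided' if $user_porn;

    # Default: no relevant flags means clean.
    return 'clean';
}
```

Calling `resolve_status()` with no flags returns `'clean'`, which matches the default in the first rule.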

When enough submissions are flagged and decided by admins, I could train a statistical model to decide porn / clean status based on non-admin flags.

...so, what do you think?

use strict; use warnings; print "Just Another Perl Hacker\n";

Replies are listed 'Best First'.
Re: Vote-based porn detection
by Corion (Patriarch) on Dec 30, 2009 at 09:27 UTC

    I'd base the "default" setting on the business decision whether you want to maximize the ads shown or minimize your risk. If you want to maximize the ads shown, show them when an image is in (undecided,clean), otherwise, only show ads when an image is clean.
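
    As a rough sketch, that business decision could be a single policy switch. The policy names `maximize_ads` and `minimize_risk` and the function `show_ads` are made up for illustration; the status strings are the three states from the question.

```perl
use strict;
use warnings;

# Which submission states are considered safe for ads under each policy.
# Policy names are hypothetical.
my %AD_SAFE = (
    maximize_ads  => { clean => 1, undecided => 1 },
    minimize_risk => { clean => 1 },
);

# Returns 1 if ads may be shown next to a submission with the given status.
sub show_ads {
    my ($status, $policy) = @_;
    my $allowed = $AD_SAFE{$policy} or die "unknown policy '$policy'";
    return $allowed->{$status} ? 1 : 0;
}
```

    Under either policy, a submission in the porn state never gets ads; the two policies differ only in how undecided is treated.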

    Based on the same risk/gain assessment, I'd modify the decision for the hierarchy of flags. For the "pure clean" approach, any contesting flag marks the image as "unclean".

    I wouldn't let the users contest the porn-opinion of the contest owners and authors. This could be abused by authors tagging all their images as "porn" to subvert ads being shown together with their images, but this could also be abused by users by tagging everything as "porn" to move images into the "unclean" state. I think the risk of authors not wanting ads weighs less than the risk of (anonymous?) users tagging everything as porn.

Re: Vote-based porn detection
by eric256 (Parson) on Dec 30, 2009 at 16:24 UTC

    What works will depend significantly on your user base. I've been in forums where voting just resulted in chaos but moderators worked well. Here voting works well and moderators would probably be an issue. So look at your user base and from that decide what will work best. I would definitely weight moderator and author votes much higher than user votes; maybe 10 unopposed votes from users would mark it unclean, where a 10/4 vote would make it undecided? There is probably some threshold in there that you will only find through trial and error, and it will all depend on your user base.
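
    A weighted tally along those lines might look like the sketch below. The weights, the threshold of 10, and the name `tally_status` are placeholders to be tuned by trial and error, and keeping a submission clean while it is below the threshold is an assumption.

```perl
use strict;
use warnings;

# Hypothetical weights: author and contest owner count ten times a regular user.
my %WEIGHT         = (user => 1, author => 10, owner => 10);
my $PORN_THRESHOLD = 10;    # placeholder, to be found by trial and error

# Each flag is { role => ..., verdict => 'porn' | 'clean' } (assumed shape).
sub tally_status {
    my @flags = @_;
    my %score = (porn => 0, clean => 0);

    # Unknown roles fall back to the weight of a regular user.
    $score{ $_->{verdict} } += $WEIGHT{ $_->{role} } // 1 for @flags;

    # Assumption: below the threshold the submission stays clean.
    return 'clean' if $score{porn} < $PORN_THRESHOLD;

    # Enough porn weight: unopposed means porn, any clean vote makes it undecided.
    return $score{clean} == 0 ? 'porn' : 'undecided';
}
```

    With these numbers, ten unopposed user flags or a single author flag mark a submission as porn, while a 10/4 split leaves it undecided.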

    I don't know if there is an existing solution, but you might even see about generalizing it more and making it a tag system, where clean/unclean are just two tags, and others could be like/dislike, favorite, colors/subjects/medium/etc. Could make for a very interesting system if everyone can tag photos and different user roles have different weights in tagging.


    ___________
    Eric Hodges
Re: Vote-based porn detection
by Sixtease (Friar) on Dec 31, 2009 at 07:54 UTC

    Thank you all for your helpful advice.

    @javafan: I was thinking the same but know no language-agnostic forum with such a concentration of people who are experts in all programming-related matters, who will assess a new problem quickly and precisely and who are very helpful. Besides, the site is built in Catalyst and I asked about existing solutions, which I would prefer in Perl.

    @eric256: Since my site is just starting up and we have about ten regular users at this moment, I'll use the user flags as reports for admin-review. I'll be gathering the flagging actions and when I have a usable data sample, I'll try to machine-learn something from it.
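
    As a first, very crude stand-in for the machine-learned model, one could tabulate how often admins ruled "porn" for each number of non-admin porn flags. The sample structure and the name `porn_rate_by_flag_count` are assumptions for illustration; this is plain counting, not a trained classifier.

```perl
use strict;
use warnings;

# Each sample pairs the number of non-admin porn flags on a submission
# with the admin's final verdict (assumed shape):
#   { porn_flags => 2, admin_verdict => 'porn' }
sub porn_rate_by_flag_count {
    my @samples = @_;
    my (%porn, %total);

    for my $s (@samples) {
        $total{ $s->{porn_flags} }++;
        $porn{ $s->{porn_flags} }++ if $s->{admin_verdict} eq 'porn';
    }

    # Fraction of admin-reviewed submissions judged porn, per flag count.
    return { map { $_ => ($porn{$_} // 0) / $total{$_} } keys %total };
}
```

    Once the table covers enough reviewed submissions, the flag count at which the rate climbs steeply would be a candidate for an automatic threshold.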

    If it turns out possible and worthwhile, I'll try to make a general tool out of this when it's ready.

    Thanks to all for the insight once again... and happy new year! :-)

    use strict; use warnings; print "Just Another Perl Hacker\n";
Re: Vote-based porn detection
by Khen1950fx (Canon) on Dec 30, 2009 at 21:32 UTC
    For something trainable, how about Voting::Condorcet::RankedPairs?
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dump::Streamer;
    use Voting::Condorcet::RankedPairs;

    my $rp = Voting::Condorcet::RankedPairs->new( ordered_input => 1 );
    $rp->add('clean', 'porn', 0.7);
    $rp->add('null', 'undecided', 0.6);
    my @winners = $rp->strict_winners;
    my @results = $rp->strict_rankings;
    my $graph = $rp->graph;
    print $graph, "\n";
    print Dump $graph, "\n";
    Update: I got graphviz installed. This is more realtime:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use GraphViz::Data::Grapher;
    use Voting::Condorcet::RankedPairs;

    my $rp = Voting::Condorcet::RankedPairs->new( ordered_input => 1 );
    $rp->add('clean', 'porn', 0.9);
    $rp->add('null', 'undecided', 0.1);
    my (@winners) = $rp->strict_winners;
    my (@results) = $rp->strict_rankings;
    my $graph = $rp->graph;
    $rp->compile;
    print $graph, "\n";
    my $grapher = GraphViz::Data::Grapher->new(@winners);
    print $grapher->as_svg, "\n";
Re: Vote-based porn detection
by Fox (Pilgrim) on Jan 04, 2010 at 18:30 UTC
    your description reminded me of danbooru-style boards

    their rating system is pretty much what you asked.
    some of them, including danbooru itself, are open source, so you could look at the code and make your own perl version.
A reply falls below the community's threshold of quality. You may see it by logging in.
