Vote-based porn detection

Sixtease has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I run a site where I would like to add ads from Google AdSense. They require the site to not contain pornography. However, I host contests in creating images and I want to let the contestants send whatever they please.

So I thought I'd just let people flag their own and others' creations as porn or clean and, based on the flags, determine whether a user-submitted picture is porn or not. If it was, and the viewing user preferred to see such content, Google ads would not be shown.

Firstly, do you think this is a good way to go?
Secondly, do you know of any such system already programmed, so I wouldn't need to reinvent the wheel?

This is exactly how I plan to do it:

Input

An admin can hard-set the status of a submission to porn / clean.
The author can flag the submission as porn / clean.
The owner of the contest can flag a submission as porn / clean.
A regular user can flag a submission as porn / clean.

Submission states
A submission can be:

Clean
Undecided
Porn

Resolution

By default (with no flags), a submission is clean.
Flag from an admin overrides everything.
Porn flag from submission author or contest owner sets the status to porn.
Porn flags from regular users only set the status to undecided.
Clean flags from regular users together with porn flags from contest owner or submission author set the status to undecided.
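The resolution rules above could be sketched roughly as follows. This is only a minimal illustration, not an existing module: the function name `resolve_status` and the keys `admin`, `author`, `owner`, `user_porn` and `user_clean` are made up for the example, and cases the rules leave open (for instance, clean flags from regular users alone) are assumed to fall back to the default, clean.

```perl
use strict;
use warnings;

# Hypothetical flag record (names are assumptions for this sketch):
#   admin, author, owner  => 'porn' or 'clean' (each optional)
#   user_porn, user_clean => number of regular-user flags of each kind
sub resolve_status {
    my (%f) = @_;

    # A flag from an admin overrides everything.
    if (defined $f{admin}) {
        return $f{admin} eq 'porn' ? 'porn' : 'clean';
    }

    # Count porn flags from the submission author and the contest owner.
    my $privileged_porn =
        grep { defined($_) && $_ eq 'porn' } @f{qw(author owner)};
    my $user_porn  = $f{user_porn}  // 0;
    my $user_clean = $f{user_clean} // 0;

    if ($privileged_porn) {
        # Author/owner say porn; clean flags from users contest that.
        return $user_clean ? 'undecided' : 'porn';
    }

    # Porn flags from regular users alone only reach "undecided".
    return 'undecided' if $user_porn;

    # Default: no flags, or nothing that contests "clean".
    return 'clean';
}
```

For example, `resolve_status(author => 'porn', user_clean => 2)` yields `undecided`, while `resolve_status(admin => 'clean', author => 'porn')` yields `clean` because the admin flag wins.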

When enough submissions are flagged and decided by admins, I could train a statistical model to decide porn / clean status based on non-admin flags.

...so, what do you think?

use strict; use warnings; print "Just Another Perl Hacker\n";

Replies are listed 'Best First'.
Re: Vote-based porn detection by Corion (Patriarch) on Dec 30, 2009 at 09:27 UTC
I'd base the "default" setting on the business decision of whether you want to maximize the ads shown or minimize your risk. If you want to maximize the ads shown, show them when an image is in (undecided, clean); otherwise, only show ads when an image is clean.

Based on the same risk/gain assessment, I'd modify the decision for the hierarchy of flags. For the "pure clean" approach, any contesting flag marks the image as "unclean".

I wouldn't let the users contest the porn-opinion of the contest owners and authors. This could be abused by authors tagging all their images as "porn" to subvert ads being shown together with their images, but this could also be abused by users by tagging everything as "porn" to move images into the "unclean" state. I think the risk of authors not wanting ads weighs less than the risk of (anonymous?) users tagging everything as porn.
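As a rough sketch of that trade-off, the ad decision can be a lookup keyed by policy. The policy names `maximize_ads` and `minimize_risk` and the function `show_ads` are invented for this illustration; the status names are the ones from the original post.

```perl
use strict;
use warnings;

# Statuses under which ads may be shown, per (hypothetical) policy name.
my %AD_SAFE = (
    maximize_ads  => { clean => 1, undecided => 1 },
    minimize_risk => { clean => 1 },
);

# Returns 1 if ads may be shown next to a submission with this status.
sub show_ads {
    my ($status, $policy) = @_;
    my $safe = $AD_SAFE{$policy}
        or die "unknown policy '$policy'";
    return $safe->{$status} ? 1 : 0;
}
```

Keeping the policy in one table means switching from the ad-maximizing to the risk-minimizing approach is a one-word change at the call site.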
Re^2: Vote-based porn detection by Gavin (Archbishop) on Dec 30, 2009 at 11:09 UTC
This may be of some interest: Porn Image filter.
Re: Vote-based porn detection by eric256 (Parson) on Dec 30, 2009 at 16:24 UTC
What works will depend significantly on your user base. I've been in forums where voting just resulted in chaos but moderators worked well. Here voting works well and moderators would probably be an issue. So look at your user base and from that decide what will work best.

I would definitely weight moderator and author votes much higher than user votes. Maybe 10 unopposed votes from users would mark it unclean, whereas a 10/4 vote would make it undecided? There is probably some threshold in there that you will only find through trial and error, and it will all depend on your user base.

I don't know if there is an existing solution, but you might even see about generalizing it more and making it a tag system, where clean/unclean are just two tags, and others could be like/dislike, favorite, colors/subjects/medium/etc. It could make for a very interesting system if everyone can tag photos and different user roles have different weights in tagging.

___________
Eric Hodges
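A minimal sketch of such role-weighted voting might look like this. The weights, the threshold of 10, and the rule that anything below the threshold or contested counts as undecided are all assumptions chosen to match the numbers above; the real values would have to be found by trial and error.

```perl
use strict;
use warnings;

# Each vote is a hashref: { role => 'user'|'author'|'moderator',
#                           flag => 'porn'|'clean' }
# Weights and threshold are illustrative assumptions, not tuned values.
sub tally_votes {
    my (@votes) = @_;
    my %weight    = (user => 1, author => 10, moderator => 10);
    my $threshold = 10;

    my %sum = (porn => 0, clean => 0);
    for my $vote (@votes) {
        my $w = $weight{ $vote->{role} }
            or die "unknown role '$vote->{role}'";
        exists $sum{ $vote->{flag} }
            or die "unknown flag '$vote->{flag}'";
        $sum{ $vote->{flag} } += $w;
    }

    # Nobody says porn: clean.
    return 'clean' if $sum{porn} == 0;

    # Enough unopposed porn weight: unclean.
    return 'porn' if $sum{porn} >= $threshold && $sum{clean} == 0;

    # Contested, or not enough weight yet.
    return 'undecided';
}
```

With these numbers, ten unopposed user votes (or a single author or moderator vote) mark a submission as porn, while a 10/4 split leaves it undecided.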
Re: Vote-based porn detection by Sixtease (Friar) on Dec 31, 2009 at 07:54 UTC
Thank you all for your helpful advice.

@javafan: I was thinking the same but know no language-agnostic forum with such a concentration of people who are experts in all programming-related matters, who will assess a new problem quickly and precisely and who are very helpful. Besides, the site is built in Catalyst and I asked about existing solutions, which I would prefer in Perl.

@eric256: Since my site is just starting up and we have about ten regular users at this moment, I'll use the user flags as reports for admin-review. I'll be gathering the flagging actions and when I have a usable data sample, I'll try to machine-learn something from it. If it turns out possible and worthwhile, I'll try to make a general tool out of this when it's ready.

Thanks to all for the insight once again... and happy new year! :-)

use strict; use warnings; print "Just Another Perl Hacker\n";
Re: Vote-based porn detection by Khen1950fx (Canon) on Dec 30, 2009 at 21:32 UTC
For something trainable, how about Voting::Condorcet::RankedPairs?

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Data::Dump::Streamer;
use Voting::Condorcet::RankedPairs;

my $rp = Voting::Condorcet::RankedPairs->new( ordered_input => 1 );

$rp->add('clean', 'porn', 0.7);
$rp->add('null', 'undecided', 0.6);

my @winners = $rp->strict_winners;
my @results = $rp->strict_rankings;

my $graph = $rp->graph;
print $graph, "\n";
print Dump $graph, "\n";
```

Update: I got graphviz installed. This is more realtime:

```perl
#!/usr/bin/perl

use strict;
use warnings;

use GraphViz::Data::Grapher;
use Voting::Condorcet::RankedPairs;

my $rp = Voting::Condorcet::RankedPairs->new( ordered_input => 1 );

$rp->add('clean', 'porn', 0.9);
$rp->add('null', 'undecided', 0.1);

my (@winners) = $rp->strict_winners;
my (@results) = $rp->strict_rankings;

my $graph = $rp->graph;
$rp->compile;
print $graph, "\n";

my $grapher = GraphViz::Data::Grapher->new(@winners);
print $grapher->as_svg, "\n";
```
Re: Vote-based porn detection by Fox (Pilgrim) on Jan 04, 2010 at 18:30 UTC
Your description reminded me of danbooru-style boards; their rating system is pretty much what you asked for. Some of them, including danbooru itself, are open source, so you could look at the code and write your own Perl version.
A reply falls below the community's threshold of quality. You may see it by logging in.

