Indeed. That's pretty similar to the ideas I had -- "E.g. grab a random image, process the image with a filter to reduce it to just points of a particular color or hue; or maybe use a Conway's Life type process to manipulate the pixels until groups of similar hues reduce to single points; or a dozen other ideas; and then use those points as my dataset." -- triggered by roboticus' post.

However, it turns out to be rather more difficult than I imagined.

I thought of two ways to tackle this:

Try to derive the points for my test data directly from the randomly chosen images.
It is fairly easy to manually pick and apply a few filters to any given image to reduce it to a bunch of discrete pixels -- converting to grey scale, then explosion followed by embossing works well for many images; as does repeatedly applying a high filter until the number of non-black pixels is reduced to a usable number -- but finding a single sequence of filters that produces good datasets from a wide range of images is very hard.
And even when doing this manually, it is surprising how often, once you have succeeded in reducing the image to discrete pixels, they end up being pretty uniformly distributed.
Use the color or luminance or hue of the images to weight the picking of 'random' pixels.
This is also quite hard to do other than via the rejection method -- pick a random pixel and reject it if the chosen attribute is above or below some cut-off value -- which can be very time consuming, because every rejected pick is wasted work.
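A minimal sketch of that rejection approach, in Python rather than Perl and not the code I actually ran. It assumes the chosen attribute has already been extracted into a list of rows of numbers; `pick_by_rejection`, `cutoff` and `max_tries` are names made up for the illustration, and it only shows the "reject below the cut-off" case.

```python
import random

def pick_by_rejection(weights, cutoff, n, rng=None, max_tries=1_000_000):
    """Pick n (x, y) pixels whose attribute value is at least `cutoff`.

    `weights` is a list of rows (hypothetical layout: weights[y][x]).
    """
    rng = rng or random.Random()
    height, width = len(weights), len(weights[0])
    picks = []
    tries = 0
    while len(picks) < n:
        tries += 1
        if tries > max_tries:
            # Guard against a cutoff that no pixel can satisfy.
            raise RuntimeError("too many rejections; is the cutoff too high?")
        x = rng.randrange(width)
        y = rng.randrange(height)
        if weights[y][x] >= cutoff:  # accept; anything else is rejected
            picks.append((x, y))
    return picks
```

The cost is visible in the loop: when only a small fraction of pixels pass the cut-off, most iterations produce nothing.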
The only other method I came up with was to construct a 'weight stick'. E.g., say this represents the 2D weight map:
```
+--+--+--+--+--+--+--+--+--+--+
| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 5| 5| 4| 3| 3| 2| 2| 1| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 5|10| 8| 6| 5| 4| 3| 1| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 3| 6| 5| 5| 5| 5| 4| 2| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 1| 2| 3| 4| 5| 6| 6| 3| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 0| 1| 2| 3| 5| 5| 4| 3| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 0| 0| 1| 2| 4| 3| 3| 2| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 0| 0| 0| 1| 2| 1| 2| 1| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
+--+--+--+--+--+--+--+--+--+--+
```
Then I build a 1D vector in which each pixel's coordinate pair is repeated as many times as its weight:
```
([0,0])x 0, ([1,0])x 0, ([2,0])x 0, ...
([0,1])x 0, ([1,1])x 5, ([2,1])x 5, ([3,1])x 4, ([4,1])x 3, ...
([0,2])x 0, ([1,2])x 5, ([2,2])x 10, ([3,2])x 8, ...
...
```
(I packed these into a scalar to save space.)
Now, to pick pixels, I randomly index into the vector and get one value for every pick. The picking is fast, but the construction is relatively slow. And the higher the range of weight factors, the more memory it takes and the longer it takes to construct, but it works very well.
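To make the idea concrete, here is a minimal sketch of the weight stick in Python (not the Perl I used). It assumes non-negative integer weights stored as a list of rows, and it keeps plain `(x, y)` tuples in a list instead of packing them into a scalar; `build_weight_stick` and `pick_pixels` are illustrative names only.

```python
import random

def build_weight_stick(weights):
    """Flatten a 2D weight map into a list that holds each (x, y)
    once per unit of weight, so its length is the sum of all weights."""
    stick = []
    for y, row in enumerate(weights):
        for x, w in enumerate(row):
            stick.extend([(x, y)] * w)  # weight 0 contributes nothing
    return stick

def pick_pixels(stick, n, rng=None):
    """Pick n pixels; each pick is a single random index into the stick."""
    rng = rng or random.Random()
    return [stick[rng.randrange(len(stick))] for _ in range(n)]
```

A pixel with weight 10 occupies ten slots and so is ten times as likely to be picked as a pixel with weight 1, while zero-weight pixels can never be picked. The trade-off is the one described above: the stick's size grows with the total weight, not with the number of pixels.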
Once I had this working, I was still not finding a good way to produce good weight maps from randomly chosen images. So then I decided to try and construct good weight maps randomly, but directly.
This took a little trial and error, but I've come up with a method that seems to work quite well. It's still somewhat crude and I need to iron out some edge cases, but I've posted the code below.

To generate the weight maps, I pick a few random points and pick a random weight for those points. Then I grade those high points out to the edges of the area in the x-axis. Then I grade those values to the strips of values created by the other points, or the edges in the y-axis.
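One possible reading of that description, sketched in Python. This is not the posted code: it assumes linear grading, zero weight at the edges of the area, and that overlapping peaks on the same row keep the larger value. `make_weight_map` and its parameters are hypothetical names.

```python
import random

def make_weight_map(width, height, n_peaks, max_weight, rng=None):
    """Build a height x width map of integer weights from random peaks."""
    if width < 1 or height < 2:
        raise ValueError("need width >= 1 and height >= 2")
    rng = rng or random.Random()
    rows = {}  # y -> weights of a row that contains at least one peak
    for _ in range(n_peaks):
        px, py = rng.randrange(width), rng.randrange(height)
        w = rng.randint(1, max_weight)
        row = rows.setdefault(py, [0.0] * width)
        for x in range(width):
            # Grade along x: from w at the peak down to 0 at the edge.
            span = px if x < px else (width - 1 - px)
            graded = w * (1 - abs(x - px) / span) if span else w
            row[x] = max(row[x], graded)
    # Treat the top and bottom edges as all-zero rows unless seeded.
    zero = [0.0] * width
    rows.setdefault(0, zero)
    rows.setdefault(height - 1, zero)
    ys = sorted(rows)
    grid = [None] * height
    for y0, y1 in zip(ys, ys[1:]):
        # Grade along y: blend each pair of neighbouring seeded rows.
        for y in range(y0, y1 + 1):
            t = (y - y0) / (y1 - y0)
            grid[y] = [round((1 - t) * a + t * b)
                       for a, b in zip(rows[y0], rows[y1])]
    return grid
```

The result is a list of rows of integers between 0 and `max_weight`, which is the shape the 1D vector construction described earlier needs.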

Drawn in grey scale, this produces weight maps like these: [three images], which I'm rather pleased with.

Once weight maps like these have been vectorised and then used to pick 1000 weighted-random pixels, the results look like these: [three images].

The results are everything I could have hoped for; though the current implementation leaves a lot to be desired -- especially the slowness of the vectorisation when a higher weight range is used. I'll probably have to move that process and the grading process into C to make this usable.

If you can see improvements to either the grading process -- which currently occasionally produces really bizarre effects for reasons I haven't tracked down -- or ways of speeding up the vectorisation without dropping into C, I'd be very interested to hear them.

The current code:

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^4: Randomly biased, random numbers. (A working solution) by BrowserUk
in thread Randomly biased, random numbers. by BrowserUk
