PerlMonks

### Re^3: Randomly biased, random numbers.

by salva (Abbot) on Dec 07, 2013 at 09:27 UTC

in reply to Re^2: Randomly biased, random numbers.
in thread Randomly biased, random numbers.

That's hard. Mostly because I definitely do not want any formally defined distribution

Download random pictures from the Internet and use them as the base to generate density functions.

You may apply some simple transformations (for instance, dynamic range decompression) to obtain more disparate distributions.

### Re^4: Randomly biased, random numbers. (A working solution)
by BrowserUk (Pope) on Dec 07, 2013 at 15:43 UTC

However, it turns out to be rather more difficult than I imagined.

I thought of two ways to tackle this approach:

1. Try to derive the points for my test data directly from the randomly chosen images.

It is fairly easy to manually pick and apply a few filters to any given image to reduce it to a bunch of discrete pixels -- converting to grey scale, then explosion followed by embossing works well for many images; as does repeatedly applying a high filter until the number of non-black pixels is reduced to a usable number -- but finding a single sequence of filters that produces good datasets from a wide range of images is very hard.

And even when doing this manually, it is surprising how often, once you have succeeded in reducing the image to discrete pixels, they end up being pretty uniformly distributed.

2. Use the color or luminance or hue of the images to weight the picking of 'random' pixels.

This is also quite hard to do other than via the rejection method -- pick a random pixel and reject if the chosen attribute is above or below some cut-off value -- which can be very time consuming.
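For reference, a minimal sketch of rejection picking over a weight map. Note this uses the common randomised cut-off (accept a pixel with probability weight/max) rather than a fixed cut-off, so that acceptance is proportional to the weight. The map layout `$map->[$y][$x]` and the name `pick_by_rejection` are assumptions for illustration only.

```perl
use strict;
use warnings;

# Pick one pixel from a weight map ($map->[$y][$x]) by rejection sampling.
# $max must be >= the largest weight, and at least one weight must be > 0,
# otherwise the loop never terminates.
sub pick_by_rejection {
    my ( $map, $max ) = @_;
    my $h = @$map;
    my $w = @{ $map->[0] };
    while (1) {
        my $x = int rand($w);
        my $y = int rand($h);
        # Accept with probability weight/$max; otherwise try again.
        return ( $x, $y ) if rand($max) < $map->[$y][$x];
    }
}

# Hypothetical 3x3 map with most of the weight in the centre.
my @map = ( [ 0, 1, 0 ], [ 1, 10, 1 ], [ 0, 1, 0 ] );
my ( $x, $y ) = pick_by_rejection( \@map, 10 );
print "picked ($x, $y)\n";
```

The cost shows up when most of the map is low weight: nearly every candidate is rejected, which is the time problem described above.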

The only other method I came up with was to construct a 'weight stick'. E.g.

Say this represents the 2D weights map:

```+--+--+--+--+--+--+--+--+--+--+
| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 5| 5| 4| 3| 3| 2| 2| 1| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 5|10| 8| 6| 5| 4| 3| 1| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 3| 6| 5| 5| 5| 5| 4| 2| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 1| 2| 3| 4| 5| 6| 6| 3| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 0| 1| 2| 3| 5| 5| 4| 3| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 0| 0| 1| 2| 4| 3| 3| 2| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 0| 0| 0| 1| 2| 1| 2| 1| 0|
+--+--+--+--+--+--+--+--+--+--+
| 0| 0| 0| 0| 0| 0| 0| 0| 0| 0|
+--+--+--+--+--+--+--+--+--+--+

Then I build a 1D vector containing each pixel's coordinate pair repeated as many times as its weight:

```
([0,0])x 0, ([1,0])x 0, ([2,0])x 0, ...
([0,1])x 0, ([1,1])x 5, ([2,1])x 5, ([3,1])x 4, ([4,1])x 3, ...
([0,2])x 0, ([1,2])x 5, ([2,2])x 10, ([3,2])x 8, ...
...
```

(I packed these into a scalar to save space.)

Now, to pick pixels, I randomly index into the vector and get one value for every pick. The picking is fast, but the construction is relatively slow. And the higher the range of weight factors, the more memory it takes and the longer it takes to construct, but it works very well.
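A minimal sketch of the weight stick as described, not the code from the post. The tiny map, the `pack` template `S2` (two 16-bit unsigned values per coordinate pair) and the name `pick_from_stick` are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical 4x3 weight map, indexed as $map[$y][$x].
my @map = (
    [ 0, 1, 0, 0 ],
    [ 0, 5, 2, 0 ],
    [ 0, 0, 3, 1 ],
);

# Build the stick: each (x, y) pair is packed into 4 bytes and
# repeated as many times as its weight (zero weight adds nothing).
my $stick = '';
for my $y ( 0 .. $#map ) {
    for my $x ( 0 .. $#{ $map[$y] } ) {
        $stick .= pack( 'S2', $x, $y ) x $map[$y][$x];
    }
}

my $REC     = length pack( 'S2', 0, 0 );    # bytes per packed pair
my $entries = length($stick) / $REC;        # equals the sum of all weights

# One pick is a single random index into the stick, with no rejection.
sub pick_from_stick {
    my $i = int rand($entries);
    return unpack( 'S2', substr( $stick, $i * $REC, $REC ) );
}

my ( $x, $y ) = pick_from_stick();
print "picked ($x, $y) with weight $map[$y][$x]\n";
```

The stick's length is the sum of all the weights times the record size, which is why memory and build time grow with the range of weight factors.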

Once I had this working, I was still not finding a good way to produce good weight maps from randomly chosen images. So then I decided to try and construct good weight maps randomly, but directly.

This took a little trial and error, but I've come up with a method that seems to work quite well. It's still somewhat crude and I need to iron out some edge cases, but I've posted the code below.

To generate the weight maps, I pick a few random points and pick a random weight for those points. Then I grade those high points out to the edges of the area in the x-axis. Then I grade those values to the strips of values created by the other points, or the edges in the y-axis.
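One possible reading of that grading process, as a rough sketch and not the code from the post: each seed is graded linearly along its own row down to zero at the left and right edges, then every column is interpolated linearly between the seeded rows (and the top and bottom edges). The function name `make_weight_map`, the linear ramps, the use of the maximum where ramps overlap, and passing seeds as `[x, y, weight]` are all assumptions; the post picks its seeds at random.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Build a $w x $h weight map ($map->[$y][$x]) from seeds [x, y, weight].
sub make_weight_map {
    my ( $w, $h, @seeds ) = @_;
    my @map = map { [ (0) x $w ] } 1 .. $h;
    my %seeded;    # rows containing at least one seed

    # Step 1: grade each seed along its row, reaching 0 at both edges.
    for my $s (@seeds) {
        my ( $sx, $sy, $wt ) = @$s;
        $seeded{$sy} = 1;
        for my $x ( 0 .. $w - 1 ) {
            # Distance from the seed to the edge on this side of it.
            my $span = $x <= $sx ? $sx : $w - 1 - $sx;
            my $frac = $span ? 1 - abs( $x - $sx ) / $span : 1;
            $map[$sy][$x] = max( $map[$sy][$x], $wt * $frac );
        }
    }

    # Step 2: interpolate each column between neighbouring anchor rows
    # (the seeded rows plus the top and bottom edges).
    my %anchor = ( 0 => 1, $h - 1 => 1, %seeded );
    my @rows = sort { $a <=> $b } keys %anchor;
    for my $i ( 0 .. $#rows - 1 ) {
        my ( $top, $bot ) = @rows[ $i, $i + 1 ];
        for my $y ( $top + 1 .. $bot - 1 ) {
            my $t = ( $y - $top ) / ( $bot - $top );
            for my $x ( 0 .. $w - 1 ) {
                $map[$y][$x] = $map[$top][$x]
                    + $t * ( $map[$bot][$x] - $map[$top][$x] );
            }
        }
    }

    # Round to whole-number weights.
    for my $row (@map) {
        $_ = int( $_ + 0.5 ) for @$row;
    }
    return \@map;
}

# Two hypothetical seeds on an 11x11 area.
my $map = make_weight_map( 11, 11, [ 5, 5, 10 ], [ 2, 8, 6 ] );
print join( ' ', map { sprintf '%2d', $_ } @$_ ), "\n" for @$map;
```

Random seeds could be supplied with something like `[ int rand($w), int rand($h), 1 + int rand(10) ]`.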

Drawn in grey scale, this produces weight maps like these: [3 images], which I'm rather pleased with.

Once weight-maps like these have been vectorised and then used to pick 1000 weight-random pixels, the results look like these: [3 images].

The results are everything I could have hoped for; though the current implementation leaves a lot to be desired -- especially the slowness of the vectorisation when a higher weight range is used. I'll probably have to move that process and the grading process into C to make this usable.

If you can see improvements to either the grading process -- which currently occasionally produces really bizarre effects for reasons I haven't tracked down -- or ways of speeding up the vectorisation without dropping into C, I'd be very interested to hear them.

The current code:

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Use the color or luminance or hue of the images to weight the picking of 'random' pixels.

This is also quite hard to do other than via the rejection method

There is a much more efficient method. See here, and here.

The trick is to build a 1D array with the accumulated weights @acu. Then, pick random numbers ($r) in the range [0, $acu[-1]) and use binary search to look for the index $ix such that $acu[$ix] <= $r < $acu[$ix + 1].
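A minimal sketch of that idea, with assumed names (`find_index`, `pick_weighted`) and a made-up weight list. It uses a slightly different but equivalent formulation: @acu holds the running totals, and the search returns the smallest index whose running total exceeds the random number, so an entry with zero weight is never chosen.

```perl
use strict;
use warnings;

# Hypothetical flat list of weights (e.g. one image row, or the whole
# image flattened row by row).
my @weights = ( 0, 5, 5, 4, 3 );

# Accumulated weights: (0, 5, 10, 14, 17).
my @acu;
my $sum = 0;
push @acu, $sum += $_ for @weights;

# Binary search for the smallest index $ix with $acu->[$ix] > $r.
# Assumes 0 <= $r < $acu->[-1].
sub find_index {
    my ( $acu, $r ) = @_;
    my ( $lo, $hi ) = ( 0, $#$acu );
    while ( $lo < $hi ) {
        my $mid = int( ( $lo + $hi ) / 2 );
        if ( $acu->[$mid] > $r ) { $hi = $mid }
        else                     { $lo = $mid + 1 }
    }
    return $lo;
}

# One weighted pick: O(log n) per pick, memory independent of weight range.
sub pick_weighted {
    my ($acu) = @_;
    return find_index( $acu, rand( $acu->[-1] ) );
}

my $ix = pick_weighted( \@acu );
print "picked index $ix (weight $weights[$ix])\n";
```

For a 2D map flattened row by row with width $w, the pixel would be at x = $ix % $w, y = int($ix / $w). Unlike the weight stick, the array has one entry per pixel regardless of how large the weights are.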

That's nice. I'll have to try it on a random selection of images, but it is definitely interesting.

(I'm a bit confused why you are writing back the current pixel with the same color you just read, and then setting the color of the adjacent pixel one row down to the newly calculated color? Which means you'll be re-reading your new values when processing the next row.

And why x => $w + $i rather than y => $j + 1?)
