PerlMonks  

A data selection problem(in3D).

by BrowserUk (Pope)
on Apr 06, 2017 at 08:21 UTC

BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

In this image the 62770 non-CYAN (rgb(255,0,255)) pixels above the red line constitute the dataset for this problem. They are ordered according to their HSV values, with the Y-axis being hsv-V (0..255) and the X-axis (0..255) being the hsv-S value within each of 31 256x256 hsv-H value frames distributed along the X-axis.

I can make the same dataset available as text (rgb & hsv triples) should anyone be interested; but it is too big to post here.

The fully populated hsv-H frames below the red line are for comparison and discussion only, and are not a part of the dataset.

The data are a uniquified (and subsetted; only hsv-H frames 30..60 inclusive) set of the pixel colors found in this image, picked for its aesthetics from the net.

(Using this code:

My goal is to reduce those 62770 pixel values to a coherent, progressive, aesthetically pleasing 'gold scale' (after 'greyscale') of 256 values running from black to white, but encompassing as much of the range (nebulous term) of the full dataset as possible. An altogether unsatisfactory description, but I haven't hit upon anything better yet.

If you look at the data pixels in the middle to lower right corners of the left-most frames, any gold color is quite to very dark, coming from the shadowed parts of the source image. And among the few pixels in the right-most frames, anything that isn't (close to) white is a quite bright, pale lemon yellow, coming from the reflective highlights in the original image.

I want to select 256 shades, tints and tones that progress from black, through dark gold to gold, to bright gold, to yellow then white. And after 4 or 5 days of playing I haven't a clue how to go about it. Any thoughts, speculations or suggestions gratefully received.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re: A data selection problem(in3D).
by pryrt (Prior) on Apr 06, 2017 at 14:10 UTC

    When doing a two-dimensional "s" shaped (sigmoid) curve, I often use a polynomial sigmoid*,**: f(x) = 3*x**2 - 2*x**3, 0 <= x <= 1. You can also do similar 2-d "s" curves with higher-order polynomials. The general procedure is to take the general n-th order polynomial function, then define n+1 boundary conditions; it's useful to take enough derivatives to help with your boundary conditions. Then solve those n+1 equations for the n+1 arbitrary parameters to complete your function. For example, the cubic "s" above is defined as

        f(x)  = a*x**3 + b*x**2 + c*x + d
        f'(x) = 3*a*x**2 + 2*b*x + 1*c

        f(0)  = 0     # start at 0
        f'(0) = 0     # want a flat slope
        f(1)  = 1     # end at 1
        f'(1) = 0     # but flat at this end, as well.

        solve:
        d = 0         # from f(0)
        c = 0         # from f'(0)
        b = 3, a=-2   # from f(1) and f'(1)

    ... and a quintic version would also define the second derivatives at 0 and 1 as 0. I've also done similar ones where I constrain the endpoints less, and instead move the center (so f(0.5) = 0.75 or f(0.8) = 0.5 or similar). One caveat is that it's really easy to accidentally define a function whose output goes beyond 0 or 1 while still inside the range of x, so sometimes I use inequalities to bound the output, though that gets more difficult to solve.
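As a minimal sketch of the two polynomials described above (the sub names `smoothstep` and `smootherstep` are labels chosen here, not from the post): the quintic coefficients come from solving the six boundary conditions f(0)=0, f(1)=1 and zero first and second derivatives at both ends.

```perl
use strict;
use warnings;

# Cubic "polynomial sigmoid": f(0)=0, f(1)=1, f'(0)=f'(1)=0.
sub smoothstep {
    my ($x) = @_;
    return 3 * $x**2 - 2 * $x**3;
}

# Quintic variant: additionally f''(0)=f''(1)=0.
# Solving the six boundary conditions gives 6x^5 - 15x^4 + 10x^3.
sub smootherstep {
    my ($x) = @_;
    return 6 * $x**5 - 15 * $x**4 + 10 * $x**3;
}

# Print a small table over 0..1 to compare the two curves.
for my $i (0 .. 10) {
    my $x = $i / 10;
    printf "%.1f  %.4f  %.4f\n", $x, smoothstep($x), smootherstep($x);
}
```

Both curves pass through (0.5, 0.5); the quintic simply spends longer near 0 and 1 before it turns.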

    This should be extensible into 3 dimensions:

        f(x,y) = a*x**2 + b*y**2 + c*x*y + d*x + e*y + g

        f(0,0) = 0
        f(1,1) = 1
        df/dx(0,0) = 0
        df/dy(0,0) = 0
        df/dx(1,1) = 0
        df/dy(1,1) = 0

    (I don't know what shape that would make... I haven't gone thru and solved it). You would need to define boundary conditions that made sense for your HSV.

    Actually, maybe it's simpler than that: maybe you just want a parameter "p", which maps through three separate functions

        p: 0..1

        H(p) = a*p**3 + b*p**2 + c*p**1 + d
            I would recommend corners like:
                H(0)  = Hmin
                H'(0) = 0
                H(1)  = Hmax
                H'(1) = 0
            to get a similar "s" shape to 3*x**2 - 2*x**3,
            but scaled so the output goes from the min to the max you want

        S(p) = e*p**3 + f*p**2 + g*p**1 + h
            S(...) = Smin, 0, Smax, 0

        V(p) = j*p**3 + k*p**2 + m*p**1 + n
            V(...) = Vmin, 0, Vmax, 0

    If you change the p=0..1 domain to whatever matches your domain of your gold scale (ie, if you determine the shade of gold by the height on your screw, or at least the height between screw threads -- I forget which from your previous discussion -- then your p would be equal to your height, and you would just define the p corners as height-min and height-max instead of 0 and 1). Thus, with a linear progression thru your p-parameter, you should get a nice "s"-shaped progression for each of your H, S, and V.
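A rough sketch of that parameterisation, assuming made-up corner values: the `%min` and `%max` numbers and the sub names are placeholders, not values from the dataset. Each channel reuses the cubic "s" scaled between its own min and max.

```perl
use strict;
use warnings;

# Hypothetical corners (H in degrees, S and V in 0..1); not from the dataset.
my %min = (h => 30, s => 1.0, v => 0.0);
my %max = (h => 60, s => 0.0, v => 1.0);

sub smoothstep { my ($x) = @_; return 3 * $x**2 - 2 * $x**3 }

# Scale the unit "s" curve so the output runs from $lo to $hi.
sub channel {
    my ($p, $lo, $hi) = @_;
    return $lo + ($hi - $lo) * smoothstep($p);
}

# Walk p linearly over 0..1 in $n steps and return [H,S,V] triples.
sub hsv_path {
    my ($n) = @_;
    my @path;
    for my $i (0 .. $n - 1) {
        my $p = $i / ($n - 1);
        push @path, [ map { channel($p, $min{$_}, $max{$_}) } qw(h s v) ];
    }
    return \@path;
}

my $path = hsv_path(256);
printf "%3d: H=%.1f S=%.3f V=%.3f\n", $_, @{ $path->[$_] } for 0, 64, 128, 192, 255;
```

One thing this makes visible: when all three channels share the same curve shape, the points lie on a straight line between the two corners, only traversed at a varying speed. Bending the path itself would need different boundary conditions per channel.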

    --

    *: technically, a true Sigmoid function will asymptotically approach 0 and 1 off toward infinite-x, and will be centered at x=0. This "polynomial sigmoid" was chosen back in Perlin's early days as a "good enough approximation" that was easy to compute/pre-compute on his old 80s computers.

    **: I think I first encountered either in my neural network class back in the 90s, or when looking into Perlin noise for fun sometime thereafter.

    EDIT 1: fixed the H,S,V cubics' last terms...

    EDIT 2: change the last term of f(x,y) from an f (confusing) to a g (unambiguously different than f(x))

      Ignore my f(x,y) suggestion: that gives a surface in 3d space. What you want is a line in 3d space. To get that, use the (x,y,z) = (H(p),S(p),V(p)) parameterization.

      pryrt Thank you very much indeed. Thanks to your leads; I've ended up finding the Logistic function, which with its tunable slope and rotational symmetry is exactly the function I was looking for, despite that I didn't know it existed :)

      I'm still getting to grips with how the tunable parameter(s) affect the curve; and I need to deal with the fact that at steep slopes, the y values stay hard at the bottom and top of the range for too long -- I think I'll need to fiddle with x-range to ensure that I start and finish at the beginning and end of the curves rather than well before and after; but I think I can handle that.
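One possible way to handle the start/finish issue, sketched under assumptions (the sub name `logistic01` and the choice of mapping p onto -1..1 are illustrative, not something settled in this thread): rescale the logistic so it is exactly 0 at p=0 and exactly 1 at p=1 for any steepness.

```perl
use strict;
use warnings;

# Standard logistic centred on 0 with steepness $k.
sub logistic {
    my ($x, $k) = @_;
    return 1 / (1 + exp(-$k * $x));
}

# Map p in 0..1 onto x in -1..1, then rescale so that the output is
# exactly 0 at p=0 and exactly 1 at p=1, whatever the steepness.
sub logistic01 {
    my ($p, $k) = @_;
    my $lo = logistic(-1, $k);
    my $hi = logistic( 1, $k);
    return (logistic(2 * $p - 1, $k) - $lo) / ($hi - $lo);
}

# Show how the steepness changes the curve at nine evenly spaced p values.
for my $k (2, 6, 12) {
    printf "k=%2d: %s\n", $k,
        join ' ', map { sprintf '%.3f', logistic01($_ / 8, $k) } 0 .. 8;
}
```

With a small k the result is close to a straight line; with a large k the curve still hugs 0 and 1 for much of the range, so the rescaling fixes the endpoints but not the long flat stretches.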



      Thanks. That looks like the way to go. I'll post whatever comes out of it. Thanks.



      Having found the Logistic function; played with the parameters and understood their effects, I went to apply it to my problem and realised that I had reached the wrong conclusion about how I needed to curve my way through the dataset.

      In a nutshell, rather than this parameterised curve, I need this one (produced by rotating the first left 90° and then flipping the image vertically.)

      I've proved (to myself) that I have no intuition for what maths is likely to produce such results; if you have any clues to this new requirement, it'd be a great help.



        Suggestion 1: given a parameter p that goes from -1 to 1, non-inclusive, you could do f(p) = tan(p*pi/2), but that doesn't give much control of the slopes. Given the logistic(p,L,K) = L / (1 + exp(-K*p)), you could do something like f(p) = tan( (2*logistic(p,L,K)-1)*pi/2 ), which allows some tuning... but I'm still not sure it's really tunable enough.
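A small sketch of that second formula, assuming L = 1 so that the tan() argument stays strictly inside (-pi/2, pi/2); the sub name `inverse_s` is a label chosen here. Core Perl has no tan(), so it is written as sin/cos.

```perl
use strict;
use warnings;

my $PI = 4 * atan2(1, 1);

sub logistic {
    my ($p, $L, $K) = @_;
    return $L / (1 + exp(-$K * $p));
}

# "Rotated S": shallow through the middle, steepening toward both ends.
# Assumes L = 1, which keeps the angle strictly inside (-pi/2, pi/2).
sub inverse_s {
    my ($p, $K) = @_;
    my $angle = (2 * logistic($p, 1, $K) - 1) * $PI / 2;
    return sin($angle) / cos($angle);
}

printf "%5.2f  %10.3f\n", $_ / 4, inverse_s($_ / 4, 1) for -8 .. 8;
```

The output is unbounded, so it would still need scaling into whatever range the H, S or V channel requires.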

        Even before this additional wrinkle, I was cogitating that maybe going the route of a Bezier curve would be the right way to go: given a set of k coordinates P_i for i=(0..k-1), where each coordinate is in n-dimensional space, you could use a (k-1)th-order Bezier to "get near" each of those k points (exactly hitting the two endpoints). For example, if your k=5 coordinates were the (H,S,V) for white, yellow, bright gold, dark gold, black -- then you could do a quartic Bezier (the wp article goes to cubic, but quartic would just be f(t) = (1-t)**4 * P0 + 4*(1-t)**3 * t * P1 + 6*(1-t)**2 * t**2 * P2 + 4*(1-t) * t**3 * P3 + 1*t**4 * P4). Or it might be easier to do a piecewise quadratic or cubic Bezier. The benefit of the various Beziers is that you can plop those points anywhere (you can make a circle out of four piecewise Beziers) to make highly arbitrary curves... and you can tune them to get wonderfully sharp slopes.
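The quartic formula above can be sketched as follows. The five control points are invented placeholders (H in degrees, S and V in 0..1), listed from black to white rather than white to black, and `bezier4` is a name made up for this example.

```perl
use strict;
use warnings;

# Hypothetical control points as [H, S, V]; placeholders, not from the dataset.
my @P = (
    [ 40, 1.00, 0.00 ],   # black
    [ 42, 0.90, 0.35 ],   # dark gold
    [ 45, 0.80, 0.75 ],   # bright gold
    [ 55, 0.60, 1.00 ],   # yellow
    [ 60, 0.00, 1.00 ],   # white
);

# Quartic Bezier: Bernstein weights 1,4,6,4,1 applied per channel.
sub bezier4 {
    my ($t, @pts) = @_;
    my $u = 1 - $t;
    my @w = ( $u**4, 4 * $u**3 * $t, 6 * $u**2 * $t**2, 4 * $u * $t**3, $t**4 );
    my @out = (0, 0, 0);
    for my $i (0 .. 4) {
        $out[$_] += $w[$i] * $pts[$i][$_] for 0 .. 2;
    }
    return \@out;
}

# 256 samples along the curve, t running linearly from 0 to 1.
my @scale = map { bezier4($_ / 255, @P) } 0 .. 255;
printf "%3d: H=%.1f S=%.3f V=%.3f\n", $_, @{ $scale[$_] } for 0, 128, 255;
```

Only the first and last control points are hit exactly; the three in between pull the curve toward themselves without necessarily lying on it.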

Re: A data selection problem(in3D).
by Anonymous Monk on Apr 06, 2017 at 10:37 UTC

    Hi,

    Do you have to select or can you generate?

    My idea is to use the color picker tool to pick the 5 colors you like from the image, and then generate the colors in between, with a total number of steps that adds up to 256,

    like this "black body" colormap used for the Photoshop fire effect

    code

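A minimal sketch of the "pick 5 colors, generate the rest" idea, assuming plain linear interpolation in RGB. The five stops are invented placeholder values rather than colors picked from the image, and `gradient` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Five hypothetical picked colours as [R,G,B]; placeholders only.
my @stops = (
    [   0,   0,   0 ],   # black
    [ 110,  75,  10 ],   # dark gold
    [ 212, 175,  55 ],   # gold
    [ 255, 240, 120 ],   # pale yellow
    [ 255, 255, 255 ],   # white
);

# Linearly interpolate through the stops, returning exactly $n colours
# with the first and last stops included.
sub gradient {
    my ($n, @s) = @_;
    my @out;
    for my $i (0 .. $n - 1) {
        my $pos = $i / ($n - 1) * $#s;     # position in "stop units"
        my $k   = int $pos;
        $k = $#s - 1 if $k > $#s - 1;      # keep the last step in range
        my $f   = $pos - $k;               # fraction between stops k and k+1
        push @out, [ map { int($s[$k][$_] * (1 - $f) + $s[$k + 1][$_] * $f + 0.5) } 0 .. 2 ];
    }
    return \@out;
}

my $g = gradient(256, @stops);
printf "%3d: %3d %3d %3d\n", $_, @{ $g->[$_] } for 0, 64, 128, 192, 255;
```

Every stop gets the same number of steps here; uneven step counts per segment would be a straightforward variation.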
      Do you have to select or can you generate?

      No, I don't have to select -- that image was essentially chosen at random (with some aesthetic judgement applied) from an image search for 'gold'; so its data is far from sacrosanct -- but I have tried generating a gold-scale and found it very difficult to do. My attempts so far have produced gradients with either distinct banding, or limited range (too much near black at the bottom and bright yellow near the top), or too strong a tint towards red or yellow throughout the gradient.

      Then I tried finding an image that contained a smooth transition from lowlite to hilite from which I could crop a smooth gradient; but they are either too short, resulting in banding when applied to a CG image; or they contain waves and distortions (due to surface irregularities or secondary reflections) that show up as non-linear transitions in CGs.

      That's when I hit on the idea of sampling an image (like the source image in the root node) and subsetting it 'somehow'.

      The problem (I think) is that, whether subsetting or generating, I need to transition through the relevant hsv space in 3 dimensions rather than 2; and probably not in a straight line either to boot.

      To explain, look at this image, which shows 3 views of the same dataset.

      • The top-left image is all the datapoints overlain in the hsv-H dimension; but drawn pre-ordered by their hsv-H values so that the left-hand end of the original H-frames view is 'deepest' in the Z-axis, which means only a few of the blacks in the top left corner come from the early frames and the whites in the bottom left come from the right-hand frames.
      • The lower left view overlays the hsv-V values.
      • And the upper right overlays the hsv-S values.

      What I think those views show is that I need to select or generate a path through that subset of the hsv gamut, traversing in a kind of S-shape. But an S-shape in all 3 dimensions!

      That is, starting at the lower left of the top-left view, proceeding mostly right for the first (say) 10% then somewhat steeper than 45 degrees across the middle, and then at a shallow angle for the last 10% or so into the top-right corner.

      But also, tracing a similarly shallow S-shape (diagonally) in both other dimensions at the same time. bottom-left to top-right in the bottom left view; and bottom-right to top-left in the upper-right view.

      But once again, I'm stuck for a good way to do that?


Re: A data selection problem(in3D).
by karlgoethebier (Monsignor) on Apr 06, 2017 at 13:48 UTC

    Mmh, another quickshot: Perhaps you should take a look at Gimp's Perl API and Gimp? Just an idea for further investigation/inspiration.

    «The Crux of the Biscuit is the Apostrophe»

    Furthermore I consider that Donald Trump must be impeached as soon as possible

Re: A data selection problem(in3D).
by coicles (Sexton) on Apr 08, 2017 at 01:14 UTC

    Maybe you could try something like this. First, this code creates a hash containing an entry for each unique color in an image. Then the colors are sorted by Luma value and broken into 17 groups of colors with similar Luma. Then, an average RGB triple is calculated for each group, weighted by the pixel count of each color in the input image. This produces a coarse gradient of 17 colors with ascending luma. A linear interpolation is then performed between these reference colors to get a gradient of 256 colors.

    I was lazy about two things in this code: The grouping of 17 colors is proportional to unique color count, rather than evenly distributing Luma differences. Also, a polynomial interpolation might be better than a linear one, although the linear result looks ok to me.

    Here is the gradient generated from the jpeg you linked:

    https://image.ibb.co/gMY085/grid.png

    I hope this helps.

    use strict;
    use GD;

    # GetReferenceColors returns a coarse gradient of colors typical
    # at several Luma levels
    sub GetReferenceColors {
        my ($img, $count) = @_;

        # build %indexes hash where the keys are the indexes of each unique color
        # and the values are the number of pixels counted for the color index
        my ($width, $height) = $img->getBounds;
        my %indexes;
        for my $y (0..$height-1) {
            for my $x (0..$width-1) {
                my $idx = $img->getPixel($x, $y);
                ++$indexes{$idx};
            }
        }

        # Build @colors array with one entry for each color
        # contains the color's RGB triple, its Luma (Y value) and pixel count
        my @colors;
        for my $idx (keys %indexes) {
            my @rgb = $img->rgb($idx);
            my $y = $rgb[0]*0.299 + $rgb[1]*0.587 + $rgb[2]*0.114;
            my $pixel_count = $indexes{$idx};
            push(@colors, { rgb => [@rgb], y => $y, count => $pixel_count });
        }

        # Sort @colors by ascending Luma value
        @colors = sort { $a->{y} <=> $b->{y} } @colors;

        # split @colors into $count groups, which overlap by one entry
        # calculate each group's average RGB value, weighted by pixel count
        # add each group's average [r,g,b] triple to @ref_colors.
        my @ref_colors;
        my $step = @colors / $count;
        for my $i (0..$count - 1) {
            my $start = int($i * $step);
            my $end = int(($i + 1) * $step);
            $end = $#colors if $end > $#colors;   # last group must not run past the array
            my $wsum = 0;
            my @csum = (0, 0, 0);
            for my $j ($start .. $end) {
                my $color = $colors[$j];
                my $weight = $color->{count};
                $wsum += $weight;
                for my $ci (0..2) {
                    $csum[$ci] += $color->{rgb}->[$ci] * $weight;
                }
            }
            for my $ci (0..2) {
                $csum[$ci] = int($csum[$ci] / $wsum + 0.5);
            }
            push(@ref_colors, \@csum );
        }
        return \@ref_colors;
    }

    # InterpolateColors interpolates between two [r,g,b] triples
    # by a weight factor between 0 and 1
    sub InterpolateColors {
        my ($ca, $cb, $pb) = @_;
        my @rgb;
        for my $i (0..2) {
            push(@rgb, int($ca->[$i] * (1 - $pb) + $cb->[$i] * $pb + 0.5));
        }
        return \@rgb;
    }

    # Builds an interpolated set of colors based on an image's
    # most commonly occurring colors at a series of brightness levels
    sub InterpolatePalette {
        my ($img, $count) = @_;

        # Build a 256-entry @gradient by linearly interpolating between
        # a set of 17 colors returned by GetReferenceColors
        my @gradient;
        my $ref_colors = GetReferenceColors($img, 17);
        for my $i (1..@$ref_colors-1) {
            my $c0 = $ref_colors->[$i-1];
            my $c1 = $ref_colors->[$i];
            for my $j (0..15) {
                my $p = $j/16;
                push(@gradient, InterpolateColors($c0, $c1, $p));
            }
        }
        return \@gradient;
    }

    # Read the input image from a file
    my $file = $ARGV[0] // '07_AH_Esfahan Gold 65-ab.jpg';
    my $img = GD::Image->newFromJpeg($file)
        or die "Cannot read '$file' as a JPEG";

    # Calculate an interpolated gradient from the image's dominant colors
    my $r = InterpolatePalette($img);

    # Create a new image for displaying the color gradient on a grid
    my $len = 20;
    my $width = 16*$len+1;
    my $grid = GD::Image->new($width, $width, 1);

    # Draw the grid
    my $background = $grid->colorResolve(0, 0, 0);
    $grid->filledRectangle(0, 0, $width, $width, $background);
    my $loc = 0;
    for my $color (@$r) {
        my $x = ($loc & 15) * $len;
        my $y = ($loc >> 4) * $len;
        ++$loc;
        my $c = $grid->colorResolve(@$color);
        $grid->filledRectangle($x+1, $y+1, $x+$len-1, $y+$len-1, $c);
    }

    # Save the result
    open(my $fh, '>:raw', 'grid.png') or die $!;
    print $fh $grid->png;
    close($fh) or die $!;

      First, this code creates a hash containing an entry for each unique color in an image. Then the colors are sorted by Luma value and broken into 17 groups of colors with similar Luma. Then, an average RGB triple is calculated for each group, weighted by the pixel count of each color in the input image. This produces a coarse gradient of 17 colors with ascending luma. A linear interpolation is then performed between these reference colors to get a gradient of 256 colors.

      First: thank you for your response and code.

      However, there are problems with that approach.

      • Weighting the colors chosen by their pixel counts biases the selection according to the amount of light and shade and the balance between light and shade within the source picture.

        The source picture is used to discover the range of tones and tints reflecting from the chosen material/surface, not their proportions.

        Once you apply the gradient to models, the proportions of light and shade are (need to be) dictated by the shape and lighting angles of the target model, not the source image.

      • Once you take away the weighting in the choice of the interpolation points, what you've effectively got is a straightforward linear interpolation through the colors present in the source image.

        The problem is that produces too wide a band of dark (near black) and light (near white) shades; and thus throws away too much of the primary shades that will dominate most(*) models.

      • By picking an average (weighted or otherwise) in the first and last groups, you guarantee to discard the darkest and lightest shades.

        That could be addressed by making 15 groups and then end-stopping with the darkest and lightest colors found; but that will tend to emphasise the final issue.

      • By interpolating (whether through rgb or hsv) between values present in the source range, you are likely to populate the gradient with colors that never appear in the source input, which unfortunately produces models that don't look right.

        This might be addressed by interpolating the between-chosen-points values and then going back to the dataset to find the 'nearest' value that exists there; but in my attempts, defining 'nearest' in a 3D space is fraught with problems, and inevitably results in uneven jumps in the gradient that stand out like sore thumbs when applied to a model.
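For reference, the "go back to the dataset" step might look like the sketch below. It assumes squared Euclidean distance in RGB as the definition of 'nearest', which is only one of several possible choices and is not claimed to avoid the uneven jumps described above. The dataset is a tiny made-up list and the sub names are hypothetical.

```perl
use strict;
use warnings;

# Squared Euclidean distance between two [R,G,B] triples.
sub dist2 {
    my ($p, $q) = @_;
    my $d = 0;
    $d += ($p->[$_] - $q->[$_])**2 for 0 .. 2;
    return $d;
}

# Return the dataset colour closest to $target (simple linear scan).
sub nearest {
    my ($target, $dataset) = @_;
    my ($best, $best_d);
    for my $c (@$dataset) {
        my $d = dist2($target, $c);
        ($best, $best_d) = ($c, $d) if !defined $best_d || $d < $best_d;
    }
    return $best;
}

# Tiny made-up dataset, for illustration only.
my @dataset = ( [ 0, 0, 0 ], [ 120, 80, 10 ], [ 212, 175, 55 ], [ 255, 255, 255 ] );
my $hit = nearest([ 200, 170, 60 ], \@dataset);
print "nearest: @$hit\n";
```

Swapping `dist2` for a distance computed in HSV or another color space changes which entry counts as nearest, which is part of why the choice is hard to get right.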


        It is true that the end points lose the darkest and lightest shades -- this is another point I was too lazy to address, but assumed you could rectify. That source image only has 200 unique colors in it, and a lot of them are non-gold shades of gray -- this image does not contain a rich color selection -- so interpolation into absent colors is necessary. And your first point (undesired patterns of brightness density) would be addressed by using brightness levels, instead of the densities of unique colors at a similar brightness, to group the "reference" colors (which I was also too lazy to do here).
        On a different note: have you thought about an approach using clustering (like k-means)? This is what I initially thought of when I saw this problem, but then I had a negative experience with a module called Image::DominantColors.
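Grouping by brightness level rather than by unique-color count might be sketched like this. It assumes equal-width luma bands and an unweighted average per band; both are choices made for this example, not something specified in the thread, and `refs_by_luma_band` is a hypothetical name. The luma weights are the same ones used in the code above.

```perl
use strict;
use warnings;

# Luma with the same 0.299 / 0.587 / 0.114 weights as the code above.
sub luma {
    my ($red, $green, $blue) = @_;
    return 0.299 * $red + 0.587 * $green + 0.114 * $blue;
}

# Average the unique colours falling into each of $bins equal-width
# luma bands; bands with no colours are returned as undef.
sub refs_by_luma_band {
    my ($colors, $bins) = @_;
    my (@sum, @n);
    for my $c (@$colors) {
        my $band = int(luma(@$c) / 256 * $bins);
        $band = $bins - 1 if $band > $bins - 1;
        $sum[$band][$_] += $c->[$_] for 0 .. 2;
        $n[$band]++;
    }
    return [ map {
        my $i = $_;
        $n[$i] ? [ map { int($sum[$i][$_] / $n[$i] + 0.5) } 0 .. 2 ] : undef
    } 0 .. $bins - 1 ];
}

# Made-up unique colours, for illustration only.
my @colors = ( [ 0, 0, 0 ], [ 10, 10, 10 ], [ 200, 160, 40 ], [ 255, 255, 255 ] );
my $refs = refs_by_luma_band(\@colors, 4);
print join(' | ', map { $_ ? "@$_" : 'empty' } @$refs), "\n";
```

Empty bands are a real possibility with a sparse palette, so a caller would need to skip or interpolate across them before building the final gradient.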
Re: A data selection problem(in3D).
by Anonymous Monk on Apr 06, 2017 at 13:57 UTC

    I think you over-complicate it, a lot. Did I get it right: you want a 1-to-1 map from 256 greyscale R=G=B values to some 256 "golden" RGB shades? So that you can simply map the images from your previous CUFP node and, behold, these screws are now golden? Is that intended (and only) use of this magic palette?

      Is that intended (and only) use of this magic palette?

      What relevance is that? Do you question what use every petitioner will put the solution to before you deign to help? Either you have something that will help (and are willing to help me); or you don't, and move on.

      As for complicated; if I wanted a half-arsed solution, I had (and demonstrated) that a week ago.

      But finding an approach to finding a 'proper' solution makes it an interesting problem. To me at least. I hoped others might also find some joy in considering the matter.

      But if joy is too much for you....



        Sorry. They were relevant questions. Is this image good enough for you? I made it with a couple of lines of Perl from one of those screws. It's indexed, so you can take a look at the palette, which, yes, was extracted from the gold ingots in your picture. If you intend to color grayscale images, then OK, it will work. If you meant something more complex (to me, something much too complex), though then I don't understand why you limit yourself to 256 shades, then OK, it won't work.

        Edit:

            pdl> ( $im, $palette ) = rimage( 'test2.png', { palette => 1 })
            pdl> $x = sequence( 255 )-> dummy( 0, 100 )-> byte
            pdl> $x-> wimage( 'grad.png', { palette => $palette })

        Actually, this "grad.png" is then an exact solution to the problem as stated in the OP:

        goal is to reduce those 62770 pixel values to a coherent, progressive, aesthetically pleasing 'gold scale' (after 'greyscale') of 256 values running from black to white, but encompassing as much of the range (nebulous term) of the full dataset as possible.
Re: A data selection problem(in3D).
by Anonymous Monk on Apr 06, 2017 at 21:30 UTC
    While you can always get good quality answers at our site, this is not a place to have others do your homework for you. You need to show that you have at least done a minimal amount of work on the problem; otherwise we are just allowing you to do the same thing as copying your classmates' answers. You are not doing yourself, your classmates or your teacher any favors by having us do your work for you.

      He he he he he! You made my day! (Me needing to do homework indeed. You certainly get a better class of ... hereabouts:)


