
Re^5: Using filepath method to identify an .html page

by Corion (Pope)
on Jan 22, 2013 at 15:52 UTC

in reply to Re^4: Using filepath method to identify an .html page
in thread Using filepath method to identify an .html page

I did not mention a "hex encoding" anywhere in my reply. I suggest you actually read my reply, and use a database, or a "key value store", and build a map of words to numbers.

Re^6: Using filepath method to identify an .html page
by Nik on Jan 22, 2013 at 15:55 UTC
    That would be complex.
    I just need a way to CONVERT a string (the absolute path) to a unique 4-digit number with INT!!! That's all I want!! But I cannot make it work :(

    And the best part is that "that" number must be able to turn back into a path.


    1. A user requests a specific HTML page (.htaccess gives my script the absolute path for that .html page).
    2. I turn the path into a 4-digit number.
    3. I store that number in the database. I DON'T EVEN HAVE TO STORE THE PATH IN THE DATABASE ANYMORE!!! This is just great!

      Maybe there is confusion about how you really want to go about this, because the spec as given is ludicrous. You have four digits to work with, which means 10,000 numbers available. Assuming only lowercase letters and three-character names, for example (like abc.html):

      perl -le 'print 26**3'
      17576

      You're already out of room with this most trivial example. Any real file names/paths will certainly not be able to fit in any translation scheme. The information must be *somewhere*. Your quest to save space by disappearing it seems magical. Pretty much everyone here is telling it to you straight.

      If there is still a misunderstanding, perhaps you could give a concrete example of input (and its range) and output you expect. If what you want is possible someone will help you.

        The third of Clarke's three laws is at play here:

        Any sufficiently advanced technology is indistinguishable from magic.

        "sufficiently advanced" is really a relative term. To one who understands it, it is not sufficiently advanced to be indistinguishable from magic. But to one who doesn't understand it, the line between reality and magic is obscured.

        Once technology becomes indistinguishable from magic, it becomes impossible to distinguish between what is possible and what is impossible. With magic everything should be possible, right? What we need to do is shed our understanding so that the technology we're discussing appears to us as magical as it does to someone who believes that through the magic of technology everything must be possible. Only then will we be able to come up with solutions based on the boundless nature of magic rather than the finite constraints of well-understood technology.


      So, what you want is a function foo() and its inverse foo'() that does this:

      foo( "some long string" )  -->  1234
      foo'( 1234 ) --> "some long string"


      In other words, you want to keep the entire information content of the original string.

      There are only two ways to do this:

      1. keep the original string, i.e. store it in a database of some sort, or
      2. lossless compression of the original string into a short number. Lossless compression exists, but not at the ratio you're asking for: a 4-digit number can only tell 10,000 different strings apart.

      Which leaves the first option, as plenty of people here have described.

      So: You've said you already have a database with a column that stores a 4-digit number. If that database table doesn't already have a column that stores the HTML page's absolute path, then add one. You'll also want a UNIQUE constraint on the column with the number (and/or the other column, depending on your database design). The rest is "just" SQL...
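The first option can be sketched in a few lines of Perl. This is only an illustration, not the poster's code: two in-memory hashes stand in for the database table, the sub names `path_to_id` and `id_to_path` and the example path are made up, and in a real table the UNIQUE constraints would do the job that the `exists` check does here.

```perl
use strict;
use warnings;

# Stand-in for the database table: two hashes kept in sync.
# In a real table, a UNIQUE constraint on each column would play this role.
my ( %id_for_path, %path_for_id );
my $next_id = 0;

# Return the existing 4-digit id for a path, or assign the next free one.
sub path_to_id {
    my ($path) = @_;
    return $id_for_path{$path} if exists $id_for_path{$path};
    die "out of 4-digit ids\n" if $next_id > 9999;
    my $id = sprintf '%04d', $next_id++;
    $id_for_path{$path} = $id;
    $path_for_id{$id}   = $path;
    return $id;
}

# The inverse is a plain lookup; it works only because the path was stored.
sub id_to_path {
    my ($id) = @_;
    return $path_for_id{$id};
}

# Hypothetical path, used only for the demonstration.
my $id = path_to_id('/var/www/html/index.html');
print $id, ' -> ', id_to_path($id), "\n";
```

The number carries no information about the path by itself; it is just a row label, which is why the path still has to be stored next to it.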

      How can you turn the "path" into a 4-digit number? That makes no sense. You maybe could generate a checksum on each path, but checksums are more than 4 digits long. Unless you use a large base, like base 1000 instead of base 10, but then you'd have to invent a bunch of characters.
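To see what happens if a checksum is cut down to four digits anyway, here is a small sketch. The choice of MD5 (via the core module Digest::MD5), the helper name `four_digit_checksum`, and the generated paths are all assumptions made for the demonstration. With 10,001 distinct paths and only 10,000 possible codes, at least one clash is guaranteed, and a checksum cannot be turned back into a path in any case.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Squeeze an MD5 checksum down to a 4-digit number (hypothetical helper).
sub four_digit_checksum {
    my ($path) = @_;
    # First 8 hex digits of the digest as an integer, reduced modulo 10,000.
    return sprintf '%04d', hex( substr( md5_hex($path), 0, 8 ) ) % 10_000;
}

# 10,001 distinct made-up paths cannot fit into 10,000 slots without a clash.
my %seen;
my $collisions = 0;
for my $n ( 1 .. 10_001 ) {
    my $path = "/var/www/html/page$n.html";
    my $code = four_digit_checksum($path);
    $collisions++ if exists $seen{$code};
    $seen{$code} = $path;
}

print "collisions among 10001 paths: $collisions\n";
```

In practice the clashes start long before the codes run out, so a truncated checksum is not even usable as a unique key, let alone as a replacement for storing the path.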
