Re^4: Using filepath method to identify an .html page

by Nik
on Jan 22, 2013 at 15:48 UTC

in reply to Re^3: Using filepath method to identify an .html page
in thread Using filepath method to identify an .html page

Integers need to be created on the fly when an html file is being requested.
I just tried:
pin = int( htmlpage.encode("hex"), 16 )
but that also fails.

The number needs to be a 4-digit integer only, if it's to be stored in the database table correctly. So hex encoding is not useful here.
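For readers following along, here is a minimal sketch of what the attempt above amounts to in Python 3 (where `str.encode("hex")` no longer exists). The path is a made-up example, not one from the thread. It suggests the conversion itself is reversible, but only because the integer carries every byte of the path, so it ends up far longer than 4 digits.

```python
# Hypothetical example path; any absolute path behaves the same way.
path = "/var/www/html/index.html"

# Rough Python 3 equivalent of int(path.encode("hex"), 16):
# interpret the path's bytes as one big-endian integer.
pin = int.from_bytes(path.encode("utf-8"), "big")

# The conversion can be undone because no information was discarded.
n_bytes = (pin.bit_length() + 7) // 8
restored = pin.to_bytes(n_bytes, "big").decode("utf-8")

print(len(str(pin)))     # number of decimal digits, far more than 4
print(restored == path)  # True
```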

Re^5: Using filepath method to identify an .html page
by Corion (Pope) on Jan 22, 2013 at 15:52 UTC

    I did not mention a "hex encoding" anywhere in my reply. I suggest you actually read my reply, and use a database, or a "key value store", and build a map of words to numbers.

      That would be complex.
      I just need a way to CONVERT a string (absolute path) to a 4-digit unique number with INT!!! That's all I want!! But I cannot make it work :(

      And the best part is that "that" number must be able to turn back into a path.


      1. User requests a specific html page( .htaccess gives my script the absolute path for that .html page)
      2. I turn the path into a 4-digit number
      3. I store that number to the database. I DON'T EVEN HAVE TO STORE THE PATH TO THE DATABASE ANYMORE!!! This is just great!

        Maybe there is confusion about how you really want to go about this, because the spec as given is ludicrous. You have four digits to work with, which means 10,000 numbers available. Assuming only lowercase letters and three-letter names (like abc.html), for example:

        perl -le 'print 26**3'
        17576

        You're already out of room with this most trivial example. Any real file names/paths will certainly not be able to fit in any translation scheme. The information must be *somewhere*. Your quest to save space by disappearing it seems magical. Pretty much everyone here is telling it to you straight.
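The counting argument above can be sketched in a few lines of Python. This assumes names built only from the 26 lowercase ASCII letters; `SLOTS` and `names_of_length` are names made up for this illustration.

```python
import string

SLOTS = 10_000                     # 4 decimal digits: 0000 through 9999
ALPHABET = string.ascii_lowercase  # assumption: 26 lowercase letters only


def names_of_length(n):
    """Count the distinct names of exactly n letters."""
    return len(ALPHABET) ** n


for n in range(1, 5):
    count = names_of_length(n)
    verdict = "fits" if count <= SLOTS else "does not fit"
    print(n, count, verdict)
```

Three-letter names alone already outnumber the available 4-digit values, before directories, digits, or longer names are considered.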

        If there is still a misunderstanding, perhaps you could give a concrete example of input (and its range) and output you expect. If what you want is possible someone will help you.

        So, what you want is a function foo() and its inverse foo'() that do this:

        foo( "some long string" )  -->  1234
        foo'( 1234 ) --> "some long string"


        In other words, you want to keep the entire information content of the original string.

        There are only two ways to do this:

        1. keep the original string, i.e. store it in a database of some sort, or
        2. lossless compression of the original string into a short number. Lossless compression exists in general, but the compression ratio you're asking for isn't going to be possible.

        Which leaves the first option, as plenty of people here have described.

        So: You've said you already have a database with a column that stores a 4-digit number. If that database table doesn't already have a column that stores the HTML page's absolute path, then add one. You'll also want a UNIQUE constraint on the column with the number (and/or the other column, depending on your database design). The rest is "just" SQL...
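A minimal sketch of that approach, using Python's built-in sqlite3 module purely for illustration. The table name `pages`, the column names, and the helpers `pin_for` and `path_for` are all assumptions; the real schema in the poster's database is not shown in this thread.

```python
import sqlite3

# In-memory database for demonstration; a real script would open a file
# or connect to whatever database it already uses.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages ("
    " pin  INTEGER PRIMARY KEY CHECK (pin BETWEEN 0 AND 9999),"
    " path TEXT NOT NULL UNIQUE)"
)


def pin_for(path):
    """Return the number stored for path, assigning a new one if needed."""
    row = conn.execute(
        "SELECT pin FROM pages WHERE path = ?", (path,)
    ).fetchone()
    if row:
        return row[0]
    # INTEGER PRIMARY KEY lets SQLite pick the next free number.
    cur = conn.execute("INSERT INTO pages (path) VALUES (?)", (path,))
    conn.commit()
    return cur.lastrowid


def path_for(pin):
    """Inverse lookup: return the path stored for pin, or None."""
    row = conn.execute(
        "SELECT path FROM pages WHERE pin = ?", (pin,)
    ).fetchone()
    return row[0] if row else None


pin = pin_for("/var/www/html/index.html")  # hypothetical path
print(f"{pin:04d}", path_for(pin))
```

Here the number is just a key. The path itself still lives in the table, which is what makes the reverse lookup possible.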

        How can you turn the "path" into a 4-digit number? That makes no sense. You could maybe generate a checksum on each path, but checksums are more than 4 digits long, unless you use a large base, like base 1000 instead of base 10. But then you'd have to invent a bunch of characters.
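To make the checksum idea concrete, here is a rough sketch that squeezes a CRC-32 checksum into 4 digits by taking it modulo 10,000. The paths are invented for the demonstration and `checksum_pin` is a hypothetical helper. With 10,001 distinct paths and only 10,000 possible values, at least two paths must share a number, so the number alone cannot be turned back into a single path.

```python
import zlib


def checksum_pin(path):
    """Reduce a CRC-32 checksum of path to the range 0..9999."""
    return zlib.crc32(path.encode("utf-8")) % 10_000


# Hypothetical paths: one more than the number of available pins.
paths = [f"/var/www/html/page{i}.html" for i in range(10_001)]

seen = {}        # pin -> first path that produced it
collisions = []  # (earlier path, later path, shared pin)
for p in paths:
    pin = checksum_pin(p)
    if pin in seen:
        collisions.append((seen[pin], p, pin))
    else:
        seen[pin] = p

print(len(collisions) >= 1)  # True: guaranteed by the pigeonhole principle
```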
