PerlMonks  

Re: (OT) cgi: relative v. absolute paths, Apache

by merlyn (Sage)
on Nov 19, 2009 at 16:45 UTC (#808203)


in reply to (OT) cgi: relative v. absolute paths, Apache

Relative URLs are interpreted and calculated by the browser, not by the server. So you don't need to consider the actual disk layout of your directories; just look at the URL. If you refer to an image at "../my_imgs/pic.jpg" from a page that was fetched at "/cgi-bin/somescript", the browser subtracts /cgi-bin, and requests "/my_imgs/pic.jpg".

It's all about the browser.

-- Randal L. Schwartz, Perl hacker

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.


Re^2: (OT) cgi: relative v. absolute paths, Apache
by MidLifeXis (Prior) on Nov 19, 2009 at 17:11 UTC

    As merlyn states above, it is all about the browser.

    Your Apache configuration defines a mapping that creates a virtual directory tree in web space. In its stock form, I believe that Apache maps something like this (images directory from your example):

    Web path            Disk path
    /                   $SERVER_ROOT/htdocs
    /cgi-bin            $SERVER_ROOT/cgi-bin
    /my_imgs/pic.jpg    $SERVER_ROOT/htdocs/my_imgs/pic.jpg

    So, when your cgi script (href=/cgi-bin/myscript.cgi) runs, the relative path "../my_imgs/pic.jpg" would refer, from the browser's perspective, to /cgi-bin/../my_imgs/pic.jpg, or /my_imgs/pic.jpg, which is what you have as the absolute path.
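The same resolution can be reproduced outside a browser. This is a minimal sketch, assuming Python's standard `urllib.parse.urljoin` as a stand-in for the browser; the host `localhost` is an assumption for illustration, since the post gives only the path.

```python
from urllib.parse import urljoin

# URL the page was fetched from (host assumed for illustration)
page_url = "http://localhost/cgi-bin/myscript.cgi"

# Relative reference as it appears in the generated HTML
relative = "../my_imgs/pic.jpg"

# The client resolves the reference against the page URL.
# No disk path is consulted at any point.
resolved = urljoin(page_url, relative)
print(resolved)  # http://localhost/my_imgs/pic.jpg
```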

    Hope this helps.

    Update (2009/11/20 11:36 GMT-0500): Whoops. s/\$DOCUMENT_ROOT/$SERVER_ROOT/g; $SERVER_ROOT == ServerRoot setting in httpd.conf. $DOCUMENT_ROOT = $SERVER_ROOT/htdocs.

    --MidLifeXis

      Hi,

      Thanks for the responses. I don't get the url math, though. Can you direct me to a tutorial?

      The only way I can understand it is if I say to myself, "The current directory is cgi-bin, and the ../ directory refers to cgi-bin's parent directory. Then according to the virtual directory structure, cgi-bin's parent directory is the root directory, which is htdocs, and then you descend from htdocs to the my_imgs directory, where the file is found."

        Ok, I think I get it now. Relative paths are resolved based on the directory structure specified in the request url. For instance, suppose you have a web page that is requested using the following url:

        www.acme.com/dir1/dir2/page.htm

        You can see a directory structure in that url, and that directory structure will be used to turn relative paths into absolute paths. If page.htm contains a link to another page and the link specifies a relative path:

        <a href = "../results.htm">click me</a>
        then you look at the url in the original request to sort out the path:

        www.acme.com/dir1/dir2/page.htm

        In this case, the original page.htm is in the directory dir2, which is the current directory. Looking at the request url, the parent directory of dir2 is dir1. Therefore, the browser resolves the relative path:

        ../results.htm

        into the absolute path: www.acme.com/dir1/results.htm

        If the href had used the path: ../../dir3/results.htm, then the browser would resolve the relative path into the absolute path:

        href = "../../dir3/results.htm" <-----relative path
        www.acme.com/dir1/dir2/page.htm <---request url
        www.acme.com/dir3/results.htm <----absolute path
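The "url math" in the two examples above can be written out by hand. What follows is only a sketch: `resolve` is a hypothetical helper, not a library function, and it handles nothing beyond plain `../` and `./` segments (no query strings, fragments, trailing slashes, or references that start with `/`).

```python
def resolve(page_path, relative):
    """Resolve a relative path against the path of the page that contains it."""
    # Drop the last segment (the page itself) to get the "current directory".
    # The leading "" in the list stands for the root.
    segments = page_path.split("/")[:-1]
    for part in relative.split("/"):
        if part == "..":
            # Go up one level, but never above the root
            if len(segments) > 1:
                segments.pop()
        elif part not in ("", "."):
            segments.append(part)
    return "/".join(segments) or "/"


request_path = "/dir1/dir2/page.htm"
print("www.acme.com" + resolve(request_path, "../results.htm"))
print("www.acme.com" + resolve(request_path, "../../dir3/results.htm"))
```

The first call gives www.acme.com/dir1/results.htm and the second www.acme.com/dir3/results.htm, matching the worked examples.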

        In my case, the url that was used to request my cgi script was:

        http://localhost/cgi-bin/prog1.pl

        The current directory in that url is cgi-bin. So when the perl script produced an <img> tag with the relative path:

        <img src="../my_imgs/blue_square.jpg">

        the parent directory was localhost. Therefore, the relative url corresponds to the absolute url: localhost/my_imgs/blue_square.jpg.

        Now it's Apache's turn. Apache maps the host, in this case localhost (in the first example it was www.acme.com), to the htdocs directory. Therefore, the url tells Apache to look for a subdirectory in htdocs called my_imgs and then look in my_imgs for a file called blue_square.jpg.

        Almost. You convert from virtual to real too early.

        The current virtual directory is /cgi-bin, and the ../ directory refers to /cgi-bin's parent directory, the root directory. Then you descend to the my_imgs directory and the pic.jpg file. The final url is therefore /my_imgs/pic.jpg. The server maps that url to the file htdocs/my_imgs/pic.jpg.

      Web path            Disk path
      /                   $DOCUMENT_ROOT/htdocs
      /cgi-bin            $DOCUMENT_ROOT/cgi-bin
      /my_imgs/pic.jpg    $DOCUMENT_ROOT/htdocs/my_imgs/pic.jpg
      
      Hope this helps.

      I don't think that is correct. My DocumentRoot is set to /Library/Apache2/htdocs, so it wouldn't make sense to say that the web path for / is the disk path $DOCUMENT_ROOT/htdocs, which would be /Library/Apache2/htdocs/htdocs. I found this on the apache website:

      DocumentRoot directive
      
      Syntax: DocumentRoot directory-path
      Default: DocumentRoot /usr/local/apache/htdocs
      Context: server config, virtual host
      Status: core
      
      This directive sets the directory from which httpd will serve files. 
      Unless matched by a directive like Alias, the server appends the
      path from the requested URL to the document root to make the
      path to the document. Example:
      
          DocumentRoot /usr/web
      
      then an access to http://www.my.host.com/index.html refers to /usr/web/index.html.
      
      There appears to be a bug in mod_dir which causes problems
      when the DocumentRoot has a trailing slash (i.e., "DocumentRoot
      /usr/web/") so please avoid that.
      

      One thing I discovered: when a browser converts a relative path to an absolute path prior to requesting a resource, if a relative path tries to move up the hierarchy of a url too far with ../../../, the extra ones are ignored. For instance, if the page's url is:

      http://localhost/cgi-bin/prog1.pl

      the current directory is cgi-bin. In that url, cgi-bin's parent is already the root, so there is nothing above it to climb to. Therefore, if the cgi script produces a page with an image that uses this relative path:

      <img src="../../../../my_imgs/blue_square.jpg">

      the ../../../../ part of the relative path just gets you:

      http://localhost

      then the rest of the path, /my_imgs/blue_square.jpg, gets appended to that, giving you:

      http://localhost/my_imgs/blue_square.jpg
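This behaviour can be checked with a short sketch, assuming Python 3.5 or later, where `urllib.parse.urljoin` follows the RFC 3986 rule that surplus `../` segments are discarded at the root. It is a stand-in for the browser, not the browser's own code.

```python
from urllib.parse import urljoin

page_url = "http://localhost/cgi-bin/prog1.pl"

# Build the reference with 1, 2, 3, and 4 leading "../" segments.
# One is enough to reach the root; the extra ones change nothing.
results = {
    n: urljoin(page_url, "../" * n + "my_imgs/blue_square.jpg")
    for n in range(1, 5)
}

for n, url in results.items():
    print(n, url)
```

Every line of output shows the same URL, http://localhost/my_imgs/blue_square.jpg.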

      Subsequently, when Apache receives the request for that url, as the passage from the Apache website above says, everything after the host gets appended to the document root, which in my case yields this:

      /Library/Apache2/htdocs/my_imgs/blue_square.jpg

      That is a real path on the filesystem. To summarize, there is a two-step process:

      1) The browser converts a relative path (used by an html element on a page) to an absolute path by looking at the page's url, then sends a request for that url to the Apache server.

      2) Apache takes the part of the url after the host name and appends it to the DocumentRoot (as specified in httpd.conf). For instance, if Apache receives a request for this url:

      http://www.mysite.com/dir1/dir2/page.htm

      the host name is www.mysite.com, and with my DocumentRoot (= /Library/Apache2/htdocs) Apache would create the following path to the requested resource:

      /Library/Apache2/htdocs/dir1/dir2/page.htm

      That's my current mental model of what's going on. I'll adjust it as required.
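The two steps in that model can be put side by side in a small sketch. The function names `browser_step` and `server_step` are hypothetical, and the server half is a deliberate simplification: it only appends the URL path to the DocumentRoot quoted above and ignores Alias, ScriptAlias, rewrites, and URL decoding.

```python
from urllib.parse import urljoin, urlsplit

DOCUMENT_ROOT = "/Library/Apache2/htdocs"  # value quoted in the post above


def browser_step(page_url, relative):
    """Step 1: the client turns a relative reference into an absolute URL."""
    return urljoin(page_url, relative)


def server_step(url, document_root=DOCUMENT_ROOT):
    """Step 2: the server appends the URL path to the document root."""
    url_path = urlsplit(url).path  # everything after the host name
    return document_root + url_path


absolute_url = browser_step("http://localhost/cgi-bin/prog1.pl",
                            "../my_imgs/blue_square.jpg")
disk_path = server_step(absolute_url)

print(absolute_url)  # http://localhost/my_imgs/blue_square.jpg
print(disk_path)     # /Library/Apache2/htdocs/my_imgs/blue_square.jpg
```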

        See above for correction :-)

        As far as your understanding of the translation, I think you have it. As with a "real" file system, cd .. from the root directory will still leave you at the root directory, and by extension (ignoring symlinks, the shell's idea of $cwd, and so on), cd ../../../foo from the directory /bar/blatz will traverse to /bar, /, /, and then /foo. The same thing holds with the virtual directory structure on the web server.
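The filesystem analogy can be tried without changing directories. This sketch assumes Python's `posixpath.normpath`, which collapses `..` purely lexically, so it matches the "ignoring symlinks" caveat above.

```python
import posixpath

# Start in /bar/blatz and apply the equivalent of "cd ../../../foo"
start = "/bar/blatz"
target = posixpath.normpath(posixpath.join(start, "../../../foo"))

print(target)  # /foo
```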

        If you are familiar with the concept of mount points, there are many directives in the Apache configuration language that can create the moral equivalent of a mount point. The stock cgi-bin directory is one, Alias is another. The client sees a view of this virtual directory structure as defined by the directives in the Apache control files.
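The mount-point idea can be sketched as a prefix table. This is an illustration only and not Apache's implementation: `to_disk_path` and the `ALIASES` list are hypothetical, the server root `/usr/local/apache` is an assumed install location, and the layout is the stock one described earlier in this thread.

```python
SERVER_ROOT = "/usr/local/apache"          # assumed install location
DOCUMENT_ROOT = SERVER_ROOT + "/htdocs"    # where "/" is served from

# Each entry behaves like a mount point: URL prefix -> directory on disk
ALIASES = [
    ("/cgi-bin/", SERVER_ROOT + "/cgi-bin/"),
]


def to_disk_path(url_path):
    """Map a URL path to a disk path using the table above."""
    for prefix, directory in ALIASES:
        if url_path.startswith(prefix):
            # First matching prefix wins in this sketch
            return directory + url_path[len(prefix):]
    # Nothing matched, so fall back to the document root
    return DOCUMENT_ROOT + url_path


print(to_disk_path("/my_imgs/pic.jpg"))
print(to_disk_path("/cgi-bin/myscript.cgi"))
```

The image path lands under htdocs while the script path lands under the sibling cgi-bin directory, even though both look like children of `/` from the client's side.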

        --MidLifeXis
