Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

Cleaning Up Apache::Session data

by grantm (Parson)
on Mar 31, 2003 at 02:36 UTC ( #246837=perlquestion: print w/replies, xml ) Need Help??

grantm has asked for the wisdom of the Perl Monks concerning the following question:

I'm using Apache::Session to store session state in a PostgreSQL database. If the user hits the logout button, I can call the delete() method on the tied session object, but if the user just goes away, their session data lives on forever in the database table. What method do people use for cleaning up old sessions?

I considered adding a timestamp column to the session table and defining a trigger to update this column on any insert or update. Then I could run a simple delete statement to delete rows where the timestamp value is say more than a couple of hours old. This seemed like it might be reinventing the wheel but I can't see anything in the Apache::Session docs that looks like a standard way of doing it.

Super search pulled up this node which seems to imply that simply adding a column called 'time' would cause Apache::Session::MySQL to update the timestamp on each save, but I couldn't see anything in the code to support that assertion.

Replies are listed 'Best First'.
Re: Cleaning Up Apache::Session data
by iguanodon (Priest) on Mar 31, 2003 at 02:51 UTC
    I do exactly what you describe. I use Apache::Session with an Oracle back end, and a trigger updates a 'last used time' column. Then I run a cron job to delete sessions older than a given number of days. I find it useful to leave the sessions around for a few days so I can look at them to troubleshoot problems.

    The automatic updating of TIMESTAMP columns is a MySQL thing, it's not part of the Apache::Session code. I haven't used PostgreSQL in a while, so I can't remember if the equivalent functionality exists.

      In PostgreSQL

      ALTER TABLE tablename ADD COLUMN ts timestamp; ALTER TABLE tablename ALTER COLUMN ts SET DEFAULT CURRENT_TIMESTAMP;

      This will add a column which will get the time of the transaction on every insert automatically.

      You should use table and column names appropriate to your situation of course, also, if you are making the table you can pop the default in when you define the column like so.

      CREATE TABLE tablename ( ..., ts timestamp DEFAULT CURRENT_TIMESTAMP, +... );

      The insert needs to be  INSERT INTO tablename ( col1, col2 ) VALUES ( ?, ? ); rather than one which does INSERT INTO tablename VALUES ( ?, ? ); otherwise you will have a column count mismatch.

Cleaning Up Apache::Session data - Summary
by grantm (Parson) on Mar 31, 2003 at 22:43 UTC

    Thanks for the replies (including CB messages). To summarise ...

    dkubb was correct in Re: Re: Apache::Session::MySQL when he suggested simply adding a column of type 'timestamp'. The reason I couldn't see any code to implement this is that 'timestamp' is a MySQL built-in that automatically tracks the time a row was last updated.

    Since I'm using PostgreSQL, I don't have that exact functionality at my finger tips. dga suggested simply adding a timestamp column with a default value of the current time. This would store the timestamp when the session was created but would not increment the timestamp when the session data was subsequently updated. For 99% of applications, this is probably an adequate solution - if a session was created say 2 days ago then you can probably assume it is no longer being used.

    There may be some applications where you can't make that assumption. If sessions might be actively used for an arbitrary length of time, then you need to track when the session was last updated rather than when it was created. In PostgreSQL, you can achieve this with a trigger.

    To save someone else having to dig around in the PostgreSQL docs, here's what I did to make the trigger solution work. First of all, since my trigger will use a plpgsql function, I need to install the language in the database by running this from the command line (not from in psql):

    createlang -U postgres -P <password> plpgsql <dbname>

    Then I need to define the function (from within psql):

    CREATE FUNCTION timestamp_row () RETURNS OPAQUE AS ' BEGIN NEW.timestamp := timeofday(); RETURN NEW; END; ' LANGUAGE 'plpgsql';

    Finally, I need to create a trigger to call the function:

    CREATE TRIGGER timestamp_session BEFORE INSERT OR UPDATE ON sessions FOR EACH ROW EXECUTE PROCEDURE timestamp_row();

    Note: you can reuse the same function if you want the same timestamping functionality on other tables. Note also, that the function assumes the standard sessions table definition has been altered like this:

    ALTER TABLE sessions ADD COLUMN timestamp timestamp; ALTER TABLE sessions ALTER COLUMN timestamp SET DEFAULT CURRENT_TIMEST +AMP;

    To re-iterate, this trigger-based solution is only necessary in those cases where the creation timestamp is not good enough. The downside is that the extra overhead of firing the trigger will occur on every update (which presumably will be on every request if the timestamp is to be meaningful).

    Detour: While implementing the trigger, I came up against an interesting phenomenom - as I repeatedly requested pages, the timestamp value for my session record sometimes went forwards and sometimes went backwards in time (yikes!). The reason for this was that I had used 'CURRENT_TIMESTAMP' to set the timestamp value. This function is the ANSI equivalent of the PostgreSQL 'now' function, which behaves the same way. Rather than returning the exact current time, CURRENT_TIMESTAMP returns the time at the start of the current transaction. Since I am using Apache::DBI and have enabled autocommit, the start of the current transaction is really the time the last transaction completed. If my request happened to be handled by an Apache process that hadn't been hit recently then its last transaction may have happened quite some time ago and the timestamp would be misleadingly old. To get around this, I used the timeofday() function instead of CURRENT_TIMESTAMP. Unfortunately, I guess this leads to even more overhead on every hit. Only you can decide whether you really need this level of accuracy on your session timestamps.

      ++ for the detailed summary. The only thing I would change is the name of the column. I try to avoid naming tables or columns after reserved words. In my own session tables, I usually call the column 'lastrequest' or something like that.

      90% of every Perl application is already written.
      dragonchild
Re: Cleaning Up Apache::Session data
by CountZero (Bishop) on Mar 31, 2003 at 12:50 UTC

    Directly from the mySQL docs:

    6.2.2.2 The DATETIME, DATE, and TIMESTAMP Types

    The DATETIME, DATE, and TIMESTAMP types are related. This section describes their characteristics, how they are similar, and how they differ.

    The DATETIME type is used when you need values that contain both date and time information. MySQL retrieves and displays DATETIME values in 'YYYY-MM-DD HH:MM:SS' format. The supported range is '1000-01-01 00:00:00' to '9999-12-31 23:59:59'. (``Supported'' means that although earlier values might work, there is no guarantee that they will.)

    The DATE type is used when you need only a date value, without a time part. MySQL retrieves and displays DATE values in 'YYYY-MM-DD' format. The supported range is '1000-01-01' to '9999-12-31'.

    The TIMESTAMP column type provides a type that you can use to automatically mark INSERT or UPDATE operations with the current date and time. If you have multiple TIMESTAMP columns, only the first one is updated automatically.

    Automatic updating of the first TIMESTAMP column occurs under any of the following conditions:

    • The column is not specified explicitly in an INSERT or LOAD DATA INFILE statement.
    • The column is not specified explicitly in an UPDATE statement and some other column changes value. (Note that an UPDATE that sets a column to the value it already has will not cause the TIMESTAMP column to be updated, because if you set a column to its current value, MySQL ignores the update for efficiency.)
    • You explicitly set the TIMESTAMP column to NULL. TIMESTAMP columns other than the first may also be set to the current date and time. Just set the column to NULL or to NOW().

    You can set any TIMESTAMP column to a value different from the current date and time by setting it explicitly to the desired value. This is true even for the first TIMESTAMP column. You can use this property if, for example, you want a TIMESTAMP to be set to the current date and time when you create a row, but not to be changed whenever the row is updated later:

    Let MySQL set the column when the row is created. This will initialise it to the current date and time.

    When you perform subsequent updates to other columns in the row, set the TIMESTAMP column explicitly to its current value.

    On the other hand, you may find it just as easy to use a DATETIME column that you initialise to NOW() when the row is created and leave alone for subsequent updates.

    Update: As hardburn pointed out, the original poster is using PostgreSQL. STill, nice to see that mySQL has some functionality which is lacking in PostgreSQL ;-)

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

      The orginal poster is using PostgreSQL. I don't use PostgreSQL much, but I did a quick scan over thier docs. PostgreSQL doesn't appear to have the same autmatic TIMESTAMP updates that MySQL has. Someone more familer with PostgreSQL will have to verify this for me, though.

      ----
      I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
      -- Schemer

      Note: All code is untested, unless otherwise stated

        PostgreSQL has triggers. You can implement the same functionality with a simple BEFORE UPDATE/INSERT trigger.

        90% of every Perl application is already written.
        dragonchild

        PostgreSQL allows setting of a default value for a column, triggers are not needed for this task. See my reply to another node in this thread at Re: Re: Cleaning Up Apache::Session data.

        Update: Triggers are what's desired here since he wants to update the rows ald set the time also.

Re: Cleaning Up Apache::Session data
by aquarium (Curate) on Mar 31, 2003 at 11:19 UTC
    sorry this is not a concrete/final answer for you but, what about using temporary tables? Chris ..and on the seventh day h[H^e created Perl

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://246837]
Approved by Aristotle
Front-paged by tye
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (3)
As of 2021-05-11 07:28 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    Perl 7 will be out ...





    Results (114 votes). Check out past polls.

    Notices?