HandX weblog parser

by JaWi (Hermit)
on Aug 22, 2002 at 08:56 UTC
Description: HandX makes, IMHO, a great utility: webLog, which allows you to enter your weblog entries on your Palm device. The program is, unfortunately, only supporting Windows as host.
This snippet parses the database file (weblog_hXwl.pdb) and writes its entries to a file (or STDOUT). It is easily modified for direct use on websites, etc.

-- JaWi

#!/usr/bin/perl -w

use strict;

my ( @categories, @record_attributes );
my ( $input, $output );

die usage() unless defined $ARGV[ 0 ] && -e $ARGV[ 0 ];

$input = $ARGV[ 0 ] if ( -e $ARGV[ 0 ] );
if ( defined $ARGV[ 1 ] )
{
  print STDERR "Warning: overwriting '$ARGV[1]'!\n" if ( -e $ARGV[ 1 ] );
  $output = $ARGV[ 1 ];
}

parse_pdb( $input, $output );


sub usage
{
  return "$0 v1.0 - WebLog database parser.\n".
        "(C) 2002 - J.W. Janssen, janwillem.janssen\@lxtreme.nl\n\n".
        "usage: $0 <weblog_hXwl.pdb> [<output file>]\n";
} # usage


sub parse_pdb
{
  my $name = shift;
  my $output = shift;

  # Obtain the necessary header information
  # Use 'or' here: '||' binds to the filename string, so die never fires.
  open( PDB, "<$name" ) or die "Couldn't open '$name': $!";
  binmode PDB;
  my $line;
  read( PDB, $line, 78 );
  my @header = unpack "A32nnN6A4A4NNn", $line;

  # Read & parse the record entry headers...
  my @offsets;

  # The record entry headers start immediately after the header
  #     of the PDB file (offset 78).
  seek( PDB, 78, 0 );
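  # The header gives us the number of entries in the PDB file. One extra
  #     entry is read so that @offsets gets a final slot, which is set to
  #     the file size below.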
  for ( 0..$header[ 13 ] )
  {
    read( PDB, $line, 8 );
    # Entry record:
    # name      | # | description
    # ----------+---+----------------------
    # offset    | 0 | the entry file offset
    # attribute | 1 | the upper 4 bits denote the record
    #           |   |  attribute, and the lower 4 the
    #           |   |  category.
    # uniqueID  | 2 | unknown; should be zero.
    my @record_list = unpack "NC4", $line;

    # Category == unknown
    my $category = $record_list[ 1 ] & 0x0F;
    # Attribute == 0x10: secret record bit
    #              0x20: busy bit (in use)
    #              0x40: dirty bit
    #              0x80: delete on next sync
    my $attribute = $record_list[ 1 ] & 0xF0;

    push @offsets, $record_list[ 0 ];
    push @record_attributes, $attribute;
  }
  # Make sure the last offset is the filesize...
  $offsets[ -1 ] = int( -s $name );

  # Open our output channel ...
  if ( defined $output and $output ne "-" )
  {
    # Use 'or' here as well, so a failed open is actually reported.
    open( OUT, ">$output" ) or die "Couldn't open '$output' for writing: $!";
  }
  else
  {
    open OUT, ">&STDOUT";
  }

  # Converting from Palm date format <-> UNIX format:
  #     number of days between
  #       01/01/1904 and 01/01/1970:           24,107
  #     number of seconds in one day:          86,400 *
  #                                     ---------------
  #     number of seconds between
  #       01/01/1904 and 01/01/1970:    2,082,844,800
  my $diff_secs = 24107*86400;

  print OUT "Last modified on: " . gmtime( $header[ 4 ] - $diff_secs ) . "\n";

  # Read as many entries...
  for ( 0..$header[ 13 ] - 1 )
  {
    my $offset = $offsets[ $_ ];
    my $rec_length = $offsets[ $_ + 1 ] - $offset;

    seek( PDB, $offset, 0 );
    read( PDB, $line, $rec_length );

    # Record format for WebLog:
    # name       | # | length  | description
    # -----------+---+---------+------------------------------------
    # record idx | 0 | 2 bytes | the logical entry number.
    # ?          | 1 | 2 bytes | unknown
    # entry date | 2 | 4 bytes | number of seconds since 01/01/1904.
    # entry text | 3 | ? bytes | the actual entry...
    my ( $index, $unknown, $date, $text ) = unpack "nnNa*", $line;
    my $hidden = ( $record_attributes[ $_ ] & 0x10 ) ? 1 : 0;

    print OUT "($index, $hidden) " . gmtime( $date - $diff_secs ) .
                " - " . $text . "\n";
  }

  close OUT;
  close PDB;
} # parse_pdb
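
The two conversions the script relies on, the Palm-to-UNIX date offset and the split of the attribute byte into flag bits and category, can be tried in isolation. What follows is only a minimal sketch, assuming the record layout documented in the comments above. The helper names palm_to_unix and split_attribute, the constant PALM_EPOCH_OFFSET, and the sample record are made up for illustration and are not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Seconds between the Palm epoch (01/01/1904) and the UNIX epoch (01/01/1970):
# 24,107 days * 86,400 seconds per day = 2,082,844,800.
use constant PALM_EPOCH_OFFSET => 24107 * 86400;

# Hypothetical helper: convert a Palm timestamp to UNIX time.
sub palm_to_unix
{
  my $palm_secs = shift;
  return $palm_secs - PALM_EPOCH_OFFSET;
}

# Hypothetical helper: split a record attribute byte into its parts.
# The upper 4 bits are flags, the lower 4 bits are the category.
sub split_attribute
{
  my $byte = shift;
  return {
    category => $byte & 0x0F,
    secret   => ( $byte & 0x10 ) ? 1 : 0,
    busy     => ( $byte & 0x20 ) ? 1 : 0,
    dirty    => ( $byte & 0x40 ) ? 1 : 0,
    delete   => ( $byte & 0x80 ) ? 1 : 0,
  };
}

# Made-up WebLog record: index 7, unknown field 0, a date one day after
# the UNIX epoch (expressed in Palm seconds), followed by the entry text.
my $record = pack "nnNa*", 7, 0, PALM_EPOCH_OFFSET + 86400, "Hello, weblog";
my ( $index, $unknown, $date, $text ) = unpack "nnNa*", $record;

# Made-up attribute byte: secret and dirty bits set, category 2.
my $flags = split_attribute( 0x52 );

print "($index, $flags->{secret}) " . gmtime( palm_to_unix( $date ) ) .
      " - $text\n";
```

Keeping the offset in one named constant avoids repeating the magic number wherever a Palm date is converted.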
Replies are listed 'Best First'.
Re: HandX weblog parser
by Mr. Muskrat (Canon) on Aug 22, 2002 at 14:02 UTC

    "The program is, unfortunately, only supporting Windows as host."

    Why? I don't see any obvious reason that it wouldn't work on *nix... or any other operating system for that matter.

      Hmm, I should have picked my words a little better: what I meant was that a conduit program, which can automatically store the Weblog entries on your web server, is only available for Windows. For other operating systems, you have to write one yourself.

      -- JaWi

      "A chicken is an egg's way of producing more eggs."

        But there are conduits available for other operating systems!
        Palm Power Magazine's article on connecting with Linux.
        Librenix has some articles on connecting Palm and other handhelds to a Linux box.
        Just Google for other OSes... I think you will find BSD, Solaris, etc all can sync with a handheld device.