PerlMonks
Need to make text file archive

by peppiv (Curate)
on Dec 12, 2001 at 19:35 UTC ( #131257=perlquestion )
peppiv has asked for the wisdom of the Perl Monks concerning the following question:

Our company's top exec likes to write articles, and he wants me to post them on our Intranet as an archived list, with the most recently written article on top and the rest in descending chronological order.

I want to write each title to the HTML page as a link to the actual article (preferably a text file). I can cut and paste an article into a textbox and save it. What's the best way to save these articles so that they're easy to post in chronological order?

Thanks in advance,
peppiv

Replies are listed 'Best First'.
Re: Need to make text file archive
by Corion (Pope) on Dec 12, 2001 at 20:00 UTC

    Your question is a bit vague, so I assume that you have all the articles of the Big Boss as text files. For the lowest-resistance solution, I would do the task like this:

    • All articles get named yyyy-mm-dd-Title_with_spaces_converted_to_underlines.txt
    • A Perl script runs nightly (or on demand), and recreates the HTML page by reading the directory of the articles. Alphabetical sorting (in reverse order) automatically creates the order you want (no, you don't want to sort on file creation/modification time!).

    Here's some really simple code to get you started:

    use strict;
    use warnings;

    my $article_directory = $ARGV[0] || ".";
    my $base_url = $ARGV[1] || "http://your.boss.net/BigBoss/articles/";

    print "Reading articles from $article_directory\n";
    opendir DIR, $article_directory
        or die "Couldn't find $article_directory : $!\n";
    my @articles = reverse sort grep /\.txt$/, readdir DIR;
    closedir DIR;
    print scalar(@articles), " articles found.\n";

    # Now we have all articles in order. Let's print them out:
    open HTML, "> $article_directory/index.html"
        or die "Couldn't create index.html in $article_directory : $!";

    # Change the HTML to your taste
    print HTML "<html><body>\n";
    foreach my $article (@articles) {
        print ".";
        my $title = $article;
        my $date  = "No date given";
        if ($article =~ /^(\d{4})-(\d{2})-(\d{2})-(.*)\.txt$/) {
            $date  = "$3.$2.$1";
            $title = $4;
            $title =~ tr/_/ /;
        }
        print HTML "$date : <a href='$base_url$article'>$title</a><br>\n";
    }
    print HTML "</body></html>\n";
    close HTML;
Re: Need to make text file archive
by count0 (Friar) on Dec 12, 2001 at 19:52 UTC
    What's the best way to save these articles that would make them easy to post in chronological order?

    IMO, there are a few main options.
    • You can store them in an RDBMS, with a date field.
    • You could also keep a directory structure that reflects the dates (articles/2001/12/11.1.html, 11.2.html, etc - or whatever best suits you).
    • You could even keep a header in each article, in a predetermined format, that contains the date.
    • An XML DTD could be drawn up to contain the information for each article (including specifically the date).

    These are just a few examples of ways to handle it... all of which are very well suited to be done in Perl. =)
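    The "header in each article" option above can be sketched roughly as follows; the "Date:" header format, the sample article texts, and the subroutine names are my own illustrative assumptions, not anything specified in the thread:

```perl
#!/usr/bin/perl
# Sketch of the "date header in each article" idea.  The "Date:"
# header format and the sample articles below are assumptions.
use strict;
use warnings;

# Pull the date out of an article's header block (undef if missing).
sub article_date {
    my ($text) = @_;
    return $text =~ /^Date:\s*(\d{4}-\d{2}-\d{2})/m ? $1 : undef;
}

# Sort article texts newest-first by their embedded date.
# ISO yyyy-mm-dd dates compare correctly as plain strings.
sub sort_articles {
    return sort { article_date($b) cmp article_date($a) } @_;
}

my @articles = (
    "Date: 2001-11-02\nTitle: Q3 Results\n\nBody text...",
    "Date: 2001-12-11\nTitle: Year In Review\n\nBody text...",
);
my @newest_first = sort_articles(@articles);
print article_date($newest_first[0]), "\n";   # prints 2001-12-11
```

    One nice property of this layout: the date travels with the article, so files can be renamed or moved without losing their ordering.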
Re: Need to make text file archive
by jaldhar (Vicar) on Dec 12, 2001 at 20:01 UTC

    The easiest way would be to encode the date in the filename of the text you save. Then your index script can just read all the files in the directory and sort them by filename. But that can look ugly, so another thing I've done in the past is create directories called 2000, 2001 ... and within them, 2001/01, 2001/02 ... etc., and save documents there. The index generation becomes slightly more difficult because you have to use File::Find instead of just readdir, but only slightly. Plus, if the list gets too long, you can easily break it up into several pages by year.

    In either case, the principle of operation is the same — encode metadata in the file name so you don't have to carry extra files/DBs around.
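    The year/month directory layout can be indexed with File::Find along these lines; the temp-directory tree and the file names below are made-up examples for illustration:

```perl
#!/usr/bin/perl
# Sketch of indexing a YYYY/MM directory tree with File::Find.
# The example tree built under a temp directory is an assumption.
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Walk the tree and return all .txt paths, newest-first.  Because
# each path embeds YYYY/MM, a reverse string sort is chronological.
sub collect_articles {
    my ($root) = @_;
    my @found;
    find(sub { push @found, $File::Find::name if /\.txt$/ }, $root);
    return reverse sort @found;
}

# Build a small example tree: $root/YYYY/MM/title.txt
my $root = tempdir(CLEANUP => 1);
for my $path ("2000/05/old_news.txt", "2001/03/spring.txt", "2001/12/latest.txt") {
    my ($dir) = $path =~ m{^(.*)/};
    make_path("$root/$dir");
    open my $fh, ">", "$root/$path" or die "$root/$path: $!";
    close $fh;
}

my @newest_first = collect_articles($root);
print "$_\n" for @newest_first;    # the 2001/12 article comes first
```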
