Re: An odd flat file database question
by eg (Friar) on Feb 08, 2001 at 06:46 UTC
|
I've read through this post a couple of times, and I don't quite see what your question is.
If you've got a 10k line flat-file, I think it's time for you to consider moving to a more flexible and robust database system. Mysql, PostgreSQL and the venerable Berkeley DB system are popular free (at least as in beer) options for you.
You "control number system" is essentially the unique primary key in your database. It's a pretty standard practice. In what way do you find it "cumbersome". If it's a problem with re-keying your database every time you add or remove an item, my advice is don't. When you remove an item, that primary key is gone forever, when you add an item, just increment your max primary key. Save a copy of the max primary key somewhere else so you don't need to run through your entire database to find out what it is. Since your primary query is by catalog number, keep your database sorted by that. (If I've guessed your problem wrong, then I apologize. :)
It also sounds like you have a database normalization problem that you'll want to address eventually, but I guess that can wait.
Update: sounds like you want to learn how to tie an array ... maybe check out DB_File.
| [reply] |
|
Thanks for your information about the database normalization. I'm reading that right now. I am a total novice in database programming. I have never even messed with any SQL stuff before and have dealt solely with flat file databases. I would really like to learn how to use SQL databases but I really don't know where to begin learning. Can you guys point me in the right direction? Also, the server this site I'm working with is hosted on has no useful documentation. How in the world do I even find out whether there is an SQL server installed on it. Pardon all my ignorance I'm kind of a newbie.
| [reply] |
|
Chances are, if this is a closed system and you have no idea if there is a SQL server somewhere hidden, then there probably is no SQL server. If you can, download MySQL and the docs will be installed along with it. You might start also here in our Tutorial section on DBI. I performed a simple web search on google for "MySQL tutorial" and came up with this helpful link. With so many entries, it's time to upgrade, otherwise you're not only wasting your own time, but precious processing time.
AgentM Systems nor Nasca Enterprises nor
Bone::Easy nor Macperl is responsible for the
comments made by
AgentM. Remember, you can build any logical system with NOR.
| [reply] |
Re: An odd flat file database question
by merlyn (Sage) on Feb 08, 2001 at 06:30 UTC
|
I'm trying to think of a way in which you would not get the endpoints of a range. Perhaps if you showed some code, we could help more.
-- Randal L. Schwartz, Perl hacker | [reply] |
|
If there are 10 records with the final catalog number, I
can easily see naive code only printing one of them.
| [reply] |
|
Hey Merlyn,
My code accepts two arguements: First and Last. It just goes through the database and when it finds the first, it starts a loop that prints and increments a counter until it reaches the last number. It then prints the footer and exits. Really simple, really bulky, really sloppy, and a pain in the rear end to update. Also, I'm reading in the entire file into an array each time I do this. The file is about 250k. Do you think that it is pushing it to do that? Is there any quicker, perhaps more memory efficient way to do it? I'd give you the code, but I really don't think it will do any good after explaining what I did above. Let me know.
| [reply] |
|
There are a couple, but they all involve changing the format in which your data is stored. As others in this thread have mentioned, you can use DB_File and its ilk, or (my recommendation) an RDBMS like MySQL or Postgres (both available for the cost of a download, with plenty of online help). Perl's DBI interface is robust and mature, so you shouldn't have too much trouble making the transition. It can't hurt to know some SQL =)
Philosophy can be made out of anything. Or less -- Jerry A. Fodor
| [reply] |
Re: An odd flat file database question
by coolmichael (Deacon) on Feb 08, 2001 at 08:27 UTC
|
if you assume that $first and $last are the first and last catalog numbers the person wants, you could do this:
open FILE, "somefile.txt" || die "Blah: $!\n";
while(<FILE>){
m/^(\d+)\|/;
print if(($first <= $1) and ($1 <= $last));
}
I'm not sure the regex is syntatically right. It's supposed to match all digits from the start of the line to the first pipe. it isn't tested, but I think it will work.
Michael. | [reply] [d/l] |
Re: An odd flat file database question
by Anonymous Monk on Feb 08, 2001 at 19:21 UTC
|
This is not a nice way of doing it, but would ensure that every correct catalog number is caught. The trick is to run a loop going through every line (yeah ... all 10,100) and comparing the number to the range wanted. The loop wouldn't exit at the first out of range catalog number. It's not very neat but it gets the job done.
Alternatively if the catalog is in order (ie: the file is ordered by ascending catalog numbers), you could exit the loop when the catalog number is equal to $last+1. Thereby catching all the references to $last. | [reply] |
Re: An odd flat file database question
by ftforger (Sexton) on Feb 09, 2001 at 00:38 UTC
|
While your db may have other problems (which others have
made suggestions about) I'll answer with one
method of doing what you are asking for. Load your file
into @array. Then start from the top of the array looking
for a match on your first key. The first match is the
starting index for your output. Next, use a reverse for
loop, ie start at the last line of the array and decrement
by one, until you get a match. That is the last line you
want to output. Then simply print out everything from your
first index to your second index.
A quick fix that will hold off the dogs while you are
learning to program a real dbi. | [reply] |
Re: An odd flat file database question
by Anonymous Monk on Feb 09, 2001 at 01:41 UTC
|
If you have no access to any kind of database (eg:MySql)
you could use hash tables.
Otherwise:
1.Create two hash tables A & B.
Let A contain Catalog# as its key and frequency as its value.
Let B contains "Catalog#|seq#" as its key and pipe delimited fields as its value (String).
seq# <- The frequency (incremented value) in hash "A" .
2. When a user enters the range of catalog#s:
You may use the sorted list of key values from hash "A" to get all unique catalog#s and its corresponding frequency.
And utilize a small loop to fetch all the records user wanted from hash "B".
(Note: It may take time to load your data into hash table.)
Uma.
(umasuga@home.com) | [reply] |
Re: An odd flat file database question
by 2501 (Pilgrim) on Feb 08, 2001 at 07:17 UTC
|
I am a little loopy tonight
are you looking for something like:
while(<catalog>){
print "$_\n" if(/$catnum/ ... /$catnum2/);
}
| [reply] [d/l] |
|
| [reply] |
Re: An odd flat file database question
by Stamp_Guy (Monk) on Feb 10, 2001 at 23:31 UTC
|
Seems that everyone is saying "GO SQL!", so I've started to do some research into it, but I've got a few questions:
- Is is possible to set it up so I could develop the program on my Win98 desktop and have it completely portable to a Linux server with minimal changes? Basically: is it cross-platform?
- How hard is it to learn SQL?
- Should I get a really good handle on Perl and Linux before I start messing with any SQL stuff?
- Is there any natively supported database system that Perl uses? I've been reading a book called Teach Yourself Perl 5 in 24 Hours (yea right, 24 hours!) and it had a little example of a program that uses DBMs. Where could I find some more information in newbie language for that?
As you can see, this newbie's got tons of questions. Perhaps some of these questions are stupid, but I really want to learn. I appreciate all the help I'm getting from the people here! Thanks guys!
| [reply] |
|
First and foremost, do not confuse SQL as a language or database format. While it has the word "language" in its name, it's really a standard for obtaining, and futzing with data across many different development tools (languages) and database formats/Products (MySQL, Oracle, Sybase, PostGres, Excess, xBASE, Paradox, yadda, yadda, yadda).
SQL is a standard for manipulating and defining databases. It has two types of statements:
- Data Manipulation Language (DML): SELECT, UPDATE, INSERT, and DELETE.
- Data Definition Language (DDL): CREATE, DROP, ALTER, GRANT.
Each statement is aptly named and operates against an object that generally makes sense.
A decent overview will help. (See below)
In your case, you're looking for something along these lines:
select
c.*
from
Catalog c
where
( c."CatalogId" < $lovalue ) AND ( c."CatalogID" > $hivalue )
This returns all records from the Catalog table that have Catalog ID's in the range specified by the $lovalue and $hivalue variables. (The specific syntax will depend greatly on the database format you choose. For example MySQL has slight syntax variations over Oracle and ODBC.)
With most DBI drivers, this will give you an array of matched records, that you can then print very quickly using whatever approach you're most comfortable with.
To gain the cross platform success you're looking for, you need to choose your database format with care. I recommend starting with mySQL, for though it doesn't completely offer the features of a complete client/remote server database format, it is cross platform, easy to install, and free. Should you eventually outgrow it, you should be able to convert your SQL to your new format with little to no trouble.
Like chess, the basics of SQL are easy to learn. Those eight statements I listed earlier are (arguably) pretty much the heart of SQL itself. It gets a little harder when you start learning the extensions and features of your database format. For example, many remote database servers support stored procedures and triggers, which are (essentially) chunks of code run by the database server to validate (and respond to) your data values and what the users does with them. This can be quite daunting, but it can be very powerful. Learn it as you can.
Personally, I wouldn't worry about mastering Linux before tackling SQL. Get the Cheetah book; it'll help a great deal. It goes into the basics, offers a decent SQL tutorial, and walks you through the process very carefully. Also, it provides a great reference to various DBI drivers, which will help you make the most informed decision regarding the file format.
As far as the latest AM advice is concerned, she (assumed) was offering advice on how to similate the SELECT statement without using SQL.
Assuming that you're storing your data in a hash, she suggested building a second hash that sorted all records in order of your desired sort order. This hash contains two values: the value you're sorting on as the key and a pointer (or reference) to the "real" database record in the original hash. (Note that if you have multiple matches to the same catalog number, the second hash element should be an array of references to each unique record.) That way, you can quickly determine which set of records you're after, simply by examining the keys you're dealing with.
--f | [reply] [d/l] |