Database
Perl has Database support through the DBI package. It abstracts basic operations into the library so your code doesn’t need to be changed when you switch database software. 1)
#!/usr/bin/perl -w use strict; use DBI; use Data::Dumper; ### MySQL Connection # my $dbh = DBI->connect('DBI:mysql:allen:localhost','allen') # or die "Can't connect to db: $dbh->errstr\n"; ### Postgres Connection with advanced usage my $dbName='allen'; my $host='localhost'; my $dbUser=$dbName; my $dbPassword=''; my $dbh = DBI->connect("DBI:Pg:dbname=$dbName;host=$host", $dbUser, $dbPassword, { RaiseError => 0, AutoCommit => 0, PrintError => 1 }) or die "Can't connect to db\n"; ### this is my function, defined next my $rows = selectRows( "SELECT * from users where user_id=?", 1); print Dumper($rows); $dbh->commit(); $dbh->disconnect(); exit; ##### selectRows ##### sub selectRows { my ($sql, @parms) = @_; my $sth; eval { $sth = $dbh->prepare($sql); $sth->execute(@parms) or die $DBI::errstr; }; if ($@) { ## Exception die "Error $@ during $sql @parms\n"; } my $data=[]; while (my $hr = $sth->fetchrow_hashref) { push @$data, {%$hr}; # Copies row data } $sth->finish(); return $data; # Returns $data->[$rownum]{colname} } sub runSQL { my ($sql, @parms) = @_; my $sth; eval { $sth = $dbh->prepare($sql); $sth->execute(@parms) or die $DBI::errstr; }; if ($@) { ## Exception die "Error $@ during $sql @parms\n"; } $sth->finish(); return $sth->rows; # Returns number of affected rows }
Cursors
Cursors are a concept in relational databases that allow the database to return a single row at a time to the application for processing. A cursor is opened with runtime parameters supplied to it, and each row is returned with a “fetch” request , which returns a end-of-data condtion (SQLCODE=100 anybody?), after which the cursor is closed.
Think of this like an unbuffered file read, where each ‘read’ requests returns the next block of information from the opened file. The concept of newline-delimited records throws off this analogy, as multiple file “line” may be buffered in or across a file block.
The sample code in the previous section reads all matching rows into the application in a single operation. I recently learned that what I thougt was a Postgres cursor with the Perl DBI (execute, fetch, ..., finish) was *not* a cursor. It loads the entire result set into memory at once, then returns each tuple (record) from that buffered result set. This can be very nasty for large result sets, so you may certainly want to use a real cursor for these large results.
However, having the database return a single record at a time with the cursor method is not as efficient as the “all at once” method. Postgres also you to “block” a set of records to return a chunk at a time, so you don’t have to go running back to the database server each time you want a new row.
Sure, this makes things more complex for the application, but that is where the beauty of database abstraction layers work the best. Write a routine similar to the one below, and save yourself some work.
Whike DBI is indeed a database abstraction, it isn’t the interface for your applications to use. Its goal in to create a Database-Independant Interface so Perl programmers could spend their time coding to that standard instead of the specific implentation of databases and versions. This insulates code from upgrades or ports to other databases... assuming you are using standard SQL syntax. (MySQL is not very compliant about that and you may have to covert statements manually as well. Other databases have their own specific extensions which should be avoided where possible and practical.)
This code adds a routine, loopCursor(), that will call back a block of code to be handled for each record. The results will be buffered in chunks of 1000 records, and the routine will insulate the cursor and blocking logic from your code, as in the previous examples.
loopCursor( sub { my ($row) = @_; print "$row->{name}\n"; }, "SELECT list_id,name from list where list_id>=?",1 ); my $csrnum=0; sub loopCursor { my ($proc, $sql, @parms) = @_; my ($sth, $row); ++$csrnum; my $count=0; eval { runSQL("declare csr_$csrnum cursor for $sql", @parms); for(;;) { $sth = $dbh->prepare("fetch 1000 from csr_$csrnum") or die "fetch 1000 from csr $DBI::errstr\n"; $sth->execute() or die "loopCursor fetch $DBI::errstr\n"; last if $sth->rows == 0; while ($row = $sth->fetchrow_hashref) { ++$count; &$proc($row); } $sth->finish(); } runSQL("close csr_$csrnum"); return $count; }; if ($@) { ## Exception die join(' ', "Error $@ during", $sql, @parms, $DBI::errstr||'',"\n"); } }
Now a few other details. Not every database supports cursors that are held open across transactions, and if so, check out the “with hold” clause on select statements for your database.
I could not get anything definite, but it seems that either DBI or Postgres will not allow multiple cursors open simultaneously. I haven’t experimented to see if this is true. Perhaps the same cursor name can not be open multiple times?