May 13, 2009

Code tally....

While this normally falls under the "pissing match" category, I thought it would be fun to show how much code can be shaved off a project when it is written in pure Perl instead of C++ with a shell wrapper. The old genbasedir from Apt-RPM clocks in at 1650 lines of code, not counting the includes in the C++ source files. The NEW genbasedir.pl from APP-Get clocks in at 473 lines of code. That is a whopping 71% reduction (1177 of 1650 lines gone), a bit better than two thirds. Mind you, I could have cut corners and trimmed the code down even further. However, after looking over the code that Apt-RPM's genbasedir has between the shell script and the two C++ files, I found more than a few areas that look dodgy and lack proper error checking. Tomorrow, I'll move most of the functions from genbasedir.pl into a module to make them unit testable. Stay tuned for further developments :D

May 10, 2009

Reworking genbasedir....

To make genbasedir easier to work on in the future (when we move to some SQL engine as the data store for metadata), I reworked the functions to use an integer-based pkgId instead of the string name. This should speed up the dependency map functions in app-get considerably when doing package lookups, since string comparisons aren't the fastest operations in the world in any language.
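To give a feel for the idea, here is a minimal sketch in Python (not the actual Perl from genbasedir.pl). The PackageIndex class, build_dep_map function and the sample package names are all made up for illustration; the only thing taken from the real design is that each package name is mapped to an integer pkgId once, and the dependency map is keyed on those integers afterwards.

```python
# Hypothetical sketch: assign each package name an integer pkgId once,
# then build the dependency map on integers instead of strings.

class PackageIndex:
    def __init__(self):
        self._id_by_name = {}   # name -> pkgId
        self._name_by_id = []   # pkgId -> name (the list index is the id)

    def intern(self, name):
        """Return the pkgId for name, assigning the next free id if it is new."""
        pkg_id = self._id_by_name.get(name)
        if pkg_id is None:
            pkg_id = len(self._name_by_id)
            self._id_by_name[name] = pkg_id
            self._name_by_id.append(name)
        return pkg_id

    def name_of(self, pkg_id):
        """Translate a pkgId back to its package name."""
        return self._name_by_id[pkg_id]


def build_dep_map(index, requires_by_name):
    """Turn {name: [required names]} into {pkgId: sorted list of pkgIds}."""
    dep_map = {}
    for name, requires in requires_by_name.items():
        pkg_id = index.intern(name)
        dep_map[pkg_id] = sorted({index.intern(r) for r in requires})
    return dep_map


if __name__ == "__main__":
    index = PackageIndex()
    deps = build_dep_map(index, {"bash": ["glibc", "ncurses"], "glibc": []})
    for pkg_id, required in deps.items():
        print(index.name_of(pkg_id), "requires", [index.name_of(r) for r in required])
```

The string name is only needed at the edges (reading metadata, printing results), and an integer key would also map naturally onto a primary key once the data moves into an SQL store.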

As an aside, I've gotten buy-in from Jeff Johnson, the maintainer of the version of RPM at www.rpm5.org, to add some Debian-inspired tags to make my work on app-get easier. :)

May 7, 2009

New APP in town :D

Just an update: I'm working on a new dep solver for RPM and its eventual general successor, OPM. The utility is called APP-get, a clone of apt-get that doesn't rely on private APIs in the underlying package manager or on crufty code the way Apt-RPM does. To get a feel for how APP-get will retrieve the metadata, I've been reimplementing genbasedir.

At this time, I have much of the new genbasedir written. It writes out a new plain text database format for binary and source package metadata called lst.db. Each file is a single table, one record per line, with fields delimited by three pipe characters in a row (|||). With this new set of lst.db files, a person can generate the metadata using standard UNIX tools if they so wish (though I don't know why you'd want to, since genbasedir does such a nice job of it :D).
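For the curious, reading and writing such a file takes very little code. The sketch below is in Python rather than the Perl that genbasedir.pl uses, and the field list (name, version, release, arch) is purely an assumption for illustration; the real lst.db columns may differ. The only detail taken from the format itself is the ||| delimiter.

```python
# Hypothetical sketch of a lst.db-style record: one package per line,
# fields separated by three pipe characters.

DELIM = "|||"
FIELDS = ("name", "version", "release", "arch")  # assumed columns, not the real schema


def format_record(pkg):
    """Serialize one package dict into a single delimited line."""
    values = [str(pkg[field]) for field in FIELDS]
    for value in values:
        # A value containing the delimiter or a newline would corrupt the table.
        if DELIM in value or "\n" in value:
            raise ValueError("field value not representable: %r" % value)
    return DELIM.join(values)


def parse_record(line):
    """Split one line back into a dict keyed by the assumed field names."""
    parts = line.rstrip("\n").split(DELIM)
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    return dict(zip(FIELDS, parts))


if __name__ == "__main__":
    line = format_record(
        {"name": "bash", "version": "4.0", "release": "1", "arch": "i586"}
    )
    print(line)
    print(parse_record(line))
```

Because the delimiter is plain text, the same records should also be easy to slice with awk by giving it a field separator that matches three pipes, such as -F'[|][|][|]'.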

While writing this, I've been studying the dependency map algorithms used by Apt and SmartPM. Armed with the knowledge of how they determine dependencies, I should have a fairly easy time writing the dependency manager in APP-get.

As I've been writing the code, I'm noting a lot of places that I'll be able to clear out Debian-isms from the design, which makes the codebase and design far simpler. At the same time, I'm seeing features I'd like integrated into RPM5 to make some aspects simpler, such as package priorities.

In Apt-RPM, priorities are implemented as a kludgy list file (rpmpriorities) in /etc/apt/ that is checked for a string matching the name of a given package. The file uses a YAML-like format that groups packages under named priorities, which behave more like the Essential tag in dpkg. I'd like to move this INSIDE the package's header, so distributions can mark certain packages that don't normally have a require against another package as Priority: Essential or Priority: Normal. If this were merged into the header of the package, it would also allow rpm to stop people from shooting themselves in the foot with rpm -e or app-get remove.
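As a rough illustration of what a header-based priority would buy us, here is a small Python sketch of a removal guard. Everything in it is hypothetical: the Priority tag does not exist in RPM headers today, and the blocked_removals function and the dict-of-dicts header layout are stand-ins, not app-get or rpm code.

```python
# Hypothetical sketch: refuse to remove packages whose header carries
# a (proposed, not yet existing) "Priority: Essential" tag.

ESSENTIAL = "Essential"


def blocked_removals(headers, to_remove):
    """Return the sorted names in to_remove that are marked Essential.

    headers maps a package name to a dict of its header tags.
    """
    return sorted(
        name
        for name in to_remove
        if headers.get(name, {}).get("Priority") == ESSENTIAL
    )


if __name__ == "__main__":
    headers = {
        "glibc": {"Priority": "Essential"},
        "nethack": {"Priority": "Normal"},
    }
    blocked = blocked_removals(headers, ["nethack", "glibc"])
    if blocked:
        print("refusing to remove essential packages:", ", ".join(blocked))
```

With the priority stored in the header, the check needs nothing beyond the package metadata already being read, so there is no separate list file to keep in sync.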

Well, enough chatter for now. Look forward to more progress on APP-get.