Flox Package Database
CRUD Operations on Nix Package Metadata
Additional documentation may be found in the <pkgdb>/docs directory. This includes JSON input/output schemas used by commands such as pkgdb search and pkgdb resolve.
Links to additional documentation may be found at the bottom of this file.
Evaluating nix expressions for an entire flake is expensive but necessary for features like package search. This tool provides a way to scrape the data from a flake once and store it in a database for later usage.
The current responsibility of the pkgdb tool extends only as far as scraping a flake and generating a database. The database should be queried using standard sqlite tools and libraries, and all decisions about how and when to generate and update the database are left up to the consumer.
See CONTRIBUTING.md for more information.
Build the database with the scrape subcommand:
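A minimal invocation might look like the following sketch; the exact argument forms are assumptions, so consult pkgdb scrape --help for the authoritative interface:

```sh
# Sketch: scrape the default package set of a flake into a database
# under ~/.cache (the flake ref shown here is only an example).
pkgdb scrape github:NixOS/nixpkgs/release-23.05
```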
By default, packages will be scraped from packages.<system> and stored in ~/.cache in a database named after the flake fingerprint. These can be overridden as desired:
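For instance, a hedged sketch of overriding both the attribute path and the database location (the positional attribute-path arguments are assumptions):

```sh
# Sketch: scrape a different attribute set and write the database to an
# explicit path instead of the default cache location.
pkgdb scrape --database ./nixpkgs.sqlite \
  github:NixOS/nixpkgs/release-23.05 legacyPackages x86_64-linux
```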
If the database for a given flake already exists and an already-processed package set is requested again, that package set will be skipped. Use --force to force an update/regeneration.
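For example (a sketch, reusing the illustrative invocation above):

```sh
# Sketch: re-scrape a package set even though it was already processed.
pkgdb scrape --force github:NixOS/nixpkgs/release-23.05
```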
Once generated, the database can be opened and queried using sqlite3.
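For example, assuming pkgdb get db prints the database path for a flake (see the subcommand list below):

```sh
# Sketch: open the generated database and list its tables.
sqlite3 "$(pkgdb get db github:NixOS/nixpkgs/release-23.05)" '.tables'
```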
This utility is expected to be run multiple times if a client wishes to "fully scrape all the things" in a flake. It is a plumbing command used by a client application; we aren't particularly concerned with the repetitive strain injury a user would suffer by trying to scrape everything in a flake interactively. Rather, we aim to do less in a single run and avoid scraping information the caller might not need for their use case.
A client application that wanted to scrape a flake completely would run something along the lines of:
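The following is a hedged sketch of such a loop; the subtree and system arguments are assumptions about the scrape interface, and the flake reference should ideally be locked (see below):

```sh
# Sketch: fully scrape a flake by iterating over the package subtrees and
# systems of interest. Ideally `flake` is a locked reference.
flake='github:NixOS/nixpkgs/release-23.05'
for subtree in packages legacyPackages; do
  for system in x86_64-linux aarch64-linux x86_64-darwin aarch64-darwin; do
    pkgdb scrape "$flake" "$subtree" "$system"
  done
done
```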
In the example above the flake reference should be a locked ref; this is technically optional, but strongly recommended. What's important is that invocations intending to append to an existing database ABSOLUTELY SHOULD use locked flake references. If you want to use an unlocked reference on the first call, you can extract a locked flake reference from the database for later runs, but the official recommendation is to lock flakes before looping.
If the caller really wants to, they can pass an unlocked ref on the first invocation and pull the locked reference from the resulting database. This is potentially useful for working with local flakes when you don't want to use a utility like nix flake prefetch or parser-util to lock your references for you:
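A hedged sketch of that workflow; the exact arguments to the get subcommands are assumptions (see pkgdb get --help):

```sh
# Sketch: scrape a local, unlocked flake once...
pkgdb scrape ./my-flake packages x86_64-linux
# ...then recover the locked flake reference recorded in the database
# and reuse it for later invocations.
pkgdb get flake "$(pkgdb get db ./my-flake)"
```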
The pkgdb get {db,done,flake,id,path} subcommands expose a handful of special queries for package databases that may be useful for simple scripts. These do not provide queries for package metadata; sqlite3 is recommended for those types of queries.
Subcommands:
- pkgdb get db: get the absolute path to the Package DB for a flake
- pkgdb get done: check whether an attrset and its children have been scraped
- pkgdb get flake: get flake metadata from a Package DB
- pkgdb get id: look up an attribute set or package row id
- pkgdb get path: look up an (AttrSets|Packages).id attribute path

The pkgdb list subcommand lists all known databases and their associated flake information. It accepts the options --cachedir PATH and --json. See pkgdb list --help for more info.
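A brief sketch of using these plumbing queries (arguments are assumptions):

```sh
# Sketch: print the database path for a flake, then list known databases.
pkgdb get db github:NixOS/nixpkgs/release-23.05
pkgdb list --json
```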
The data is represented in a tree format matching the attrPath structure. The two entities are AttrSets (branches) and Packages (leaves). Packages and AttrSets each have a parentId, which always refers to a row in AttrSets. AttrSets.done (boolean) indicates that an attribute set and all of its children have been fully scraped and do not need to be reprocessed.
Descriptions are de-duplicated (for instance, between two packages for separate architectures) by a Descriptions table.
The DbVersions and LockedFlake tables store metadata about the version of pkgdb that generated the database and the flake which was scraped.
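A hedged sqlite3 sketch against this schema; only the tables and fields named above are taken from this document, anything beyond them would be an assumption:

```sh
# Sketch: query the scraped data directly. `./nixpkgs.sqlite` is the
# illustrative database path used earlier.
sqlite3 ./nixpkgs.sqlite <<'SQL'
-- How many packages were scraped?
SELECT COUNT(*) FROM Packages;
-- How many attribute sets are fully scraped (done = true)?
SELECT COUNT(*) FROM AttrSets WHERE done = 1;
SQL
```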
If they are defined explicitly, pname and version will be read from the corresponding attributes. Otherwise, they will be parsed from the name; for example, a name of hello-2.12.1 would yield pname hello and version 2.12.1. If version can be converted to a semver, it will be.
Note that the attrName for a package is the actual name in the tree.
If outputsToInstall is not defined, it will be the set of outputs up to and including "out". For example, if outputs is [ "bin" "out" "doc" ], then outputsToInstall would be [ "bin" "out" ].
Each locked flake has its own database keyed using a unique fingerprint. The separation between these databases simplifies change detection and handling of overridden inputs to flakes. These fingerprints are identical to those used by nix to create its own eval caches.
Some commands allow database paths to be explicitly set with --database, while those which act on multiple databases will place databases under the directory named by the environment variable PKGDB_CACHEDIR if it is set; otherwise the directory ${XDG_CACHE_HOME:-$HOME/.cache}/flox/pkgdb-v<SCHEMA-MAJOR> is used.
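For example, a hedged sketch of redirecting databases to a project-local cache:

```sh
# Sketch: use a custom cache directory for this shell session.
export PKGDB_CACHEDIR="$PWD/.pkgdb-cache"
pkgdb scrape github:NixOS/nixpkgs/release-23.05
pkgdb list --cachedir "$PKGDB_CACHEDIR"
```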
Because each unique locked flake has its own database, over time these databases will accumulate and require garbage collection.
At this time there is no automated garbage collection mechanism, but simply deleting your cache directory will suffice.
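For example (a sketch, assuming the default cache location described above):

```sh
# Sketch: manual "garbage collection" by removing the pkgdb cache directory.
rm -rf "${XDG_CACHE_HOME:-$HOME/.cache}/flox/pkgdb-v"*
```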
Several commands such as pkgdb search and pkgdb manifest take an option --ga-registry which changes the behavior of registry constructs to contain only a single input which provides nixpkgs=github:NixOS/nixpkgs/release-23.05.
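A hedged sketch of using the flag with pkgdb search; the query argument is an assumption, and its real format is defined by the JSON schemas in <pkgdb>/docs:

```sh
# Sketch only: run a search against the GA registry's single nixpkgs input.
# `./search-params.json` is a hypothetical query document; see the schemas
# in <pkgdb>/docs for the actual input format.
pkgdb search --ga-registry ./search-params.json
```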
When --ga-registry is provided, it is an error for users to write env-base or registry fields. In the future this flag will be removed, allowing users to set custom registries with multiple inputs or multiple branches.
For the purposes of testing we have provided an environment variable _PKGDB_GA_REGISTRY_REF_OR_REV where you can provide an alternative git ref (tag or branch name) or a long revision hash. This is used in our test suite.
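For example, a sketch (testing only; the branch name shown is illustrative):

```sh
# Sketch: point the GA registry's nixpkgs input at a different branch
# before invoking pkgdb in the test suite.
export _PKGDB_GA_REGISTRY_REF_OR_REV="nixpkgs-unstable"
```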