Flox Package Database
CRUD Operations on Nix Package Metadata
Additional documentation may be found in the <pkgdb>/docs directory. This includes JSON input/output schemas used by commands such as pkgdb search and pkgdb resolve.
Links to additional documentation may be found at the bottom of this file.
Evaluating nix expressions for an entire flake is expensive but necessary for features like package search. This tool provides a way to scrape the data from a flake once and store it in a database for later usage.
The current responsibility of the pkgdb tool extends only as far as scraping a flake and generating a database. The database should be queried using standard sqlite tools and libraries; all decisions about how and when to generate and update the database are left to the consumer.
See CONTRIBUTING.md for more information.
Build the database with the scrape subcommand:
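A minimal invocation might look like the following; the flake reference is only an example, and the exact argument syntax is best confirmed with pkgdb scrape --help.

```sh
# Scrape the default attrset of a flake; the database is written under
# `~/.cache`, keyed by the flake's fingerprint.
pkgdb scrape "github:NixOS/nixpkgs/release-23.05"
```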
By default, packages will be scraped from packages.<system>, where <system> is the current system/architecture, and stored in ~/.cache in a database named after the flake fingerprint. These can be overridden as desired:
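For example, assuming attribute-path components are accepted as positional arguments (an assumption; check pkgdb scrape --help), both the target attrset and the database location can be overridden roughly like this:

```sh
# Scrape `legacyPackages.x86_64-linux` instead of the default attrset and
# write to an explicit database file with `--database` (described below).
pkgdb scrape --database ./nixpkgs.sqlite \
  "github:NixOS/nixpkgs/release-23.05" legacyPackages x86_64-linux
```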
If the database for a given flake already exists and pkgdb is asked to re-process a package set it has already scraped, that package set will be skipped. Use --force to force an update/regeneration.
Once generated, the database can be opened and queried using sqlite3.
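For example (the Packages table is described in the schema section below):

```sh
# Resolve the database path for a flake, then query it with `sqlite3`.
db="$( pkgdb get db "github:NixOS/nixpkgs/release-23.05"; )"
sqlite3 "$db" 'SELECT COUNT(*) FROM Packages;'
```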
This utility is expected to be run multiple times if a client wishes to "fully scrape all the things" in a flake. It is a plumbing command used by a client application; we aren't particularly concerned with the repetitive strain injury a user would suffer if they tried to scrape everything in a flake interactively. Rather, we aim to do less in a single run and avoid scraping info the caller might not need for their use case.
A client application that wanted to scrape a flake completely would run something along the lines of:
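The loop below is a sketch under two assumptions: that attribute-path components are passed as positional arguments to pkgdb scrape, and that packages and legacyPackages are the roots of interest. Adjust to the real CLI as needed.

```sh
# "Scrape everything" sketch.  Use a locked reference (placeholder rev shown).
flake='github:NixOS/nixpkgs/<LOCKED-REV>'
for system in x86_64-linux aarch64-linux x86_64-darwin aarch64-darwin; do
  pkgdb scrape "$flake" packages       "$system"
  pkgdb scrape "$flake" legacyPackages "$system"
done
```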
In the example above the caller passes in a locked ref; this is technically optional, but strongly recommended. What's important is that invocations that intend to append to an existing database ABSOLUTELY SHOULD use locked flake references. If you want to use an unlocked reference on the first call, you can extract a locked flake reference from the database for later runs, but the official recommendation is to lock flakes before looping.
If the caller really wants to, they can pass an unlocked ref on the first invocation and yank the locked reference from the resulting database. This is potentially useful for working with local flakes when you don't want to use a utility like nix flake prefetch or parser-util to lock your references for you:
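A rough sketch; the exact argument and output formats of pkgdb get flake should be confirmed with its --help output:

```sh
# First run with an unlocked, local flake reference.
pkgdb scrape ./my-local-flake
# Then recover the locked reference recorded in the database for later runs.
# The output may need post-processing to extract the locked URL.
pkgdb get flake ./my-local-flake
```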
The pkgdb get {db,done,flake,id,path} subcommands expose a handful of special queries for package databases that may be useful for simple scripts. They do not cover queries over package metadata; for those, query the database directly with sqlite3.
Subcommands:
pkgdb get db     Get absolute path to Package DB for a flake
pkgdb get done   Check to see if an attrset and its children have been scraped
pkgdb get flake  Get flake metadata from Package DB
pkgdb get id     Lookup an attribute set or package row id
pkgdb get path   Lookup an (AttrSets|Packages).id attribute path

pkgdb list lists all known databases and their associated flake information. It accepts the options --cachedir PATH and --json. See pkgdb list --help for more info.
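For example (argument forms are illustrative; see the respective --help output):

```sh
pkgdb get db    "github:NixOS/nixpkgs/release-23.05"   # absolute path to the DB
pkgdb get flake "github:NixOS/nixpkgs/release-23.05"   # flake metadata
pkgdb list --json                                      # all known databases
```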
The data is represented in a tree format matching the attrPath structure. The two entities are AttrSets (branches) and Packages (leaves). Packages and AttrSets each have a parentId, which always refers to a row in AttrSets. AttrSets.done (boolean) indicates that an attribute set and all of its children have been fully scraped and do not need to be reprocessed.
Descriptions are de-duplicated (for instance between two packages for separate architectures) by a Descriptions table.
The DbVersions and LockedFlake tables store metadata about the version of pkgdb that generated the database and the flake which was scraped.
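As a sketch, the tree can be walked with ordinary SQL. The id, parentId, and done columns are described above; any other column names should be treated as assumptions about the schema.

```sh
# Count packages directly attached to each attrset.
db="$( pkgdb get db "github:NixOS/nixpkgs/release-23.05"; )"
sqlite3 "$db" '
  SELECT a.id, COUNT(p.id) AS nPackages
  FROM AttrSets a
  LEFT JOIN Packages p ON p.parentId = a.id
  GROUP BY a.id;
'
```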
If they are defined explicitly, pname and version will be read from the corresponding attributes. Otherwise, they will be parsed from the name. If version can be converted to a semver, it will be.
Note that the attrName for a package is the actual name in the tree.
If outputsToInstall is not defined, it will be the set of outputs up to and including "out".
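For example, a hedged query over the name-related fields described above (treat the exact column names as assumptions):

```sh
# Inspect how names were parsed into attrName / pname / version.
sqlite3 "$( pkgdb get db "github:NixOS/nixpkgs/release-23.05"; )" \
  'SELECT attrName, pname, version FROM Packages LIMIT 5;'
```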
Each locked flake has its own database keyed using a unique fingerprint. The separation between these databases simplifies change detection and handling of overridden inputs to flakes. These fingerprints are identical to those used by nix to create its own eval caches.
Some commands allow the database path to be set explicitly with --database. Commands that act on multiple databases place their databases under the directory given by the environment variable PKGDB_CACHEDIR if it is set; otherwise the directory ${XDG_CACHE_HOME:-$HOME/.cache}/flox/pkgdb-v<SCHEMA-MAJOR> is used.
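For example, to point multi-database commands at a scratch cache directory, or to inspect the default location:

```sh
# List databases found in an alternate cache directory.
PKGDB_CACHEDIR=/tmp/pkgdb-cache pkgdb list
# Without PKGDB_CACHEDIR, databases live under the versioned default directory.
ls "${XDG_CACHE_HOME:-$HOME/.cache}/flox/pkgdb-v"*
```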
Because each unique locked flake has its own database, over time these databases will accumulate and require garbage collection.
At this time there is no automated garbage collection mechanism, but simply deleting your cache directory will suffice.
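For example:

```sh
# Remove the default cache directory (or "$PKGDB_CACHEDIR" if you set one);
# databases can be regenerated later by re-running `pkgdb scrape`.
rm -rf "${XDG_CACHE_HOME:-$HOME/.cache}/flox/pkgdb-v"*
```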
Several commands, such as pkgdb search and pkgdb manifest, take an option --ga-registry which restricts registry constructs to a single input providing nixpkgs=github:NixOS/nixpkgs/release-23.05.
When --ga-registry is provided, it is an error for users to write env-base or registry fields.
In the future this flag will be removed, allowing users to set custom registries with multiple inputs or multiple branches.
For the purposes of testing we provide an environment variable, _PKGDB_GA_REGISTRY_REF_OR_REV, which may be set to an alternative git ref (tag or branch name) or a long revision hash. This is used in our test suite.
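A hypothetical test invocation; the search-parameters placeholder stands in for the JSON query documented in the <pkgdb>/docs directory, and the branch name is only an example of an alternative ref:

```sh
# Pin the GA registry's nixpkgs to a different branch for a test run.
_PKGDB_GA_REGISTRY_REF_OR_REV="release-23.11" \
  pkgdb search --ga-registry '<SEARCH-PARAMS-JSON>'
```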