OMINDEX(1)                       User Commands                      OMINDEX(1)

NAME

omindex - Index static website data via the filesystem

SYNOPSIS

omindex [OPTIONS] --db DATABASE [BASEDIR] DIRECTORY

DESCRIPTION

omindex indexes static website data by reading it directly from the filesystem.

DIRECTORY is the directory to start indexing from. BASEDIR is the directory that corresponds to the base URL given with --url; it defaults to DIRECTORY.
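The relationship between BASEDIR, DIRECTORY and the base URL can be sketched in shell. This is an illustration only: every path, the database location and the URL prefix are hypothetical, url_for is a helper invented for this sketch, and it ignores any URL encoding that omindex may apply. It assumes a file's URL is the base URL followed by the file's path relative to BASEDIR.

```shell
#!/bin/sh
# Sketch only: all paths, the database location and the URL prefix
# below are hypothetical values chosen for illustration.
BASEDIR=/srv/www
DIRECTORY=/srv/www/docs
BASEURL=/site/

# Hypothetical helper: strip BASEDIR from a file path and append the
# remainder to the base URL (no URL encoding is attempted here).
url_for() {
    rel=${1#"$BASEDIR"/}
    printf '%s%s\n' "$BASEURL" "$rel"
}

url_for "$DIRECTORY/guide/intro.html"   # prints /site/docs/guide/intro.html

# The corresponding invocation, printed rather than executed:
echo omindex --db /var/lib/omega/data/default --url "$BASEURL" "$BASEDIR" "$DIRECTORY"
```

With these values only files under /srv/www/docs are indexed, but their URLs are still computed relative to /srv/www.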

OPTIONS

-d, --duplicates=ARG
       set duplicate handling: ARG can be 'ignore' or 'replace' (default: replace)

-p, --no-delete
       skip the deletion of documents corresponding to deleted files (--preserve-nonduplicates is a deprecated alias for --no-delete)

-e, --empty-docs=ARG
       how to handle documents we extract no text from: ARG can be index, warn (issue a diagnostic and index), or skip. (default: warn)

-D, --db=DATABASE
       path to database to use

-U, --url=URL
       base url BASEDIR corresponds to (default: /)

-M, --mime-type=EXT:TYPE
       assume any file with extension EXT has MIME Content-Type TYPE, instead of using libmagic (empty TYPE removes any existing mapping for EXT; other special TYPE values: 'ignore' and 'skip')

-G, --mime-type-match=GLOB:TYPE
       assume any file with leaf name matching shell wildcard pattern GLOB has MIME Content-Type TYPE (special TYPE values: 'ignore' and 'skip')

-F, --filter=M[,[T][,C]]:CMD
       process files with MIME Content-Type M using command CMD, which produces output (on stdout or in a temporary file) with format T (Content-Type or file extension; currently txt (default), html or svg) in character encoding C (default: UTF-8). E.g. -Fapplication/octet-stream:'strings -n8' or -Ftext/x-foo,,utf-16:'foo2utf16 %f %t'

--read-filters=FILE
       bulk-load --filter arguments from FILE, which should contain one such argument per line (e.g. text/x-bar:bar2txt --utf8). Lines starting with # are treated as comments and ignored.

-l, --depth-limit=LIMIT
       set recursion limit (0 = unlimited)

-f, --follow
       follow symbolic links

-i, --ignore-exclusions
       ignore meta robots tags and similar exclusions

-S, --spelling
       index data for spelling correction

-m, --max-size=N[SUFFIX]
       maximum size of file to index (in bytes or with a suffix of 'K'/'k', 'M'/'m', 'G'/'g') (default: unlimited)

--sample=SOURCE
       what to use for the stored sample of text for HTML documents - SOURCE can be 'body' or 'description' (default: 'body')

-E, --sample-size=SIZE
       maximum size for the document text sample (supports the same formats as --max-size). (default: 512)

-T, --title-size=SIZE
       maximum size for the document title (supports the same formats as --max-size). (default: 128)

-R, --retry-failed
       retry files which omindex failed to extract text from on a previous run

--opendir-sleep=SECS
       sleep for SECS seconds before opening each directory - sleeping for 2 seconds seems to reliably work around problems with indexing files on Microsoft DFS shares.

-C, --track-ctime
       track each file's ctime so we can detect changes to ownership or permissions.

--date-terms
       ignored for forward compatibility with Omega 1.5.x.

--no-date-terms
       don't index D, M and Y prefixed terms to support date range filtering using terms (we now recommend using a value slot for this instead).

-v, --verbose
       show more information about what is happening

--overwrite
       create the database anew (the default is to update if the database already exists)

-s, --stemmer=LANG
       set the stemming language (default: english).

       Possible values: arabic armenian basque catalan danish dutch earlyenglish english finnish french german german2 hungarian indonesian irish italian kraaij_pohlmann lithuanian lovins nepali norwegian porter portuguese romanian russian spanish swedish tamil turkish (pass 'none' to disable stemming)

-h, --help
       display this help and exit

-V, --version
       output version information and exit

Please report bugs at: https://xapian.org/bugs

xapian-omega 1.4.22              February 2023                      OMINDEX(1)
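As an illustration of --read-filters, the following sketch writes a filter file and prints a command that would load it. The filter lines reuse the placeholder commands from the option descriptions above (bar2txt and foo2utf16 are not real programs), and the database path and directory are hypothetical. Because each line of the file is a single --filter argument, the shell quoting used with -F on the command line is assumed to be unnecessary here.

```shell
#!/bin/sh
# Sketch only: bar2txt and foo2utf16 are placeholder commands taken
# from the option descriptions; the paths are hypothetical.
filters=$(mktemp)

cat > "$filters" <<'EOF'
# One --filter argument per line; lines starting with # are comments.
text/x-bar:bar2txt --utf8
application/octet-stream:strings -n8
text/x-foo,,utf-16:foo2utf16 %f %t
EOF

# Count the lines that are not comments.
filter_count=$(grep -c -v '^#' "$filters")
echo "filters defined: $filter_count"

# Print (rather than run) a command that would load the file:
echo omindex --db /var/lib/omega/data/default --read-filters="$filters" /srv/www
```

Keeping filters in a file avoids repeating a long series of -F options on every run and leaves room for comments explaining each mapping.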
