KHTTP_PARSE(3)        DragonFly Library Functions Manual        KHTTP_PARSE(3)
NAME
     khttp_parse, khttp_parsex - parse a CGI instance for kcgi
LIBRARY
     library "libkcgi"
SYNOPSIS
     #include <stdint.h>
     #include <kcgi.h>
     enum kcgi_err
     khttp_parse(struct kreq *req, const struct kvalid *keys, size_t keysz,
         const char *const *pages, size_t pagesz, size_t defpage);
     enum kcgi_err
     khttp_parsex(struct kreq *req, const struct kmimemap *suffixes,
         const char *const *mimes, size_t mimemax, const struct kvalid *keys,
         size_t keysz, const char *const *pages, size_t pagesz,
         size_t defmime, size_t defpage, void *arg,
         void (*argfree)(void *arg), unsigned int debugging);
     extern const char *const kmimetypes[KMIME__MAX];
     extern const char *const khttps[KHTTP__MAX];
     extern const char *const kschemes[KSCHEME__MAX];
     extern const char *const kresps[KRESP__MAX];
     extern const char *const kmethods[KMETHOD__MAX];
     extern const struct kmimemap ksuffixmap[];
     extern const char *const ksuffixes[KMIME__MAX];
DESCRIPTION
     The khttp_parse and khttp_parsex functions parse and validate input and
     the HTTP environment (compression, paths, MIME types, and so on).  They
     are the central functions in the kcgi(3) library, parsing and validating
     key-value form (query string, message body, cookie) data and opaque
     message bodies.
     The collective arguments are as follows:
     arg     A pointer to private application data.  It is not touched unless
             argfree is provided.
     argfree
             Function invoked with arg by the child process starting to parse
             untrusted network data.  This makes sure that no unnecessary data
             is leaked into the child.
     debugging
             This bit-field sets debugging of the underlying parse and/or
             write routines.  Debugging messages are sent to stderr and
             consist of the process ID, a colon, then the logged data.  Logged
             data consists of printable ASCII characters and spaces.  A
             newline will flush the existing line.  There are at most BUFSIZ
             characters per line.  Other characters are either escaped (\v,
             \r, \b) or replaced with a question mark.  If the
             KREQ_DEBUG_WRITE bit is set, write operations directly or
             indirectly via khttp_write(3) will be logged.  When the request
             is torn down with khttp_free(3), the process ID and total logged
             bytes are printed on their own line.  If the KREQ_DEBUG_READ_BODY
             bit is set, the entire input body is logged.  The total byte
             count is printed on its own line afterward.
     defmime
             If no MIME type is specified (that is, there's no suffix to the
             page request), use this index in the mimes array.
     defpage
             If no page was specified (e.g., the default landing page), this
             is provided as the requested page index.
     keys    An array of input and validation fields.
     keysz   The number of elements in keys.
     mimemax
             The MIME index used if no MIME type was matched.
     mimes   An array of MIME types (e.g., "text/html"), mapped into a MIME
             index during MIME body parsing.  This relates both to pages and
             input fields with a body type.
     pages   An array of recognised pathnames.  When pathnames are parsed,
             they're matched to indices in this array.
     pagesz  The number of elements in pages.  Also used as the page index if
             the requested page was not found in pages.
     req     Fill with input fields and HTTP context parsed from the CGI
             environment.  This is the main structure carried around in a
             kcgi(3) application.
     suffixes
             Define the MIME type (suffix) mapping.
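
     The escaping rule described for the debugging argument can be sketched
     per byte as follows.  This is only an illustration of the documented
     behaviour, not the code used by kcgi(3): the function name debug_escape
     is hypothetical, newline handling (flushing the line) is left to the
     caller, and bytes not listed in the description, such as a tab, are
     assumed to become a question mark.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Write the logged form of one input byte into out and return the
 * number of characters written (not counting the nil terminator).
 * Hypothetical helper; it mirrors only what the manual describes.
 */
size_t
debug_escape(unsigned char c, char out[3])
{
	assert(out != NULL);
	out[0] = out[1] = out[2] = '\0';

	/* Printable ASCII and spaces are logged as-is. */
	if (c >= 0x20 && c < 0x7f) {
		out[0] = (char)c;
		return 1;
	}

	/* The three escapes named in the manual. */
	switch (c) {
	case '\v':
		out[0] = '\\'; out[1] = 'v';
		return 2;
	case '\r':
		out[0] = '\\'; out[1] = 'r';
		return 2;
	case '\b':
		out[0] = '\\'; out[1] = 'b';
		return 2;
	default:
		break;
	}

	/* Everything else is replaced (assumption for unlisted bytes). */
	out[0] = '?';
	return 1;
}
```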
     The first form, khttp_parse, is for applications using the system-
     recognised MIME types.  This should work well enough for most
     applications.  It is equivalent to invoking the second form,
     khttp_parsex, as follows:
           khttp_parsex(req, ksuffixmap,
             kmimetypes, KMIME__MAX, keys, keysz,
             pages, pagesz, KMIME_TEXT_HTML,
             defpage, NULL, NULL, 0);
     The req object filled in by khttp_parse or khttp_parsex must be
     subsequently freed by khttp_free.
   Types
     A struct kreq object is filled in by khttp_parse and khttp_parsex.  It
     consists of the following fields:
     arg     Private application data.  This is set during khttp_parse().
     auth    Type of "managed" HTTP authorisation, if any.  This is digest
             (KAUTH_DIGEST) or basic (KAUTH_BASIC) authorisation performed by
             the web server.  See the rawauth field for raw authorisation
             requests.  If a managed authorisation is specified but with
             unknown type (i.e., not digest or basic authentication), this is
             set to KAUTH_UNKNOWN.
     cookies
             All key-value pairs read from user cookies.
     cookiemap
             Entries in successfully-parsed (or un-parsed) cookies mapped into
             field indices as defined by the keys argument to khttp_parse().
     cookienmap
             Entries in unsuccessfully-parsed (but still attempted) cookies
             mapped into field indices as defined by the keys argument to
             khttp_parse().
     cookiesz
             The size of the cookies array.
     fields  All key-value pairs read from the requests (query string,
             cookies, message body).
     fieldmap
             Entries in successfully-parsed (or un-parsed) fields mapped into
             field indices as defined by the keys argument to khttp_parse().
     fieldnmap
             Entries in unsuccessfully-parsed (but still attempted) fields
             mapped into field indices as defined by the keys argument to
             khttp_parse().
     fieldsz
             The number of elements in the fields array.
     fullpath
             The full path following the server name or NULL if there is no
             path following the server.  For example, if foo.cgi/bar/baz is
             the PATH_INFO, this would be /bar/baz.
     host    The host name (i.e., the host of the web application) passed to
             the application with the request.  This shouldn't be confused
             with the application host's canonical name.
     method  The KMETHOD_ACL, KMETHOD_CONNECT, KMETHOD_COPY, KMETHOD_DELETE,
             KMETHOD_GET, KMETHOD_HEAD, KMETHOD_LOCK, KMETHOD_MKCALENDAR,
             KMETHOD_MKCOL, KMETHOD_MOVE, KMETHOD_OPTIONS, KMETHOD_POST,
             KMETHOD_PROPFIND, KMETHOD_PROPPATCH, KMETHOD_PUT, KMETHOD_REPORT,
             KMETHOD_TRACE, or KMETHOD_UNLOCK submission method.  If the
             method was not understood, KMETHOD__MAX is used.  If no method
             was used, the default is KMETHOD_GET.
             Note: applications will usually accept only KMETHOD_GET and
             KMETHOD_POST, so be sure to emit a KHTTP_405 status for non-
             conforming methods.
     kdata   Internal data.  Should not be touched.
     keys    Value passed to khttp_parse().
     keysz   Value passed to khttp_parse().
     mime    The MIME type of the requested file as determined by its suffix
             matched to the suffixes map passed to khttp_parsex() or the
             default ksuffixmap if using khttp_parse().  This defaults to the
             mimemax value passed to khttp_parsex() or the default KMIME__MAX
             if using khttp_parse() when no suffix is specified or when the
             suffix is specified but not known.
     page    The page index as defined by the pages array passed to
             khttp_parse() and parsed from the requested file.  This is the
             first path component!  The default page provided to khttp_parse()
             is used if no path was specified or pagesz if the path failed
             lookup.
     pagename
             The string corresponding to page.
     port    The server's receiving TCP port.
     path    The path (or empty string) following the parsed page component,
             regardless of whether that component was located in the pages
             array provided to khttp_parse().  For example, if the PATH_INFO
             is foo.cgi/bar/baz.html, the path component would be baz (with
             the leading slash and the suffix stripped).
     pname   The script name (which may be an empty string in degenerate
             cases) passed to the server.  This may not reflect a file-system
             entity if re-written by the web server.
     rawauth
             If the web server passes the "Authorization" header (which, for
             example, Apache doesn't by default), then the header is parsed
             into this field, which is of type struct khttpauth.
     remote  The string form of the client's IPv4 or IPv6 address.
     reqmap  Mapping of enum krequ enumeration values to reqs parsed from the
             input stream.
     reqs    List of all HTTP request headers, known via enum krequ and not
             known, parsed from the input stream.
     reqsz   Number of request headers in reqs.
     scheme  The access scheme, which is either KSCHEME_HTTP or KSCHEME_HTTPS.
             The scheme defaults to KSCHEME_HTTP if not specified by the
             request.
     suffix  The suffix part of the PATH_INFO or NULL if none exists.  For
             example, if the PATH_INFO is foo.cgi/bar/baz.html, the suffix
             would be html.  See the mime field for the MIME type parsed from
             the suffix.
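
     How the page, path, and suffix fields relate to one another can be
     sketched as below.  This is an illustration under assumptions, not the
     parser used by kcgi(3): struct pathparts, split_path, and lookup_page
     are hypothetical names, the input is the part of PATH_INFO following the
     script name (e.g., /bar/baz.html), and the buffer sizes are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for the split result; not a kcgi type. */
struct pathparts {
	char page[64];		/* first path component */
	char path[256];		/* remainder after the first component */
	char suffix[32];	/* text after the last '.', if any */
};

/*
 * Split e.g. "/bar/baz.html" into page "bar", path "baz", suffix "html".
 * Returns 0 on success or -1 if a component does not fit its buffer.
 */
int
split_path(const char *in, struct pathparts *pp)
{
	char buf[512];
	char *dot, *slash;

	assert(in != NULL && pp != NULL);
	memset(pp, 0, sizeof(*pp));
	if (*in == '/')
		in++;
	if (strlen(in) >= sizeof(buf))
		return -1;
	strcpy(buf, in);

	/* The suffix follows the last '.', unless a '/' comes after it. */
	dot = strrchr(buf, '.');
	if (dot != NULL && strchr(dot, '/') == NULL) {
		if (strlen(dot + 1) >= sizeof(pp->suffix))
			return -1;
		strcpy(pp->suffix, dot + 1);
		*dot = '\0';
	}

	/* The first component is the page; the rest is the path. */
	slash = strchr(buf, '/');
	if (slash != NULL) {
		*slash++ = '\0';
		if (strlen(slash) >= sizeof(pp->path))
			return -1;
		strcpy(pp->path, slash);
	}
	if (strlen(buf) >= sizeof(pp->page))
		return -1;
	strcpy(pp->page, buf);
	return 0;
}

/* Map a page name to its index: defpage if empty, pagesz if unknown. */
size_t
lookup_page(const char *name, const char *const *pages,
    size_t pagesz, size_t defpage)
{
	size_t i;

	if (*name == '\0')
		return defpage;
	for (i = 0; i < pagesz; i++)
		if (strcmp(pages[i], name) == 0)
			break;
	return i;
}
```

     An application would typically switch on the resulting index and treat
     a value equal to pagesz as "page not found".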
     The application may optionally define keys provided to khttp_parse and
     khttp_parsex as an array of struct kvalid.  This structure is central to
     the validation of input data.  It consists of the following fields:
     name    The field name, i.e., how it appears in the HTML form input name.
             This cannot be NULL.  If the field name is an empty string and
             the HTTP message consists of an opaque body (and not key-value
             pairs), then that field will be used to validate the HTTP message
             body.  This is useful for KMETHOD_PUT style requests.
     valid   Validating function.  This function accepts a single struct kpair
             * argument and returns an int.  If the function is NULL, then no
             validation is performed and the data is considered as always
             valid.  If you provide your own valid function, it must set the
             type and parsed variables in the key-value pair.  You can also
             allocate new memory for the val and thus valsz: if the value of
             val changes during your validation, the new value will be freed
             with free(3) after being passed out of the sandbox.  Note: these
             functions are invoked from within a system-specific sandbox.  You
             should assume that you cannot invoke any "invasive" system calls
             such as opening files, sockets, etc.  In other words, these must
             be pure computation.
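
     A validation function in the spirit described above might look like the
     following sketch.  To keep it compilable on its own, it uses reduced,
     hypothetical stand-ins (struct pair, enum ftype) rather than the real
     struct kpair from <kcgi.h>, and it assumes that a non-zero return value
     means "valid".  It performs only computation, as required inside the
     sandbox.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; the real types live in <kcgi.h>. */
enum ftype { FT_INTEGER, FT_STRING, FT_DOUBLE, FT__MAX };

struct pair {
	const char	*val;	/* nil-terminated input value */
	size_t		 valsz;	/* true length of val */
	enum ftype	 type;	/* set by the validator */
	union {
		int64_t		 i;
		const char	*s;
		double		 d;
	} parsed;		/* set by the validator */
};

/*
 * Accept an integer in [1, 100] as parsed by strtoll(3).
 * Returns non-zero if valid (assumed convention), zero otherwise.
 */
int
valid_percent(struct pair *p)
{
	char *ep;
	long long v;

	assert(p != NULL && p->val != NULL);

	/* Reject empty input and binary data with embedded nils. */
	if (p->valsz == 0 || strlen(p->val) != p->valsz)
		return 0;

	errno = 0;
	v = strtoll(p->val, &ep, 10);
	if (errno != 0 || ep == p->val || *ep != '\0')
		return 0;
	if (v < 1 || v > 100)
		return 0;

	/* Record both the type and the parsed value on success. */
	p->type = FT_INTEGER;
	p->parsed.i = v;
	return 1;
}
```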
     The struct kpair structure presents the user with fields parsed from
     input and (possibly) matched to the keys variable passed to khttp_parse
     and khttp_parsex.  It is also passed to the validation function to be
     filled in.  In this case, the MIME-related fields are already filled in
     and may be examined to determine the method of validation.  This is
     useful when validating opaque message bodies.
     ctype   The value's MIME content type (e.g., image/jpeg), or NULL if not
             defined.
     ctypepos
             If ctype is not NULL, it is looked up in the mimes parameter
             passed to khttp_parsex or kmimetypes if using khttp_parse.  If
             found, this is set to the appropriate index.  Otherwise, it's
             mimemax.
     file    The value's MIME source filename or NULL if not defined.
     key     The nil-terminated key (input) name.  If the HTTP message body is
             opaque (e.g., KMETHOD_PUT), then an empty-string key is cooked
             up.
     keypos  If looked up in the keys variable passed to khttp_parse, the
             index of the looked-up key.  Otherwise keysz.
     next    In a cookie or field map, next points to the next parsed key-
             value pair with the same key name.  This occurs most often in
             HTML checkbox forms, where many fields may have the same name.
     parsed  The parsed, validated value.  This may be integer, for a 64-bit
             signed integer; string, for a nil-terminated character string; or
             double, for a double-precision floating-point number.  This is
             intentionally basic because the resulting data must be reliably
             passed from the parsing context back into the web application.
     state   The validation state: whether validated by a parse, invalidated
             by a parse, or non-validated (unparsed).
     type    If parsed, the type of data in parsed, otherwise KFIELD__MAX.
     val     The (input) value, which is always nil-terminated, but if the
             data is binary, nil terminators may occur before the true data
             length of valsz.
     valsz   The true length of val.
     xcode   The value's MIME content transfer encoding (e.g., base64), or
             NULL if not defined.
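
     Two points from the list above are easy to get wrong: repeated keys are
     chained through next, and val may hold nil bytes before valsz.  The
     sketch below shows both with a reduced, hypothetical struct pair that
     carries only the members it needs; it is not the real struct kpair, and
     the helper names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in carrying a subset of the documented members. */
struct pair {
	const char	*key;	/* nil-terminated key name */
	const char	*val;	/* nil-terminated, possibly binary */
	size_t		 valsz;	/* true length of val */
	struct pair	*next;	/* next pair with the same key, or NULL */
};

/* Count the values sharing one key by following next. */
size_t
pair_count(const struct pair *p)
{
	size_t n = 0;

	for (; p != NULL; p = p->next)
		n++;
	return n;
}

/*
 * Return 1 if val holds a nil byte before valsz, in which case
 * strlen(3) would under-report its length; 0 otherwise.
 */
int
pair_is_binary(const struct pair *p)
{
	assert(p != NULL && p->val != NULL);
	return memchr(p->val, '\0', p->valsz) != NULL;
}
```

     For binary values, copy or compare with valsz (e.g., memcpy(3)) rather
     than relying on string functions.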
     The struct khttpauth structure holds authorisation data if passed by the
     server.  If no data was passed by the server, the type value is
     KAUTH_NONE.  Otherwise it's KAUTH_BASIC or KAUTH_DIGEST, with
     KAUTH_UNKNOWN if the authorisation type was not recognised.  The specific
     fields are as follows.
     authorised
             For KAUTH_BASIC or KAUTH_DIGEST authorisation, this field
             indicates whether all required values were specified.
     d       A union containing parsed fields per type: basic for KAUTH_BASIC
             or digest for KAUTH_DIGEST.
     If the field for an HTTP authorisation request is KAUTH_BASIC, it will
     consist of the following for its parsed entities in its struct khttpbasic
     structure:
     response
             The hashed and encoded response string.
     If the field for an HTTP authorisation request is KAUTH_DIGEST, it will
     consist of the following in its struct khttpdigest structure:
     alg     The encoding algorithm, parsed from the possible MD5 or MD5-Sess
             values.
     qop     The quality of protection algorithm, which may be unspecified,
             Auth or Auth-Init.
     user    The user coordinating the request.
     uri     The URI for which the request is designated.  (This must match
             the request URI).
     realm   The request realm.
     nonce   The server-generated nonce value.
     cnonce  The (optional) client-generated nonce value.
     response
             The hashed and encoded response string, which entangles fields
             depending on algorithm and quality of protection.
     count   The (optional) cnonce counter.
     opaque  The (optional) opaque string requested by the server.
     Lastly, the struct khead structure holds parsed HTTP headers.
     key     Holds the HTTP header name.  This is not the CGI header name
             (e.g., HTTP_COOKIE), but the reconstituted HTTP name (e.g.,
             Cookie).
     val     The opaque header value, which may be an empty string.
   Variables
     A number of variables are defined in <kcgi.h> to simplify invocations of
     the khttp_parse family.  Applications are strongly encouraged to use
     these variables (and associated enumerations) in khttp_parse instead of
     overriding them with hand-rolled sets in khttp_parsex.
     kmimetypes
             Indexed list of common MIME types, for example, "text/html" and
             "application/json".  Corresponds to enum kmime.
     khttps  Indexed list of HTTP status code and identifier, for example,
             "200 OK".  Corresponds to enum khttp.
     kschemes
             Indexed list of URL schemes, for example, "https" or "ftp".
             Corresponds to enum kscheme.
     kresps  Indexed list of header response names, for example,
             "Cache-Control" or "Content-Length".  Corresponds to enum kresp.
     kmethods
             Indexed list of HTTP methods, for example, "GET" and "POST".
             Corresponds to enum kmethod.
     ksuffixmap
             Map of MIME types defined in enum kmime to possible suffixes.
             This array is terminated with a MIME type of KMIME__MAX and name
             NULL.
     ksuffixes
             Indexed list of canonical suffixes for MIME types corresponding
             to enum kmime.  Note: this may be a NULL pointer for types that
             have no canonical suffix, for example, "application/octet-stream".
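
     A suffix map terminated by a NULL name, as ksuffixmap is, can be searched
     with a simple loop.  The sketch below uses hypothetical stand-ins (enum
     mime, struct mimemap, suffixmap) so that it compiles without <kcgi.h>;
     it assumes a case-sensitive comparison and a non-NULL suffix, and it
     returns the caller's mimemax value when nothing matches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for enum kmime and struct kmimemap. */
enum mime { MIME_HTML, MIME_JSON, MIME__MAX };

struct mimemap {
	const char	*name;	/* suffix without the dot, NULL ends the map */
	size_t		 mime;	/* index into the MIME type array */
};

/* Several suffixes may map to one type; the last entry is the sentinel. */
const struct mimemap suffixmap[] = {
	{ "html", MIME_HTML },
	{ "htm", MIME_HTML },
	{ "json", MIME_JSON },
	{ NULL, MIME__MAX },
};

/* Return the MIME index for suffix, or mimemax if it is not in map. */
size_t
suffix_to_mime(const char *suffix, const struct mimemap *map, size_t mimemax)
{
	assert(suffix != NULL && map != NULL);
	for (; map->name != NULL; map++)
		if (strcmp(map->name, suffix) == 0)
			return map->mime;
	return mimemax;
}
```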
RETURN VALUES
     khttp_parse and khttp_parsex return an error code:
     KCGI_OK
          Success (not an error).
     KCGI_ENOMEM
          Memory failure.  This can occur in many places: spawning a child,
          allocating memory, creating sockets, etc.
     KCGI_ENFILE
          Could not allocate file descriptors.
     KCGI_EAGAIN
          Could not spawn a child.
     KCGI_FORM
          Malformed data between parent and child whilst parsing an HTTP
          request.  (Internal system error.)
     KCGI_SYSTEM
          Opaque operating system error.
     On failure, the calling application should terminate as soon as possible.
     Applications should not try to write an HTTP 505 error or similar, but
     allow the web server to handle the empty CGI response on its own.
SEE ALSO
     kcgi(3), khttp_free(3)
AUTHORS
     The khttp_parse and khttp_parsex functions were written by Kristaps
     Dzonsons <kristaps@bsd.lv>.
DragonFly 6.5-DEVELOPMENT       January 4, 2016      DragonFly 6.5-DEVELOPMENT