KHTTP_PARSE(3)        DragonFly Library Functions Manual        KHTTP_PARSE(3)

NAME

khttp_parse, khttp_parsex - parse a CGI instance for kcgi

LIBRARY

library "libkcgi"

SYNOPSIS

#include <stdint.h>
#include <kcgi.h>

enum kcgi_err
khttp_parse(struct kreq *req, const struct kvalid *keys, size_t keysz,
    const char *const *pages, size_t pagesz, size_t defpage);

enum kcgi_err
khttp_parsex(struct kreq *req, const struct kmimemap *suffixes,
    const char *const *mimes, size_t mimemax, const struct kvalid *keys,
    size_t keysz, const char *const *pages, size_t pagesz, size_t defmime,
    size_t defpage, void *arg, void (*argfree)(void *arg),
    unsigned int debugging);

extern const char *const kmimetypes[KMIME__MAX];
extern const char *const khttps[KHTTP__MAX];
extern const char *const kschemes[KSCHEME__MAX];
extern const char *const kresps[KRESP__MAX];
extern const char *const kmethods[KMETHOD__MAX];
extern const struct kmimemap ksuffixmap[];
extern const char *const ksuffixes[KMIME__MAX];

DESCRIPTION

The khttp_parse and khttp_parsex functions parse and validate input and the HTTP environment (compression, paths, MIME types, and so on). They are the central functions of the kcgi(3) library, parsing and validating key-value form (query string, message body, cookie) data and opaque message bodies.

The collective arguments are as follows:

arg
    A pointer to private application data. It is not touched unless argfree is provided.
argfree
    Function invoked with arg by the child process when it starts to parse untrusted network data. This makes sure that no unnecessary data is leaked into the child.
debugging
    This bit-field sets debugging of the underlying parse and/or write routines. Debugging messages are sent to stderr and consist of the process ID, a colon, then the logged data. Logged data consists of printable ASCII characters and spaces. A newline will flush the existing line. There are at most BUFSIZ characters per line. Other characters are either escaped (\v, \r, \b) or replaced with a question mark.
    If the KREQ_DEBUG_WRITE bit is set, write operations made directly or indirectly via khttp_write(3) are logged. When the request is torn down with khttp_free(3), the process ID and total logged bytes are printed on their own line.
    If the KREQ_DEBUG_READ_BODY bit is set, the entire input body is logged. The total byte count is printed on its own line afterward.
defmime
    If no MIME type is specified (that is, there's no suffix to the page request), use this index in the mimes array.
defpage
    If no page was specified (e.g., the default landing page), this is provided as the requested page index.
keys
    An array of input and validation fields.
keysz
    The number of elements in keys.
mimemax
    The MIME index used if no MIME type was matched.
mimes
    An array of MIME types (e.g., "text/html"), mapped into a MIME index during MIME body parsing. This relates both to pages and input fields with a body type.
pages
    An array of recognised pathnames.
    When pathnames are parsed, they're matched to indices in this array.
pagesz
    The number of pages in pages. Also used as the page index if the requested page was not in pages.
req
    Filled with input fields and HTTP context parsed from the CGI environment. This is the main structure carried around in a kcgi(3) application.
suffixes
    Defines the MIME type (suffix) mapping.

The first form, khttp_parse, is for applications using the system-recognised MIME types. This should work well enough for most applications. It is equivalent to invoking the second form, khttp_parsex, as follows:

    khttp_parsex(req, ksuffixmap, kmimetypes, KMIME__MAX,
        keys, keysz, pages, pagesz, KMIME_TEXT_HTML,
        defpage, NULL, NULL, 0);

The req object filled in by khttp_parse or khttp_parsex must be subsequently freed by khttp_free.

Types

A struct kreq object is filled in by khttp_parse and khttp_parsex. It consists of the following fields:

arg
    Private application data. This is set during khttp_parse().
auth
    Type of "managed" HTTP authorisation, if any. This is digest (KAUTH_DIGEST) or basic (KAUTH_BASIC) authorisation performed by the web server. See the rawauth field for raw authorisation requests. If a managed authorisation is specified but with unknown type (i.e., not digest or basic authentication), this is set to KAUTH_UNKNOWN.
cookies
    All key-value pairs read from user cookies.
cookiemap
    Entries in successfully-parsed (or un-parsed) cookies mapped into field indices as defined by the keys argument to khttp_parse().
cookienmap
    Entries in unsuccessfully-parsed (but still attempted) cookies mapped into field indices as defined by the keys argument to khttp_parse().
cookiesz
    The number of elements in the cookies array.
fields
    All key-value pairs read from the request (query string, cookies, message body).
fieldmap
    Entries in successfully-parsed (or un-parsed) fields mapped into field indices as defined by the keys argument to khttp_parse().
fieldnmap
    Entries in unsuccessfully-parsed (but still attempted) fields mapped into field indices as defined by the keys argument to khttp_parse().
fieldsz
    The number of elements in the fields array.
fullpath
    The full path following the server name or NULL if there is no path following the server. For example, if foo.cgi/bar/baz is the PATH_INFO, this would be /bar/baz.
host
    The host name requested by the client (i.e., the host of the web application), as passed to the application. This shouldn't be confused with the application host's canonical name.
method
    The KMETHOD_ACL, KMETHOD_CONNECT, KMETHOD_COPY, KMETHOD_DELETE, KMETHOD_GET, KMETHOD_HEAD, KMETHOD_LOCK, KMETHOD_MKCALENDAR, KMETHOD_MKCOL, KMETHOD_MOVE, KMETHOD_OPTIONS, KMETHOD_POST, KMETHOD_PROPFIND, KMETHOD_PROPPATCH, KMETHOD_PUT, KMETHOD_REPORT, KMETHOD_TRACE, or KMETHOD_UNLOCK submission method. If the method was not understood, KMETHOD__MAX is used. If no method was specified, the default is KMETHOD_GET.
    Note: applications will usually accept only KMETHOD_GET and KMETHOD_POST, so be sure to emit a KHTTP_405 status for non-conforming methods.
kdata
    Internal data. Should not be touched.
keys
    Value passed to khttp_parse().
keysz
    Value passed to khttp_parse().
mime
    The MIME type of the requested file as determined by its suffix matched to the suffixes map passed to khttp_parsex() or the default ksuffixmap if using khttp_parse(). This defaults to the mimemax value passed to khttp_parsex() or the default KMIME__MAX if using khttp_parse() when no suffix is specified or when the suffix is specified but not known.
page
    The page index as defined by the pages array passed to khttp_parse() and parsed from the requested file. This is the first path component. The default page provided to khttp_parse() is used if no path was specified, or pagesz if the path failed lookup.
pagename
    The string corresponding to page.
port
    The server's receiving TCP port.
path
    The path (or empty string) following the parsed page component, regardless of whether that component was located in the pages array provided to khttp_parse(). For example, if the PATH_INFO is foo.cgi/bar/baz.html, the path component would be baz (with the leading slash stripped).
pname
    The script name (which may be an empty string in degenerate cases) passed to the server. This may not reflect a file-system entity if re-written by the web server.
rawauth
    If the web server passes the "Authorization" header (which, for example, Apache doesn't by default), then the header is parsed into this field, which is of type struct khttpauth.
remote
    The string form of the client's IPv4 or IPv6 address.
reqmap
    Mapping of enum krequ enumeration values to reqs parsed from the input stream.
reqs
    List of all HTTP request headers, both those known via enum krequ and those not known, parsed from the input stream.
reqsz
    Number of request headers in reqs.
scheme
    The access scheme, which is either KSCHEME_HTTP or KSCHEME_HTTPS. The scheme defaults to KSCHEME_HTTP if not specified by the request.
suffix
    The suffix part of the PATH_INFO or NULL if none exists. For example, if the PATH_INFO is foo.cgi/bar/baz.html, the suffix would be html. See the mime field for the MIME type parsed from the suffix.

The application may optionally define keys provided to khttp_parse and khttp_parsex as an array of struct kvalid. This structure is central to the validation of input data. It consists of the following fields:

name
    The field name, i.e., how it appears in the HTML form input name. This cannot be NULL. If the field name is an empty string and the HTTP message consists of an opaque body (and not key-value pairs), then that field will be used to validate the HTTP message body. This is useful for KMETHOD_PUT style requests.
valid
    Validating function. This function accepts a single struct kpair * argument and returns an int. If the function is NULL, then no validation is performed and the data is considered as always valid.
    If you provide your own valid function, it must set the type and parsed variables in the key-value pair. You can also allocate new memory for the val and thus valsz: if the value of val changes during your validation, the new value will be freed with free(3) after being passed out of the sandbox.
    Note: these functions are invoked from within a system-specific sandbox. You should assume that you cannot invoke any "invasive" system calls such as opening files, sockets, etc. In other words, these must be pure computation.

The struct kpair structure presents the user with fields parsed from input and (possibly) matched to the keys variable passed to khttp_parse and khttp_parsex. It is also passed to the validation function to be filled in. In this case, the MIME-related fields are already filled in and may be examined to determine the method of validation. This is useful when validating opaque message bodies.

ctype
    The value's MIME content type (e.g., image/jpeg), or NULL if not defined.
ctypepos
    If ctype is not NULL, it is looked up in the mimes parameter passed to khttp_parsex or kmimetypes if using khttp_parse. If found, it is set to the appropriate index. Otherwise, it's mimemax.
file
    The value's MIME source filename or NULL if not defined.
key
    The nil-terminated key (input) name. If the HTTP message body is opaque (e.g., KMETHOD_PUT), then an empty-string key is cooked up.
keypos
    If looked up in the keys variable passed to khttp_parse, the index of the looked-up key. Otherwise keysz.
next
    In a cookie or field map, next points to the next parsed key-value pair with the same key name. This occurs most often in HTML checkbox forms, where many fields may have the same name.
parsed
    The parsed, validated value. These may be integer, for a 64-bit signed integer; string, for a nil-terminated character string; or double, for a double-precision floating-point number.
    This is intentionally basic because the resulting data must be reliably passed from the parsing context back into the web application.
state
    The validation state: whether validated by a parse, invalidated by a parse, or non-validated (unparsed).
type
    If parsed, the type of data in parsed, otherwise KFIELD__MAX.
val
    The (input) value, which is always nil-terminated, but if the data is binary, nil terminators may occur before the true data length of valsz.
valsz
    The true length of val.
xcode
    The value's MIME content transfer encoding (e.g., base64), or NULL if not defined.

The struct khttpauth structure holds authorisation data if passed by the server. If no data was passed by the server, the type value is KAUTH_NONE. Otherwise it's KAUTH_BASIC or KAUTH_DIGEST, with KAUTH_UNKNOWN if the authorisation type was not recognised. The specific fields are as follows.

authorised
    For KAUTH_BASIC or KAUTH_DIGEST authorisation, this field indicates whether all required values were specified.
d
    A union containing parsed fields per type: basic for KAUTH_BASIC or digest for KAUTH_DIGEST.

If the type of an HTTP authorisation request is KAUTH_BASIC, its parsed entities are held in a struct khttpbasic structure, which consists of the following:

response
    The hashed and encoded response string.

If the type of an HTTP authorisation request is KAUTH_DIGEST, its parsed entities are held in a struct khttpdigest structure, which consists of the following:

alg
    The encoding algorithm, parsed from the possible MD5 or MD5-Sess values.
qop
    The quality of protection algorithm, which may be unspecified, Auth or Auth-Int.
user
    The user coordinating the request.
uri
    The URI for which the request is designated. (This must match the request URI.)
realm
    The request realm.
nonce
    The server-generated nonce value.
cnonce
    The (optional) client-generated nonce value.
response
    The hashed and encoded response string, which entangles fields depending on algorithm and quality of protection.
count
    The (optional) cnonce counter.
opaque
    The (optional) opaque string requested by the server.

Lastly, the struct khead structure holds parsed HTTP headers.

key
    Holds the HTTP header name. This is not the CGI header name (e.g., HTTP_COOKIE), but the reconstituted HTTP name (e.g., Cookie).
val
    The opaque header value, which may be an empty string.

Variables

A number of variables are defined in <kcgi.h> to simplify invocations of the khttp_parse family. Applications are strongly encouraged to use these variables (and associated enumerations) in khttp_parse instead of overriding them with hand-rolled sets in khttp_parsex.

kmimetypes
    Indexed list of common MIME types, for example, "text/html" and "application/json". Corresponds to enum kmime.
khttps
    Indexed list of HTTP status codes and identifiers, for example, "200 OK". Corresponds to enum khttp.
kschemes
    Indexed list of URL schemes, for example, "https" or "ftp". Corresponds to enum kscheme.
kresps
    Indexed list of header response names, for example, "Cache-Control" or "Content-Length". Corresponds to enum kresp.
kmethods
    Indexed list of HTTP methods, for example, "GET" and "POST". Corresponds to enum kmethod.
ksuffixmap
    Map of MIME types defined in enum kmime to possible suffixes. This array is terminated with a MIME type of KMIME__MAX and name NULL.
ksuffixes
    Indexed list of canonical suffixes for MIME types corresponding to enum kmime. Note: this may be a NULL pointer for types that have no canonical suffix, for example, "application/octet-stream".
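
The relationship between the page, pagename, path and suffix fields described under Types can be illustrated with a small stand-alone sketch. It does not use or reproduce the kcgi parser: sketch_split, struct sketch_path and the fixed buffer sizes are hypothetical names and choices made only for illustration, and exact (case-sensitive) page matching is an assumption. The input is assumed to be the part of PATH_INFO that follows the script name, such as /bar/baz.html.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result type; it only mirrors the field names in the text. */
struct sketch_path {
	char	 pagename[64];	/* first path component */
	char	 path[128];	/* remainder, no leading slash, no suffix */
	char	 suffix[16];	/* text after the last dot, may be empty */
	size_t	 page;		/* index into pages, defpage, or pagesz */
};

/*
 * Split the part of PATH_INFO after the script name (an assumption
 * about the input), for example "/bar/baz.html".
 */
static void
sketch_split(const char *info, const char *const *pages, size_t pagesz,
    size_t defpage, struct sketch_path *out)
{
	char	 buf[256];
	char	*dot, *slash;
	size_t	 i;

	assert(info != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	/* Work on a copy without the leading slash. */
	if (*info == '/')
		info++;
	snprintf(buf, sizeof(buf), "%s", info);

	/* Suffix: text after the last dot of the last component. */
	slash = strrchr(buf, '/');
	dot = strrchr(slash != NULL ? slash : buf, '.');
	if (dot != NULL) {
		snprintf(out->suffix, sizeof(out->suffix), "%s", dot + 1);
		*dot = '\0';
	}

	/* Page name: first component; path: whatever follows it. */
	slash = strchr(buf, '/');
	if (slash != NULL) {
		*slash = '\0';
		snprintf(out->path, sizeof(out->path), "%s", slash + 1);
	}
	snprintf(out->pagename, sizeof(out->pagename), "%s", buf);

	/* No page at all selects defpage; an unknown page yields pagesz. */
	if (out->pagename[0] == '\0') {
		out->page = defpage;
		return;
	}
	for (i = 0; i < pagesz; i++)
		if (strcmp(pages[i], out->pagename) == 0)
			break;
	out->page = i;
}
```

Under these assumptions, splitting /bar/baz.html against a pages array holding "index" and "bar" gives a pagename of bar, a path of baz, a suffix of html and a page index of 1. An empty input selects defpage, and a first component that is not listed in pages yields pagesz.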

RETURN VALUES

khttp_parse and khttp_parsex return an error code:

KCGI_OK
    Success (not an error).
KCGI_ENOMEM
    Memory failure. This can occur in many places: spawning a child, allocating memory, creating sockets, etc.
KCGI_ENFILE
    Could not allocate file descriptors.
KCGI_EAGAIN
    Could not spawn a child.
KCGI_FORM
    Malformed data between parent and child whilst parsing an HTTP request. (Internal system error.)
KCGI_SYSTEM
    Opaque operating system error.

On failure, the calling application should terminate as soon as possible. Applications should not try to write an HTTP 505 error or similar, but allow the web server to handle the empty CGI response on its own.

SEE ALSO

kcgi(3), khttp_free(3)

AUTHORS

The khttp_parse and khttp_parsex functions were written by Kristaps Dzonsons <kristaps@bsd.lv>.

DragonFly 6.5-DEVELOPMENT        January 4, 2016        DragonFly 6.5-DEVELOPMENT
