Encoding CORDRA™ Identifiers in URI Syntax
Document Information
Status: Draft
Version: V1.00.20050101
Revision Date: 2005-01-01
Type: Specification/Profile
CORDRA ID: [2000.01/FFEE9F72B00C4189B137ECD34188B94E]
Metadata ID: [2000.01/A3D8BE7457C943FFB66ED4583059A8BA]
Abstract
This document describes the mechanism used in CORDRA to encode a Handle, i.e., a CORDRA identifier, within URI syntax.
CORDRA uses Handles as identifiers. The specification of the Handle System (RFCs 3650, 3651, 3652) does not describe how to encode Handles for use on the Internet. This document describes alternative ways to encode Handles in general, and CORDRA identifiers in particular, within URI syntax [RFC 2396].
The description of URI encoding of CORDRA identifiers SHALL apply to all identifiers used within the CORDRA system, within any instance or implementation of the CORDRA system, or within a definition, instance or implementation of Federated CORDRA.
The description of URI encoding of CORDRA identifiers SHALL also apply to all Handles used as identifiers within any CORDRA instance or implementation.
Identifiers SHALL be valid Handles as defined in RFCs 3650, 3651, 3652. The Handle syntax consists of two parts, a naming authority or prefix and a local name or suffix, delimited by the ASCII character "/".
This specification uses the Augmented Backus-Naur Form (ABNF) notation of RFC 2234.
<handle> = <naming authority> "/" <local name>
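The two-part syntax above can be illustrated with a minimal Python sketch. The function name `split_handle` is hypothetical, and splitting at the first "/" is an assumption drawn from the Handle RFCs rather than something this document states.

```python
def split_handle(handle: str) -> tuple[str, str]:
    """Split a Handle into (naming_authority, local_name).

    Assumption: the first "/" is the delimiter, so any later "/"
    would belong to the local name.
    """
    naming_authority, delimiter, local_name = handle.partition("/")
    if not delimiter or not naming_authority or not local_name:
        raise ValueError(f"not a two-part Handle: {handle!r}")
    return naming_authority, local_name


if __name__ == "__main__":
    print(split_handle("100.102/F58FB49EB1F848f0A606E84CEF294BE5"))
```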
Handle encoding within a URI uses a subset of the URI syntax. The syntax below is customized to eliminate parts of the URI syntax not used to encode Handles.
<URI> = <scheme> ":" <hierarchy> [ "?" <query>] [ "#" <fragment> ]
<hierarchy> = "//" <host> "/" <path> / <path>
This syntax permits two versions of a URI, designated as host encoding and path encoding respectively, e.g.:
<scheme> :// <host> / <path>
<scheme> : <path>
In all URI encodings of a Handle, the URI scheme SHALL be hdl. The URI scheme SHALL be case insensitive.
The syntax for URI host encoding of a Handle SHALL include a URI host component, a URI path component, an optional URI query component and an optional URI fragment component.
"hdl" ":" "//" <host> "/" <path> [ "?" <query> ] [ "#" <fragment> ]
The URI host component SHALL be used to encode the naming authority component of the Handle. The host component SHALL NOT be empty. The host component SHALL be a registered name in URI syntax.
The URI path component SHALL be used to encode the local name component of the Handle. The path component SHALL NOT be empty. The path component SHALL be a segment in URI syntax.
The query and fragment components SHALL follow URI syntax for URI query and fragment components.
"hdl" ":" "//" <registered name> "/" <segment> [ "?" <query> ] [ "#" <fragment> ]
For example, the CORDRA identifier with the NA 100.102 and the GUID F58FB49EB1F848f0A606E84CEF294BE5 will be URI encoded as hdl://100.102/F58FB49EB1F848f0A606E84CEF294BE5
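As an illustration of host encoding, the following Python sketch builds and parses the host form using the standard library's `urllib.parse.urlsplit`. The function names `encode_host_form` and `decode_host_form` are hypothetical, and the sketch does not check that the host is a valid registered name.

```python
from urllib.parse import urlsplit


def encode_host_form(naming_authority: str, local_name: str) -> str:
    """Build hdl://<naming authority>/<local name>."""
    if not naming_authority or not local_name:
        raise ValueError("host and path components must not be empty")
    if "/" in local_name:
        raise ValueError("the path component must be a single segment")
    return f"hdl://{naming_authority}/{local_name}"


def decode_host_form(uri: str) -> tuple[str, str]:
    """Recover (naming_authority, local_name) from a host-encoded URI."""
    parts = urlsplit(uri)  # urlsplit lowercases the scheme for us
    if parts.scheme != "hdl":
        raise ValueError("URI scheme must be hdl")
    naming_authority = parts.netloc
    local_name = parts.path[1:]  # drop the "/" that follows the host
    if not naming_authority or not local_name or "/" in local_name:
        raise ValueError(f"not a host-encoded Handle: {uri!r}")
    return naming_authority, local_name
```

Query and fragment components, when present, are split off by `urlsplit` and ignored by this sketch.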
The syntax for URI path encoding of a Handle SHALL include a URI path component, an optional URI query component and an optional URI fragment component.
hdl ":" <path> [ "?" <query>] [ "#" <fragment> ]
The URI path component SHALL consist of exactly two URI segments. The first URI segment SHALL be used to encode the naming authority component of the handle. The second URI segment SHALL be used to encode the local name component of the Handle. Both segments SHALL NOT be empty.
The query and fragment components SHALL follow URI syntax for URI query and fragment components.
"hdl" ":" <segment> "/" <segment> [ "?" <query> ] [ "#" <fragment> ]
For example, the CORDRA identifier with the NA 100.102 and the GUID F58FB49EB1F848f0A606E84CEF294BE5 will be URI encoded as hdl:100.102/F58FB49EB1F848f0A606E84CEF294BE5
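Path encoding can be parsed with plain string operations. The Python sketch below is one possible approach, not a reference implementation; the function name `decode_path_form` is hypothetical.

```python
def decode_path_form(uri: str) -> tuple[str, str]:
    """Recover (naming_authority, local_name) from hdl:<segment>/<segment>."""
    scheme, colon, rest = uri.partition(":")
    if not colon or scheme.lower() != "hdl":
        raise ValueError("URI scheme must be hdl")
    # Discard the optional fragment, then the optional query.
    path = rest.split("#", 1)[0].split("?", 1)[0]
    segments = path.split("/")
    # Exactly two segments, neither of them empty.
    if len(segments) != 2 or not all(segments):
        raise ValueError(f"not a path-encoded Handle: {uri!r}")
    return segments[0], segments[1]
```

A host-encoded URI such as hdl://100.102/... yields empty leading segments here and is therefore rejected by this function.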
URI suffix references, i.e., when the URI scheme component is not present in the URI (hdl is not present), are ambiguous and SHALL NOT be used in machine-processable URI encodings of Handles. URI suffix references MAY be used in human-readable documents for encoding of Handles when it is obvious that the text represents a Handle.
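A machine-processing step might guard against suffix references with a check like the following Python sketch. The function name `has_hdl_scheme` is hypothetical; the comparison is case insensitive because the scheme is.

```python
def has_hdl_scheme(text: str) -> bool:
    """Return True when text starts with the hdl scheme followed by ":"."""
    scheme, colon, _ = text.partition(":")
    return bool(colon) and scheme.lower() == "hdl"


if __name__ == "__main__":
    for candidate in ("hdl:100.102/ABC", "HDL://100.102/ABC", "100.102/ABC"):
        print(candidate, has_hdl_scheme(candidate))
```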
The naming authority and local name components of a Handle may contain any printable characters from the Universal Character Set (UCS-2) of ISO/IEC 10646. The Handle Protocol specifies encoding in UTF-8 [RFC 2044]. The URI RFC specifies character set encodings. The URI encoding of Handles SHALL use UTF-8.
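For Handles that contain characters outside the URI character set, one possible reading is to percent-escape the UTF-8 octets of each component. This document does not spell out the escaping rule, so the Python sketch below is an assumption; the function names `escape_segment` and `unescape_segment` are hypothetical, while `quote` and `unquote` are standard library functions.

```python
from urllib.parse import quote, unquote


def escape_segment(text: str) -> str:
    """Percent-escape one Handle component as UTF-8 octets.

    safe="" means "/" is escaped too, so the component stays one segment.
    """
    return quote(text, safe="", encoding="utf-8")


def unescape_segment(segment: str) -> str:
    """Reverse escape_segment, decoding the octets as UTF-8."""
    return unquote(segment, encoding="utf-8")


if __name__ == "__main__":
    print(escape_segment("caf\u00e9"))  # non-ASCII character is escaped
    print(escape_segment("100.102"))    # unreserved characters are untouched
```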
A CORDRA Identifier uses a subset of the allowable characters that can appear in a Handle (the set of hexadecimal digits HEXDIG and the ASCII character "."). A CORDRA Identifier can be directly encoded in a URI as UTF-8.
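The character restriction can be checked with a regular expression. The Python sketch below assumes that both parts of a CORDRA Identifier are non-empty runs of hexadecimal digits and ".", and that lowercase hexadecimal digits are acceptable, as in the examples in this document. The exact grammar is not given here, so treat the pattern and the name `is_cordra_identifier` as illustrative.

```python
import re

# Assumed shape: <hex digits and "."> "/" <hex digits and ".">
CORDRA_ID_PATTERN = re.compile(r"[0-9A-Fa-f.]+/[0-9A-Fa-f.]+")


def is_cordra_identifier(handle: str) -> bool:
    """Return True when the whole Handle uses only HEXDIG, "." and one "/"."""
    return CORDRA_ID_PATTERN.fullmatch(handle) is not None


if __name__ == "__main__":
    print(is_cordra_identifier("100.102/F58FB49EB1F848f0A606E84CEF294BE5"))
```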
Independent of any URI encoding, the processing and resolution of a handle is still subject to all of the requirements from the Handle Protocol as defined in the Handle RFCs.
HTTP URI Encoding and Protocol Processing
Since direct URI encoding and processing of Handles is not widely deployed throughout the Internet, Handles MAY be encoded with HTTP URIs (URLs) and transmitted between applications using the HTTP protocol [RFC 2616]. It is important to note that the HTTP URI/URL scheme encoding and HTTP protocol may not support all features of the Handle system and the Handle protocol.
The syntax for HTTP URI encoding of a Handle SHALL include a URI authority component, a URI path component, an optional URI query component and an optional URI fragment component.
"http" ":" "//" <authority> "/" <path> [ "?" <query> ] [ "#" <fragment> ]
The URI authority component SHALL be used to encode the name of the HTTP Handle Resolver.
The URI path component SHALL consist of two or three URI segments. The first URI segment is optional and, when present, SHALL be used to specify that the path is a Handle. The second URI segment SHALL be used to encode the naming authority component of the Handle. The third URI segment SHALL be used to encode the local name component of the Handle. The second and third segments SHALL NOT be empty.
The implementation of an HTTP Handle Resolver SHALL specify the authority and, if the first segment is required, its value.
For example, the CORDRA identifier with the NA 100.102 and the GUID F58FB49EB1F848f0A606E84CEF294BE5, using the CNRI Handle Resolver hdl.handle.net, will be HTTP URI encoded as http://hdl.handle.net/100.102/F58FB49EB1F848f0A606E84CEF294BE5
Similarly, the same identifier, using an HTTP resolver that specifies the use of three segments with the first segment being "hdl", will be HTTP URI encoded as http://arrow.resolver.au.gov:2641/hdl/100.102/F58FB49EB1F848f0A606E84CEF294BE5
"http" ":" "//" <authority> "/" [ <segment> "/" ] <segment> "/" <segment> [ "?" <query> ] [ "#" <fragment> ]
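The HTTP form can be assembled as in the Python sketch below. The function name `encode_http` and the parameter `first_segment` are hypothetical, and resolver.example.org is a placeholder authority, not a real resolver. Whether the first segment is needed, and its value, depends on the resolver implementation.

```python
from typing import Optional


def encode_http(resolver_authority: str, naming_authority: str,
                local_name: str, first_segment: Optional[str] = None) -> str:
    """Build http://<authority>/[<segment>/]<naming authority>/<local name>."""
    if not naming_authority or not local_name:
        raise ValueError("naming authority and local name must not be empty")
    segments = [naming_authority, local_name]
    if first_segment is not None:
        # Resolver-specific marker stating that the path is a Handle.
        segments.insert(0, first_segment)
    return f"http://{resolver_authority}/" + "/".join(segments)


if __name__ == "__main__":
    guid = "F58FB49EB1F848f0A606E84CEF294BE5"
    print(encode_http("resolver.example.org", "100.102", guid))
    print(encode_http("arrow.resolver.au.gov:2641", "100.102", guid,
                      first_segment="hdl"))
```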
| Version | ID | Date | Change Summary |
|---|---|---|---|
| 1.00 | H | 20041018 | Initial release |
|  |  | 20050101 | Editorial changes |