5.9.3 URL mapping
The URL mapping in the Daisy Wiki is based on the hierarchical navigation tree: when a document is requested, the navigation tree is consulted to resolve the path. Conversely, when documents are published, the logical daisy:<document-id> links that occur in documents stored in the repository are translated to the path at which the target document occurs in the navigation tree (if a document occurs at multiple locations, the first occurrence in a depth-first traversal is used).
5.9.3.1 Relation between the navigation tree and the URL space
Each nested node in the navigation tree becomes one part of the path in a URL. The name of that path part is the ID of the navigation tree node. What this ID is depends on the type of node:
- for document nodes, it is the document ID by default, unless a custom node ID is specified in the navigation tree
- for group nodes, it is the ID of the node specified in the navigation tree; if no ID is specified, a default one is generated (something like g1, g2, and so on)
If you want more readable URLs, it is recommended to assign node IDs in the navigation tree. By readable URLs we mean URLs containing meaningful words instead of automatically assigned numbers.
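The mapping rules above, together with the first-occurrence rule used for translating daisy: links, can be sketched in a few lines of Python. This is only an illustration: NavNode, walk and first_path_for are hypothetical names, not part of Daisy, and the assumption that generated group IDs are numbered across the whole tree is ours (the text only says they look like g1, g2).

```python
import itertools
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class NavNode:
    """Hypothetical stand-in for a navigation tree node (not a Daisy class)."""
    kind: str                          # "document" or "group"
    document_id: Optional[int] = None  # set for document nodes only
    node_id: Optional[str] = None      # custom node ID, if one was assigned
    children: List["NavNode"] = field(default_factory=list)


def walk(nodes, prefix="", counter=None) -> Iterator[Tuple[str, NavNode]]:
    """Yield (path, node) pairs in depth-first order."""
    if counter is None:
        # Assumption: generated group IDs are numbered tree-wide.
        counter = itertools.count(1)
    for node in nodes:
        if node.node_id is not None:
            segment = node.node_id             # custom node ID wins
        elif node.kind == "document":
            segment = str(node.document_id)    # default for document nodes
        else:
            segment = f"g{next(counter)}"      # generated default for group nodes
        path = f"{prefix}/{segment}"
        yield path, node
        yield from walk(node.children, path, counter)


def first_path_for(nodes, document_id) -> Optional[str]:
    """Return the path of the first depth-first occurrence of a document."""
    for path, node in walk(nodes):
        if node.kind == "document" and node.document_id == document_id:
            return path
    return None


# Example tree: document 12 occurs twice, once with a custom node ID.
tree = [
    NavNode("group", node_id="guide", children=[
        NavNode("document", document_id=12, node_id="install"),
        NavNode("document", document_id=34),
    ]),
    NavNode("group", children=[
        NavNode("document", document_id=12),
    ]),
]
```

With this tree, first_path_for(tree, 12) returns the path under the "guide" group, because that occurrence is reached first in the depth-first traversal.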
5.9.3.2 Importance of readable URLs?
It is in no way required to assign custom node IDs in the navigation tree. You only need to do this if you want to have readable, meaningful URLs.
Some advantages of having readable URLs are:
- the page may be easier to find (higher ranked) by web search engines such as Google,
- you can guess what the page is about simply by looking at the URL. A bare number, in contrast, tells you very little.
However, URLs containing the raw document IDs also have their advantages:
- you don't have to think about what to name the navigation tree nodes.
- they are more robust to changes in the navigation tree. If you move nodes in the navigation tree (which is a very common thing to do) and the URL paths end on the numeric document ID, the document can still be found by using that document ID. This follows from a general rule when designing URLs: the less meaningful information you put in them, the less likely they are to break. (When renaming or moving nodes with custom IDs, it is possible to let the old location redirect to the new one. This is currently not possible directly in Daisy, but can be configured in Apache when using Apache in front of Daisy.)
It is a good idea to standardise on some conventions when naming navigation tree nodes. For example, always use lower case and separate names consisting of multiple parts with dashes.
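As an illustration of such a convention, the following Python sketch derives a lower-case, dash-separated node ID from a document title. The function name to_node_id is hypothetical, and restricting the result to ASCII letters and digits is an assumption of this sketch; the text does not say which characters Daisy accepts in node IDs.

```python
import re


def to_node_id(title: str) -> str:
    """Turn a title into a lower-case, dash-separated node ID (a convention, not a Daisy rule)."""
    lowered = title.strip().lower()
    # Collapse every run of characters other than a-z and 0-9 into a single dash.
    return re.sub(r"[^a-z0-9]+", "-", lowered).strip("-")
```

For example, to_node_id("Getting Started") gives "getting-started".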
If all you want to have are some shortcut URLs for certain documents, independent of where they occur in the navigation tree, you can run Apache in front of the Daisy Wiki and configure redirects over there.
5.9.3.3 How URL paths are resolved in the Daisy Wiki
When a request for a certain path comes in, the Daisy Wiki asks the navigation tree manager to look up that path in the navigation tree for the current site. There are a number of possible outcomes:
- the node described by the path exists in the navigation tree and identifies a document. This is the most common case, and the Daisy Wiki will go on to display that page. If a specific branch and language of the document were requested, different from the site's default branch and language or different from the branch and language specified in the navigation tree node, then the site-search algorithm described below is used.
- the node described by the path exists in the navigation tree but identifies a group node. In this case the Daisy Wiki will redirect to the first document child of that node (that is found by doing a depth-first traversal of the group node -- thus first descending child group nodes when encountered).
- the node does not exist in the navigation tree. Again there are multiple possibilities:
  - the path ends on a number: this number is then interpreted as a document ID. The site-search algorithm described below is used to determine what to do.
  - the path does not end on a number: a ResourceNotFoundException is thrown, resulting in an error page in the browser.
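The outcomes listed above can be summarised as a small decision function. The Python sketch below is not Daisy code: resolve, the nav_index dictionary and the returned action names are hypothetical, and the branch and language check mentioned in the first outcome is left out for brevity.

```python
import re
from typing import Dict, Optional, Tuple


def resolve(path: str, nav_index: Dict[str, dict]) -> Tuple[str, Optional[object]]:
    """Classify a request path into one of the outcomes described above."""
    node = nav_index.get(path)
    if node is not None:
        if node["kind"] == "document":
            return ("display", node["document_id"])
        # Group node: redirect to its first document child (found depth-first).
        return ("redirect", node["first_document_path"])
    # Not in the navigation tree: look at the last path segment.
    last = path.rstrip("/").rsplit("/", 1)[-1]
    if re.fullmatch(r"[0-9]+", last):
        return ("site-search", int(last))  # treated as a document ID
    return ("not-found", None)             # Daisy throws ResourceNotFoundException here


# Hypothetical, pre-computed view of one site's navigation tree.
nav_index = {
    "/guide": {"kind": "group", "first_document_path": "/guide/install"},
    "/guide/install": {"kind": "document", "document_id": 12},
}
```

A path such as /old/place/34 is not in nav_index but ends on a number, so resolve hands document ID 34 to the site search; /old/place/about yields the not-found outcome.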
5.9.3.3.1 Site-search algorithm
The site-search algorithm is used whenever a document might be better suited for display in the context of another site, that is, when the document has not been found in the current site's navigation tree.
The sites that will be considered in the search can be configured in the
<siteSwitching mode="stay|all|selected">
  <site>...</site>
  ... more <site> elements ...
</siteSwitching>
The mode attribute takes one of these values:
- stay: always stay in the current site, no other sites will be searched
- all: consider all available sites (in alphabetical order)
- selected: consider only a subset of sites, listed using the <site> child elements. The content of each <site> element should be the name of a site (the name of a site is the name of the directory in which it is defined).
  This mode is recommended when you have a large number of sites (to improve performance), or when you have a number of sites that are related and you don't want the user to be redirected outside this set of sites. It can also be useful when you want to change the order in which sites are considered.
The site-search algorithm works as follows:
- It loops over all considered sites. If the document variant occurs in the navigation tree of the site, the browser will be redirected to this site and navigation tree path.
- If the end of the sites list is reached and no site has been found where the document occurs in the navigation tree, the browser will be redirected to the first site in the list for which the collection, branch and language matched. If there is no such site, the document will be displayed in the current site.
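The two passes of the algorithm, together with the three siteSwitching modes, can be sketched as follows. This is a simplified model under stated assumptions, not the Daisy implementation: Site, considered_sites and site_search are hypothetical names, each site is assumed to have a single collection, branch and language to match against, and the fallback URL of the form /<document-id> is taken from the section on documents outside the navigation tree.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Set, Tuple

Variant = Tuple[int, str, str]  # (document ID, branch, language)


@dataclass
class Site:
    """Hypothetical model of a site; not a Daisy class."""
    name: str
    collection: str
    branch: str
    language: str
    nav_paths: Dict[Variant, str]  # variants occurring in this site's navigation tree


def considered_sites(mode: str, all_sites: Sequence[Site],
                     selected_names: Sequence[str] = ()) -> List[Site]:
    """Return the sites to search, in order, for a given siteSwitching mode."""
    if mode == "stay":
        return []                                    # never leave the current site
    if mode == "all":
        return sorted(all_sites, key=lambda s: s.name)
    if mode == "selected":
        by_name = {s.name: s for s in all_sites}
        return [by_name[n] for n in selected_names if n in by_name]
    raise ValueError(f"unknown siteSwitching mode: {mode}")


def site_search(variant: Variant, doc_collections: Set[str],
                current: Site, sites: Sequence[Site]) -> Tuple[str, str, str]:
    """Return (action, site name, path) for a document variant."""
    doc_id, branch, language = variant
    # Pass 1: a site whose navigation tree contains the document variant.
    for site in sites:
        path = site.nav_paths.get(variant)
        if path is not None:
            return ("redirect", site.name, path)
    # Pass 2: the first site whose collection, branch and language match.
    for site in sites:
        if (site.collection in doc_collections
                and site.branch == branch and site.language == language):
            return ("redirect", site.name, f"/{doc_id}")
    # No suitable site: display the document in the current site.
    return ("display", current.name, f"/{doc_id}")
```

In "stay" mode the list of considered sites is empty, so both passes fall through and the document is displayed in the current site.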
5.9.3.4 Not all documents must appear in the navigation tree
As a consequence of the resolving mechanism described above, any document in the repository can be accessed even if it does not occur in the navigation tree. Simply use a URL like:
http://host/daisy/mysite/<document-id>
Here, <document-id> is the ID of the document you want to retrieve.
You can append the extension .html to any document URL, so the above could also have been written as:
http://host/daisy/mysite/<document-id>.html
By default, the Daisy Wiki generates links with a .html extension, since this makes it easier to download a static copy of the site to the file system (otherwise you could have files and directories with the same name, which isn't possible).
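A minimal Python sketch of the two URL forms is shown below. The helper names document_url and strip_html_extension are hypothetical, and the base URL http://host/daisy and site name mysite are simply the placeholders used in the examples above.

```python
def document_url(base: str, site: str, document_id: int, html_extension: bool = True) -> str:
    """Build a direct document URL, with the .html extension by default."""
    suffix = ".html" if html_extension else ""
    return f"{base.rstrip('/')}/{site}/{document_id}{suffix}"


def strip_html_extension(path: str) -> str:
    """Remove a trailing .html so both URL forms refer to the same path."""
    return path[:-len(".html")] if path.endswith(".html") else path
```

For document 12, document_url("http://host/daisy", "mysite", 12) produces the .html form, and passing html_extension=False produces the bare form.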