Distributed Computing in Java 9

URLs, URLConnections, and the ContentHandler classes

While surfing the internet, we use URLs extensively. As a result, many of us think of URLs as the names of files located on the World Wide Web, but that is only part of the picture; a URL can also point to other resources on a network, such as a database query or the output of a command.

URL is an acronym for Uniform Resource Locator; a URL is a reference (an address) to a resource on the internet.

Every URL has two main components: a protocol identifier and a resource name. In http://www.google.com, for example, http is the protocol identifier and www.google.com is the resource name. The two are joined by a colon (:) followed by two slashes (//), as shown in the following screenshot:

Many protocols can appear in a URL, such as HTTP, HTTPS, file, gopher, FTP, and news. The resource name represents the full address of the resource and can contain the hostname, filename, port number, and reference. In most URLs the hostname is mandatory, whereas the filename, port number, and reference are optional.

A URL can be constructed in Java using the following syntax; note that the address is passed to the URL class constructor as a string:

URL urlHandle = new URL("http://example.com/");
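Once a URL object exists, the components described earlier (protocol, hostname, port number, filename, and reference) can be read back through its getter methods. The following is a minimal sketch rather than a definitive implementation; the class name UrlParts, the helper describe, and the sample addresses are illustrative assumptions and do not come from the original text. It also shows that the constructor throws the checked MalformedURLException when the string has no recognizable protocol:

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper class, used only for illustration
class UrlParts {

    // Summarizes the components of a URL string, or reports that it could not be parsed
    static String describe(String spec) {
        try {
            URL url = new URL(spec);
            return "protocol=" + url.getProtocol()
                    + " host=" + url.getHost()
                    + " port=" + url.getPort()   // -1 when the URL names no port
                    + " file=" + url.getFile()   // empty string when there is no path
                    + " ref=" + url.getRef();    // null when there is no #reference
        } catch (MalformedURLException e) {
            // Thrown when the string has no protocol or an unknown one
            return "malformed: " + spec;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("http://example.com:8080/docs/index.html#intro"));
        System.out.println(describe("http://www.google.com"));
        System.out.println(describe("www.google.com")); // no protocol identifier
    }
}
```

In this sketch, the second address has only a hostname, so the optional parts come back as -1, an empty string, and null respectively, while the third address is rejected because it lacks a protocol identifier.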

The URL in the preceding code is an absolute URL, in which all the details are given. A URL can also be specified relative to an already existing URL, as shown in the following example:

URL urlHandle = new URL("http://example.com/pages/");
URL firstPage = new URL(urlHandle, "page1.html");
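To see what the two-argument constructor actually produces, the resolved URL can be printed. The sketch below is illustrative only; the class name RelativeUrls and the helper resolve are assumptions introduced here, not part of the original text. It also hints at why the trailing slash on the base URL matters: without it, the last path segment of the base is replaced instead of extended.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper class, used only for illustration
class RelativeUrls {

    // Resolves a relative spec against a base URL and returns the absolute result as text
    static String resolve(String base, String relative) {
        try {
            URL baseUrl = new URL(base);
            return new URL(baseUrl, relative).toExternalForm();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(
                    "Cannot resolve " + relative + " against " + base, e);
        }
    }

    public static void main(String[] args) {
        // Base ends with a slash: the page is placed inside /pages/
        System.out.println(resolve("http://example.com/pages/", "page1.html"));
        // Base has no trailing slash: "pages" is treated as a file and replaced
        System.out.println(resolve("http://example.com/pages", "page1.html"));
    }
}
```

With the trailing slash, the result is http://example.com/pages/page1.html; without it, the result is http://example.com/page1.html.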