Have you ever wondered about what those strange, long web addresses actually mean? For example,
https://www.example.com/search?q=how+to+read+web+addresses&ie=utf-8. You’ve likely heard parts of a URL, but what about all those other numbers, letters, and symbols? At some point, you might have wondered how to read a web address.
To read a web address or uniform resource locator (URL), users need to know the pieces that tell a browser how to retrieve a web page. The scheme, domain, and path are the primary components of a website address. Other URL elements locate additional servers or specific page resources.
The first part, the scheme (also known as the protocol), instructs your browser on how to handle what comes next. The domain name helps locate the server on the internet. The path locates the files on the server. That’s not all: Let’s dig deeper into each of these and detail the other parts, so you can get a better understanding of what each part of a URL does in plain English.
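As a rough illustration of these pieces, here is a minimal sketch that splits the article's opening example with Python's standard `urllib.parse` module. The variable names are only for illustration.

```python
from urllib.parse import urlsplit

# The example address from the top of this article.
url = "https://www.example.com/search?q=how+to+read+web+addresses&ie=utf-8"
parts = urlsplit(url)

print(parts.scheme)    # https
print(parts.hostname)  # www.example.com
print(parts.path)      # /search
print(parts.query)     # q=how+to+read+web+addresses&ie=utf-8
print(parts.fragment)  # empty string, because this URL has no fragment
```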
The Scheme Or Protocol
A protocol is a set of rules that need to be followed. The name “protocol” comes from the word “proto-” which means first. This might explain why every web address begins with a protocol.
In a network, the scheme or protocol helps computers share data and communicate by describing the rules and standards of exchanging and handling data.
Most people know the two most common schemes for web pages: http:// and https://. The former is used on websites that are not secure, while the secure counterpart uses an encryption method so no one but those intended can read the data!
Other common schemes or protocols
ftp:// – for transferring files between computers
mailto: – for sending email messages (this scheme is written without the two slashes)
file:// – for locating files on a local or network file system
There is an extensive list of addressable protocols, some of which are pretty obscure, while others, like the above, you may have already seen a few times.
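To see how a program picks out the scheme, the sketch below runs a few sample addresses through `urllib.parse.urlsplit`. The addresses themselves are made-up placeholders, not real resources.

```python
from urllib.parse import urlsplit

# Hypothetical addresses, one per scheme discussed above.
examples = [
    "https://www.example.com/",
    "ftp://ftp.example.com/pub/readme.txt",
    "mailto:someone@example.com",
    "file:///home/user/notes.txt",
]

# The scheme is everything before the first colon.
schemes = [urlsplit(address).scheme for address in examples]
print(schemes)  # ['https', 'ftp', 'mailto', 'file']
```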
Analogy: If the web address represents a trip to your home, the protocol or scheme could represent your mode of transportation. Are you getting there in a car, plane, train, or boat? The next thing you need to know to get home is where specifically your property is.
The Subdomain (If Present)
After the scheme or protocol, you will sometimes see a subdomain.
A subdomain is just like it sounds; it is part of your domain name. For example, if your domain name was www.example.com, www would be your subdomain.
Analogy: The subdomain is your dwelling, a house, apartment, flat, dock, etc.
You can also have multiple subdomains in one website address. For example, blog.example.com has one subdomain. Occasionally, several subdomains are strung together, like api.app.example.com, where api.app begins to semantically describe services that support other services.
Use of subdomains
A subdomain often refers to a specific type of service on a site:
blog. – blog posts and blogging platforms like WordPress
app. – a web application
forum. – discussion forums, message boards, and online chat services
m. – websites designed for mobile devices
There’s no defined industry standard for what subdomains to use. The developers might use a subdomain structure to refine how to connect servers. Marketers will choose subdomains that fit their search engine optimization (SEO) goals. Other times, it’s purely an aesthetic decision.
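The following is a deliberately naive sketch of how a hostname could be split into subdomains, domain, and top-level domain. It assumes the TLD is a single label, so it would misread suffixes like .co.uk, and both the function name `split_hostname` and the sample hostname are hypothetical.

```python
def split_hostname(hostname: str) -> dict:
    """Naively split a hostname, assuming a single-label TLD such as .com."""
    labels = hostname.lower().rstrip(".").split(".")
    if len(labels) < 2:
        raise ValueError("expected at least a domain and a TLD")
    return {
        "subdomains": labels[:-2],  # everything left of the domain, may be empty
        "domain": labels[-2],
        "tld": labels[-1],
    }


print(split_hostname("api.app.example.com"))
# {'subdomains': ['api', 'app'], 'domain': 'example', 'tld': 'com'}
```

A production tool would normally consult the Public Suffix List rather than counting labels.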
Is there a difference between visiting a site with www and without?
There is generally no difference between visiting a site with “www” and one without this subdomain. The use of www was an early convention meaning “World Wide Web,” denoting that the content was intended for public viewing. Leaving out www is now a stylistic decision and does not impact traffic routing.
There’s not much difference, but there are some arguments for both sides.
Some consider the www prefix redundant since it just points to the same website. In fact, some sites simply redirect www addresses to the bare domain, removing it. It’s also a little more effort to type www and then the domain name into your browser.
Other companies prefer to use www to specifically refer to their marketing or informational material. This helps segment their audience, organize site functionality, and assist in reporting.
In the end, it’s up to you whether you want to use www or not. The domain name is the more critical and recognizable part of a web address.
The Domain Name
The domain name designates the company or organization and generally has a brand aspect. A good domain name reflects who or what is being promoted or represented.
A website’s domain name can have a label up to 63 characters long, but most are much shorter than that. This label is the example in our sample address, www.example.com.
Analogy: The domain is the city in your journey home. It’s hard to know which Main Street your home is located on without the specific town.
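The 63-character label limit mentioned above can be checked with a few lines of code. This is only a sketch: the function name `labels_within_limit` is made up, and it checks length alone, not which characters are allowed.

```python
MAX_LABEL_LENGTH = 63  # the per-label limit mentioned above


def labels_within_limit(hostname: str) -> bool:
    """Return True if every dot-separated label is 1 to 63 characters long."""
    labels = hostname.rstrip(".").split(".")
    return all(1 <= len(label) <= MAX_LABEL_LENGTH for label in labels)


print(labels_within_limit("www.example.com"))   # True
print(labels_within_limit("a" * 64 + ".com"))   # False
```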
The IP address
The IP address is the numerical address of the server that hosts the website.
An IP version 4 address (still the most common) looks like this: 192.168.1.87, while the newer IP version 6 address is written as eight groups of hexadecimal digits separated by colons, such as 2001:0db8:85a3:0000:0000:8a2e:0370:7334.
The IP address is used by computers to communicate on a network. When you type a web address into your browser, your computer uses a directory (Analogy: like a phone or address book) to find the IP address. The browser then uses that IP address to reach the server that hosts the website.
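For the curious, Python's standard `ipaddress` module can tell the two address versions apart. In this sketch, 192.168.1.87 is the example from the text, while 2001:db8::1 is an assumed IPv6 sample from the range reserved for documentation.

```python
import ipaddress

# One IPv4 sample from the text and one assumed IPv6 documentation address.
samples = ["192.168.1.87", "2001:db8::1"]

versions = {text: ipaddress.ip_address(text).version for text in samples}
print(versions)  # {'192.168.1.87': 4, '2001:db8::1': 6}

# 192.168.x.x addresses are reserved for private networks, not public websites.
print(ipaddress.ip_address("192.168.1.87").is_private)  # True
```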
The role of a DNS lookup
A DNS lookup is the process of translating a website address into an IP address.
The Domain Name System (DNS) is a global network of servers that translates domain names into IP addresses.
When you type a website address into your browser, your computer first sends a request to a DNS server. The DNS server then looks up, or asks other servers for, the IP address associated with the requested website. This result is sent back to your computer, which then uses the IP address to connect to the website’s server.
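A program can ask the operating system to perform this lookup. The sketch below uses Python's `socket.getaddrinfo`, and the helper name `resolve` is hypothetical. Results depend on your network and resolver, so no particular output is shown.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated IP addresses the system resolver finds."""
    results = socket.getaddrinfo(hostname, None)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first item of sockaddr is the IP address as text.
    return sorted({sockaddr[0] for *_, sockaddr in results})


# Example usage (needs a working network connection):
# print(resolve("www.example.com"))
```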
The Top Level Domain (TLD)
The domain name consists of several parts, including the top-level domain. The TLD is the final set of characters after the last dot. The most common examples are .com, .net, and .org, but more than 1,000 registered TLDs exist. Domain names are important for websites.
Analogy: Top-level domains are the country, state, or province of our journey to your destination. The world’s most used city name is San José (or San Jose). But, are you trying to get to San José, Costa Rica, San José in The Philippines, or San José in California, USA?
Generic top-level domains are the suffixes we’ve all seen most frequently. These original TLDs hold the majority of websites in use:
.com (commercial) – the most widely used and recognized TLD, which can be used by any commercial or personal entity
.org (organization) – intended to be used by non-profit or not-for-profit sites
.net (network) – originally meant to house networks of servers or websites but now less constrained
.edu (education) – restricted for use by higher education institutions
.gov (government) – can only be used by national and state governments of the United States
.mil (military) – refers to sites and services for the US military branches
.int (international) – used very infrequently with the introduction of country-specific TLDs
Country-code top-level domains (ccTLDs) are two-character TLDs issued to entities located in or associated with a specific sovereign country, state, or territory. The convention follows the list of ISO 3166-1 alpha-2 country codes. ccTLDs were first issued in 1985, and although not every country code is represented, the list has expanded over time.
Examples of ccTLDs:
.nl – Netherlands / Nederland (in Dutch)
.de – Germany / Deutschland (in German)
.ch – Switzerland / Confoederatio Helvetica (Latin for Swiss Confederation)
As the domain names under the older top-level domains became crowded, the Internet Corporation for Assigned Names and Numbers (ICANN), which oversees the Internet Assigned Numbers Authority (IANA), began inviting proposals for more generic top-level domains. The TLD root database is continuously growing with not just ccTLDs but now themed and corporate names.
For example, Google has .google. I predict more companies will propose and receive authorization to add their brands to the root list.
As the IANA experiments with issuing more gTLDs, a new breed of decentralized top-level domains is emerging.
With the gain in the trust of blockchain technology, a new way of issuing TLDs outside the control of a centralized organization is becoming popular. Web3 TLDs are federated or decentralized domain name registries.
This new way of using a non-fungible token (NFT) has led to services like Unstoppable Domains. The registrar helps issue NFTs to control domain names ending with .crypto and .nft, among others. For example, I registered mikechu.crypto for fun.
This Web3 push to diverge from a central authority is really about consensus. The .onion TLD is a perfect example.
Tor (short for The Onion Router) addresses are a particular type of website address that can only be accessed through the Tor network. They are a safe haven for those who wish to remain anonymous and free from surveillance.
.onion is the Tor top-level domain which was only recognized by IANA in September 2015.
Once you’re at the right website server, your browser has to gather web pages to present. This is where the path of a web address comes into play.
The Path or Page
A path or page is part of an address that specifies the location in a file system where a website’s files are stored.
Paths and pages are both essential parts of your website address. Your browser starts by getting a primary document. Usually an HTML file, this source file contains links to other files needed to display the webpage.
For example, if your browser is pointed to www.example.com/info/about-us.html, the path is /info/, and the page is about-us.html.
Analogy: The path and/or page are like directions on how to get into your home, where the light switches are, and how to make coffee.
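Here is a small sketch of separating the path from the page in code, using the article's about-us example with an assumed https:// scheme in front. It relies on `urllib.parse` and `pathlib` from the Python standard library.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

url = "https://www.example.com/info/about-us.html"
path = PurePosixPath(urlsplit(url).path)

print(path.parent)  # /info  (the folder, shown without its trailing slash)
print(path.name)    # about-us.html  (the page)
print(path.suffix)  # .html  (the file type)
```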
Other types of pages can be included in a website’s address, such as resources like style sheets (.css) and scripts (.js). A website’s style sheet determines how the website looks, and a website’s script files control the behavior of the website.
A query is a set of key-value pairs which follow a question mark (?) and are used to pass a more detailed request for information to a web page.
Key-value pairs are simply a pair of words on either side of an equal sign (=). In our original example at the top of this article, q (perhaps short for query) is the key, and how+to+read+web+addresses is the value.
Several keys and values can be concatenated or combined with an ampersand (&) to form a query string.
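The key-value structure is easy to see when a program decodes it. This sketch uses `parse_qs` and `urlencode` from Python's `urllib.parse` on the article's opening example. Note that the plus signs stand for spaces.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

url = "https://www.example.com/search?q=how+to+read+web+addresses&ie=utf-8"

# Decode the query string into keys and (lists of) values.
params = parse_qs(urlsplit(url).query)
print(params)  # {'q': ['how to read web addresses'], 'ie': ['utf-8']}

# Going the other way: build a query string from key-value pairs.
rebuilt = urlencode({"q": "how to read web addresses", "ie": "utf-8"})
print(rebuilt)  # q=how+to+read+web+addresses&ie=utf-8
```

Values come back as lists because the same key may legally appear more than once in a query string.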
The Fragment Identifier
A fragment identifier is, in my opinion, an under-utilized part of a web address. It is used to point, jump, or scroll directly to something specific on a destination webpage.
The fragment identifier begins with a pound or hash symbol (#), followed by a specific identifier coded into a webpage element.
For example, suppose you’re on an “about us” page. In that case, a link with a fragment identifier may automatically scroll the page to a mailing address. The URL or link might look like
https://www.example.com/about-us#mailing-address in your browser’s address bar.
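As a quick sketch, Python's `urllib.parse.urldefrag` separates the page address from its fragment identifier:

```python
from urllib.parse import urldefrag

url = "https://www.example.com/about-us#mailing-address"
page, fragment = urldefrag(url)

print(page)      # https://www.example.com/about-us
print(fragment)  # mailing-address
```

The fragment is handled by the browser itself. It is not sent to the server as part of the page request.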
Using the power of text fragments, snippets of text can be highlighted and then linked directly without the previously required coding. A shared link with a text fragment will automatically scroll to and highlight words on a webpage.
The text fragment syntax is #:~:text=. For example, a link with #:~:text=Local will scroll to and highlight the first occurrence of the word “Local.”
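Building such a link by hand mostly comes down to percent-encoding the text. The sketch below shows one way to do it. The helper name `text_fragment_link` is made up, and it assumes the page address does not already contain a fragment.

```python
from urllib.parse import quote


def text_fragment_link(page_url: str, text: str) -> str:
    """Append a text fragment that asks the browser to highlight `text`."""
    encoded = quote(text, safe="")
    # quote() leaves hyphens alone, but a hyphen has a special meaning inside
    # a text fragment, so it is encoded here as well.
    encoded = encoded.replace("-", "%2D")
    return f"{page_url}#:~:text={encoded}"


print(text_fragment_link("https://www.example.com/about-us", "Local"))
# https://www.example.com/about-us#:~:text=Local
print(text_fragment_link("https://www.example.com/about-us", "mailing address"))
# https://www.example.com/about-us#:~:text=mailing%20address
```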
Additional Facts About URLs You’ll Want To Know
In researching for this article, I came across some other interesting questions.
1. Do URLs have a length limitation?
There is no maximum length limitation for Uniform Resource Locators (URLs). The official specification, RFC2616, only requires that the receiving server can handle the request. Browser developers leave the address bar flexible for accepting web addresses and search queries.
Here are the maximum (or displayable) character lengths for each browser.
|Browser||Maximum Address Bar Length|
2. The origins of the URL
The story of the URL begins in the early days of online communication. There were no uniform standards for addresses on the internet in those days. Different computer networks used various protocols, naming conventions, and address formats. As a result, it was difficult to create links that would work across all types of computer networks.
Back in the early 1990s, Tim Berners-Lee was working for CERN. He wanted to create a way for people to communicate online, regardless of what type of computer they were using or where they were located.
As the World Wide Web became standardized technology in the 1990s, the first web pages were published and uniformly accessible.
3. What do non-English URLs look like?
URLs in non-English or non-Latin-based languages look different from what most people expect because of the characters those languages use.
For example, Chinese URLs often use pinyin, the standard romanization system for Mandarin Chinese. While browsers translate Unicode characters entered in the address bar into Punycode for accurate DNS resolution, other Chinese sites simply use numbers for ease of use.
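To see the Punycode translation in action, the sketch below uses Python's built-in `idna` codec. Both hostnames are made-up examples under the reserved .example TLD. The German one is included only because its encoded form is widely documented.

```python
# Convert a Unicode hostname to its ASCII (Punycode) form and back again.
german = "bücher.example"
ascii_form = german.encode("idna").decode("ascii")
print(ascii_form)                                 # xn--bcher-kva.example
print(ascii_form.encode("ascii").decode("idna"))  # bücher.example

# The same round trip works for Chinese characters.
chinese = "中文.example"
chinese_ascii = chinese.encode("idna").decode("ascii")
print(chinese_ascii)  # an ASCII name whose first label starts with "xn--"
```

The built-in codec follows an older version of the IDNA standard, so treat this as an illustration rather than a complete solution for every language.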