Right. You want the Wikipedia article on the World Wide Web, but... more. You want it to have a pulse, a shadow. Fine. Don’t expect me to hold your hand through it. This is what you get.
World Wide Web
This article is about the global system of pages accessed via HTTP. For the worldwide computer network itself, you’re looking for Internet. And if you’re lost and need a browser, try WorldWideWeb, though I doubt it’s still in fashion.
“WWW” and “The Web”: yes, they redirect here. Don’t get cute with me. If you need disambiguation, try WWW (disambiguation) or The Web (disambiguation).
Abbreviation: WWW, W3, The Web
Status: Active
Year started: 1989
First published: 6 August 1991
Organization:
- CERN (1989–1994)
- W3C (1994–current)
Authors: Tim Berners-Lee
Here’s a web page from Wikipedia, rendered, I assume, by Google Chrome. Riveting.
The World Wide Web, or WWW, or W3, or just... the Web, is this sprawling information system. It lets you share content over the Internet in ways that, thankfully, don’t require you to be a computer scientist or some basement-dwelling hobbyist. [1] [2] It’s all about accessing documents and other web resources through the Internet, using a specific set of rules called the Hypertext Transfer Protocol (HTTP). [3]
Tim Berners-Lee, an English chap, cooked it up at CERN back in 1989. It finally opened its doors to the public in 1993. His vision? A “universal linked information system.” [4] [5] [6] Web servers dish out documents and media, and programs like web browsers slurp them up. Everything on the Web has an address, a uniform resource locator (URL), so you can find it.
The original, and still the most common, format is the web page, written in Hypertext Markup Language (HTML). HTML can handle plain text, images, video, audio, and even little scripts to make things interactive. And, of course, there are hyperlinks: those embedded URLs that whisk you away to other resources. This whole dance of following links is what they call web navigation, or web surfing. If you want to get fancy, these pages can function as web applications, essentially application software. All this information travels across the Internet via HTTP. A collection of related resources, usually under the same domain name, forms a website. One server can host many sites, and the biggest sites? They’re spread across countless servers. The content itself comes from everywhere: companies, governments, individuals. It’s a colossal, chaotic repository of everything from education to entertainment.
The Web has become the dominant information systems platform on the planet. [7] [8] [9] [10] It’s how billions of us actually interact with the Internet. [3]
History
• Main article: History of the World Wide Web
This NeXT Computer, the one Sir Tim Berners-Lee used at CERN, was the world’s first Web server. Quite the artifact.
So, Tim Berners-Lee, again, working at CERN. He was drowning in documents, data, and collaborators. Finding, updating, and distributing this mess was a nightmare. He looked at existing systems (the tree structure of Unix, the keyword tagging of VAX/NOTES) and found them wanting. He’d already tinkered with a private system called ENQUIRE at CERN in 1980. Then he stumbled upon Ted Nelson’s hypertext ideas from 1965, where documents could be linked willy-nilly through “hot spots.” This confirmed his own direction. [13] [14]
The historic World Wide Web logo, designed by Robert Cailliau. You won’t see it much anymore.
Later, Apple made hypertext popular with HyperCard. But Berners-Lee’s vision was grander from the start: links across independent computers, accessible by anyone on the Internet. He also envisioned handling more than just text: graphics, sound, video. Links could point to data files or even trigger programs on the server. He planned “gateways” to access other systems, like traditional file systems or Usenet. Crucially, he insisted on decentralization. No single entity in control. [5] [15] [11] [12]
Berners-Lee pitched his proposal to CERN in May 1989. No name yet. By the end of 1990, he had a working system: a browser called WorldWideWeb (which also became the project’s name and the network’s name), and an HTTP server running at CERN. He defined the first HTTP protocol, the basic URL structure, and essentially cemented HTML as the primary document format. [16] It trickled out to other institutions in early 1991 and then, on August 23, 1991, to the entire Internet. It caught on, slowly at first, spreading through scientific and academic circles. Within two years, there were already 50 websites . [17] [18]
CERN did everyone a favor and made the Web’s code and protocols available for free on April 30, 1993. That’s when things really took off. [19] [20] [21] Then came Mosaic, released by the NCSA later that year. It was graphical, could display images inline, and handle forms. Suddenly, the Web was accessible. Thousands of websites sprouted within a year. [22] [23] Marc Andreessen and Jim Clark founded Netscape the next year, giving us the Navigator browser. They threw in Java and JavaScript, and it dominated. Netscape’s IPO in 1995 kicked off the dot-com bubble. Microsoft’s answer? Internet Explorer, bundled with Windows, which held the crown for 14 years. [27] [26]
Berners-Lee, meanwhile, started the World Wide Web Consortium (W3C). They churned out XML in 1996 and pushed for XHTML to replace HTML. But the browser developers, particularly those pushing Ajax with XMLHttpRequest , had other ideas. Mozilla , Opera , and Apple backed the WHATWG , which championed HTML5 . The W3C eventually conceded, abandoning XHTML in 2009 and handing over HTML specification control to WHATWG in 2019. [29] [30] [31]
The Web, in essence, became the engine of the Information Age , the primary way billions connect online. [32] [33] [34] [10]
Nomenclature
Tim Berners-Lee himself insists it’s “World Wide Web”: three separate words, no hyphens, each one capitalized. [35] The “www” prefix, though, is fading. We’re more likely to say “Gmail” or “Facebook” than “www.gmail.com.” As the mobile web exploded, [36] those prefixes became cumbersome. [37]
In English, it’s usually “double-u double-u double-u.” New Zealanders apparently prefer “dub-dub-dub.” [38] [39] And Douglas Adams once quipped, “The World Wide Web is the only thing I know of whose shortened form takes three times longer to say than what it’s short for.” [41] Wise words.
Function
• Main articles: HTTP and HTML
Think of the World Wide Web as an application layer protocol running on the Internet, making it actually useful. Mosaic was the spark that lit the fire, making the Web visually appealing with inline images and such.
People often conflate the Internet and the World Wide Web. They’re not the same. The Internet is the global network of computer networks, the physical infrastructure. The Web is the collection of documents and resources on that network, linked by URIs and accessed via HTTP or HTTPS. [3]
To view a web page , you either type its URL into a browser or click a hyperlink . The browser then embarks on a silent ballet of requests and responses to fetch and display the page. In the 90s, this process was dubbed “browsing” or “web surfing.” Early studies tried to categorize this new behavior, identifying patterns like “exploratory surfing” and “targeted navigation.” [42]
Let’s take http://example.org/home.html as an example. Your browser first needs to translate example.org into an Internet Protocol address using the Domain Name System (DNS). Say it gets 203.0.113.4. Then, it sends an HTTP request across the Internet to that address, specifically to port 80 (or 443 for HTTPS). The request itself is simple:
GET /home.html HTTP/1.1
Host: example.org
The server at example.org receives this on port 80 and, if it can fulfill the request, sends back a response:
HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
Followed by the actual HTML content of the page. A basic HTML page might look like this:
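<html>
  <head>
    <title>Example.org – The World Wide Web</title>
  </head>
  <body>
    <p>The World Wide Web, abbreviated as WWW and commonly known ...</p>
  </body>
</html>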
Your browser then parses this HTML, interpreting tags like <title> and <p> to format the text. Often, HTML pages reference other resources: images, scripts, Cascading Style Sheets for layout. The browser makes more HTTP requests for these. As it receives them, it progressively renders the page.
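The exchange described above can be sketched in a few lines of Python. This is a rough illustration, not how any real browser is written: the helper names `build_get_request` and `parse_response` are made up for this example, the `Connection: close` header is an added assumption, and the DNS lookup and socket handling are left out so the sketch runs without a network.

```python
from urllib.parse import urlsplit

# Default ports named in the text: 80 for HTTP, 443 for HTTPS.
DEFAULT_PORTS = {"http": 80, "https": 443}


def build_get_request(url):
    """Return (host, port, request_text) for a simple HTTP/1.1 GET."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"  # assumption: ask the server to close when done
        "\r\n"
    )
    return host, port, request


def parse_response(raw):
    """Split a raw HTTP response into (status_code, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    status = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body


host, port, request = build_get_request("http://example.org/home.html")
status, headers, body = parse_response(
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "\r\n"
    "<html></html>"
)
```

In a live client, the host name would first be resolved (for instance with `socket.getaddrinfo`) and the request text sent over a TCP connection to the resulting address and port.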
HTML
• Main article: HTML
Hypertext Markup Language (HTML) is the bedrock of web pages and web applications. It’s the first part of the cornerstone trio: HTML, CSS, and JavaScript. [43]
Browsers fetch HTML from a web server or local storage and render it into what you see. HTML defines the page’s structure semantically and, originally, hinted at its appearance.
HTML elements are the fundamental building blocks. They can embed images, forms, and more. HTML structures text with headings, paragraphs, lists, links, and so on, using tags enclosed in angle brackets. Tags like <img /> and <input /> insert content directly. Others, like <p>, wrap around text to define its meaning. Browsers don’t show the tags; they use them to interpret the page.
HTML can also embed scripting languages like JavaScript to control page behavior and content. Cascading Style Sheets dictate the look and layout. The World Wide Web Consortium (W3C) has been pushing CSS over explicit HTML styling since 1997. [44]
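To make the "browsers use tags to interpret the page" point concrete, here is a minimal sketch using Python's standard `html.parser` module. The class name `OutlineParser` and the sample markup are invented for illustration; a real browser engine does vastly more than this.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect the <title> text and the text of each <p> element."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._current = None  # "title", "p", or None

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._current = "title"
        elif tag == "p":
            self._current = "p"
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        # Inline tags such as <a> leave the current context untouched.
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "p":
            self.paragraphs[-1] += data


SAMPLE = (
    "<html><head><title>Example.org</title></head>"
    "<body><p>Hello, <a href='/x'>Web</a>.</p></body></html>"
)

parser = OutlineParser()
parser.feed(SAMPLE)
```

After `feed`, `parser.title` holds the title text and `parser.paragraphs` holds one string per paragraph, with the tags themselves gone, which is roughly what "don't show the tags" means.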
Linking
Most web pages are riddled with hyperlinks to other pages, files, definitions, whatever. In HTML, a link looks like this:
<a href="http://example.org/home.html">Example.org Homepage</a>
This interconnected web of information is, well, the World Wide Web. Tim Berners-Lee first called it the WorldWideWeb in November 1990, using CamelCase that’s since been dropped. [45]
The structure of these links forms a webgraph, where nodes are pages and edges are links. But links break. Resources vanish. Content changes. This is “link rot”. Efforts like the Internet Archive, active since 1996, try to preserve this ephemeral landscape.
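The webgraph idea is easy to sketch: pull the `href` values out of each page and record them as edges. The code below is an illustrative toy, with hypothetical helper names (`LinkExtractor`, `build_webgraph`, `unresolved_targets`) and two made-up pages under example.org. A target missing from the sample set is only a candidate for link rot here; in reality it may simply live outside the sample.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))


def build_webgraph(pages):
    """Map each page URL (node) to the list of URLs it links to (edges)."""
    graph = {}
    for url, html in pages.items():
        extractor = LinkExtractor(url)
        extractor.feed(html)
        graph[url] = extractor.links
    return graph


def unresolved_targets(graph):
    """Link targets that are not among the known pages."""
    return sorted({t for targets in graph.values() for t in targets if t not in graph})


PAGES = {
    "http://example.org/home.html": (
        '<a href="about.html">About</a> '
        '<a href="http://example.org/gone.html">Old page</a>'
    ),
    "http://example.org/about.html": '<a href="/home.html">Home</a>',
}

graph = build_webgraph(PAGES)
missing = unresolved_targets(graph)
```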
www prefix
Many web hostnames start with “www.” It’s a convention, like naming an FTP server “ftp” or a Usenet server “news.” These are subdomains within the Domain Name System (DNS), like www.example.com. It’s not a rule, though. The very first web server was nxoc01.cern.ch. [46] Apparently, the “www” subdomain was an accident; the project page was meant for www.cern.ch, but the DNS records were never swapped. [47] [better source needed] Some sites use other subdomains, such as www2, secure, or en, for specific functions. Sometimes example.com and www.example.com point to the same place; sometimes they don’t. Using subdomains is useful for load balancing by pointing a CNAME record to a cluster of servers. You can’t do that with the bare domain root. [48] [dubious – discuss]
Some browsers, if you type in an incomplete domain, will automatically try adding “www.” and common suffixes like “.com” or “.org.” Entering “microsoft” might become http://www.microsoft.com/. This feature appeared in early Firefox versions. [49] [unreliable source?] Microsoft apparently patented something similar for mobile devices. [50]
Scheme specifiers
The http:// and https:// at the start of a URI tell your browser which protocol to use: the standard Hypertext Transfer Protocol or its encrypted sibling, HTTP Secure. HTTPS is crucial for sensitive data like passwords and bank details. Browsers often assume http:// if you forget it. [citation needed]
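As a small illustration of what the scheme buys you, the sketch below reads the scheme off a URL and picks the matching default port. The `normalize` fallback mirrors the "assume http://" behaviour mentioned above, but it is an assumption for the example, not a description of any particular browser.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize(typed):
    """Prepend http:// when no scheme was typed (assumed fallback)."""
    return typed if "://" in typed else "http://" + typed


def scheme_and_port(url):
    """Return the URL's scheme and the port a client would connect to."""
    parts = urlsplit(url)
    return parts.scheme, parts.port or DEFAULT_PORTS.get(parts.scheme)


plain = scheme_and_port(normalize("example.org/home.html"))
secure = scheme_and_port("https://example.org/")
custom = scheme_and_port("http://example.org:8080/")
```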
Pages
• Main article: Web page
A web page is, quite simply, a document meant for the World Wide Web and its browsers. It’s what you see on your monitor or mobile device.
“Web page” usually means what’s visible, but it can also refer to the computer file itself, typically a text file with hypertext markup, usually HTML. Pages contain hyperlinks to other pages. Browsers often need to fetch multiple web resources (style sheets, scripts, images) to display a single page.
On a network, browsers retrieve pages from a remote web server . This server might be private, like on a corporate intranet . The browser uses HTTP to ask for these pages.
There are two main types: static and dynamic.
Static page
• Main article: Static web page
A static web page (or flat page, or stationary page) is delivered exactly as it’s stored. No web application generating it on the fly. [120]
This means it shows the same thing to everyone, unless the server is configured to serve different language or content-type versions.
Dynamic pages
• Main articles: Dynamic web page and Ajax (programming)
A dynamic page is shaped by server-side scripts. PHP and MySQL are common tools here. Parameters dictate how the page is assembled for each request.
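Here is a minimal sketch of "parameters dictate how the page is assembled," written in Python rather than PHP purely for illustration. The function name `render_greeting` and the `name` and `lang` parameters are hypothetical.

```python
from html import escape
from urllib.parse import parse_qs

GREETINGS = {"en": "Hello", "fr": "Bonjour"}


def render_greeting(query_string):
    """Assemble a page from request parameters, with defaults for missing ones."""
    params = parse_qs(query_string)
    name = params.get("name", ["visitor"])[0]
    lang = params.get("lang", ["en"])[0]
    greeting = GREETINGS.get(lang, GREETINGS["en"])
    # escape() keeps user-supplied text from being interpreted as markup.
    return f"<html><body><p>{greeting}, {escape(name)}!</p></body></html>"


french = render_greeting("name=Ada&lang=fr")
default = render_greeting("")
```

The same URL path yields a different document for each query string, which is the whole difference from a static page.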
Client-side dynamic pages use JavaScript running in the browser to modify the page’s Document Object Model (DOM). This can happen without a full page reload, especially with Ajax techniques. Ajax allows a single page to update its content dynamically, making it feel more like an application than a static document. This often means your web browsing history isn’t updated in the traditional sense for those specific dynamic updates.
Dynamic HTML, or DHTML, was an umbrella term for these technologies, though Ajax has largely superseded it. Client-side scripts, server-side scripts, or both, create the dynamic experience. [citation needed]
JavaScript, created by Brendan Eich at Netscape in 1995, is the language of choice for interactivity. [51] The standardized version is ECMAScript. [51] Ajax, using asynchronous JavaScript and XML, allows pages to fetch data from the server without interrupting the user. Scripts can make HTTP requests, modify the page’s DOM, or even poll the server for updates. [52]
Website
The usap.gov website. Just... a website.
• Main article: Website
A website is basically a collection of related web pages and multimedia content, all tied together by a common domain name and hosted on at least one web server. Think wikipedia.org, google.com, amazon.com.
You access them via the public Internet Protocol (IP) network, the Internet, or a private local area network (LAN), using a uniform resource locator.
Websites serve countless purposes: personal pages, corporate sites, government portals, organization hubs. They’re usually focused on a specific topic: entertainment, social networking, news, education. All the publicly accessible ones make up the World Wide Web. Private ones, like a company’s internal site, are part of an intranet.
Web pages are built with plain text and formatting instructions like HTML or XHTML. They can pull in elements from other sites. Accessing them involves HTTP, sometimes secured with HTTPS. Your web browser then interprets the markup and displays it.
Hyperlinking guides you through the site, often starting at the home page. Some sites require registration or subscription. Think news sites, academic journals, gaming platforms, message boards, webmail, social networking sites. You can access them on anything from a desktop computer to a smartphone.
Browser
• Main article: Web browser
A web browser, or just “browser,” is the software you use to navigate the Web. It’s your user agent. You need one to connect to a server and display its pages. It handles downloading, formatting, and rendering.
Besides finding and displaying pages, browsers usually offer bookmarks, history, cookie management, and home pages. Some even remember your passwords.
The big players right now? Chrome, Safari, Edge, Samsung Internet, and Firefox. [54]
Server
• Main article: Web server
This is a Dell PowerEdge server. Looks important.
A Web server is the server software , or the hardware running it, that responds to World Wide Web client requests. It can host one or many websites. It speaks HTTP and related protocols.
Its main job is to store, process, and deliver web pages to clients . [55] Communication happens via HTTP. The delivered content is usually HTML documents , but can include images , style sheets , and scripts .
For high-traffic sites, multiple servers work together. These are Dell servers, powering the Wikimedia Foundation .
Your web browser or web crawler sends an HTTP request. The server responds with the requested content or an error message . The content is usually a file on the server’s secondary storage , but not always.
Servers can also receive data from clients, used for web forms and file uploads.
Many servers support scripting languages like ASP or PHP . This means the server’s behavior can be customized without changing the core software. This is often used to generate HTML documents dynamically on the fly, rather than serving static documents . Dynamic generation is slower but essential for personalized content, often pulling data from databases . Static is faster and easier to cache .
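The static-versus-dynamic trade-off can be sketched with a tiny WSGI application, the standard Python interface between web servers and application code. Everything here is illustrative: the paths, the `hits` counter, and the `call` test helper are made up, and the `Cache-Control` values are just one plausible choice.

```python
from wsgiref.util import setup_testing_defaults

# Static content: stored once, served as-is, safe to cache.
STATIC_PAGES = {"/": b"<html><body><p>Home</p></body></html>"}

# State for the dynamic page: it changes on every request.
hits = {"count": 0}


def app(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    if path in STATIC_PAGES:
        body = STATIC_PAGES[path]
        start_response("200 OK", [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Cache-Control", "max-age=3600"),
        ])
    elif path == "/visits":
        hits["count"] += 1
        body = f"<html><body><p>Visit {hits['count']}</p></body></html>".encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Cache-Control", "no-store"),
        ])
    else:
        body = b"Not Found"
        start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [body]


def call(path):
    """Invoke the app directly, without opening a network socket."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

To serve it for real, something like `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` would do for local experiments.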
Servers are also embedded in smaller devices (printers, routers, webcams), often serving only a local network for administration. This means you only need a browser, which is usually built into your operating system.
Optical Networking
Optical networking is the backbone of modern communication, using fiber optics to transmit data globally. It relies on complex components like tunable lasers , filters, and switches to manage these high-speed networks. [56] [57]
The vast network of optical fiber laid down in the late 20th century underpins the Internet. This “information highway” uses light pulses to carry data. [58]
Early iterations like the ARPANET , established in 1969, connected universities and researchers. [59] [60] [61] [62] But access was restricted. The National Science Foundation created the NSFNET in 1985 to provide broader supercomputer access. [62]
Public demand for Internet access grew, leading to pressure for privatization. In 1993, the National Information Infrastructure Act mandated the NSF to transfer control to commercial entities. [63] [64]
The privatization and the public release of the World Wide Web in 1993 triggered an explosion in demand. Developers scrambled to find ways to increase data transmission speeds and reduce costs, like improving fiber capacity. [65] [66] [67] [68]
In 1994, Pirelli’s optical division introduced a wavelength-division multiplexing (WDM) system, allowing more data to be sent simultaneously over a single fiber. [69] [70]
Ciena Corporation, founded by David Huber and Kevin Kimberlin in 1992, also developed WDM technology. Drawing on laser expertise, Ciena focused on optical amplifiers. [71] [72] [73] Their dual-stage amplifier for dense WDM (DWDM) was patented in 1997 and deployed by Sprint in 1996. [77] [78] [79] [80] [81]
Cookie
• Main article: HTTP cookie
An HTTP cookie, or web cookie, is a small piece of data sent from a website and stored by your browser. They were designed to remember things: items in a shopping cart, login status, browsing activity. They can even store information you’ve typed into forms.
Cookies are essential for the modern web. Authentication cookies, for instance, tell servers if you’re logged in. Without them, sites couldn’t send you sensitive data or ask you to log in. Cookie security depends on the website and browser, and whether the data is encrypted. Weaknesses can lead to data theft or unauthorized access, as seen in cross-site scripting and cross-site request forgery attacks. [82]
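A rough sketch of both halves of the cookie exchange, using Python's standard `http.cookies` module. The cookie name `session` and the helper names are assumptions for the example. The `HttpOnly`, `Secure`, and `SameSite` attributes are common hardening measures against the script-based and cross-site attacks mentioned above, not a complete defence.

```python
from http.cookies import SimpleCookie


def make_session_cookie(session_id):
    """Build the value of a Set-Cookie header for a login session."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["path"] = "/"
    cookie["session"]["httponly"] = True   # hide the cookie from page scripts
    cookie["session"]["secure"] = True     # send it over HTTPS only
    cookie["session"]["samesite"] = "Lax"  # limit cross-site sending
    return cookie["session"].OutputString()


def read_cookies(cookie_header):
    """Parse the Cookie header a browser sends back on later requests."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    return {name: morsel.value for name, morsel in jar.items()}


set_cookie_value = make_session_cookie("abc123")
received = read_cookies("session=abc123; theme=dark")
```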
Tracking cookies, especially third-party ones, raise privacy concerns by compiling long-term browsing histories. This prompted legislation in Europe and the US. [83] [84] [85] EU law requires informed consent for storing non-essential cookies.
Jann Horn of Google’s Project Zero highlighted how cookies can be intercepted by network intermediaries. He recommends using private browsing mode (like Incognito mode ) in such cases. [86]
Search engine
• Main article: Search engine
Results from a web image search engine for “lunar eclipse.” Fascinating.
A web search engine is a software system designed to systematically search the World Wide Web for information based on a web search query . Results are typically presented as a list, often called search engine results pages (SERPs). These can include web pages , images, videos, and more. Some engines also mine databases or web directories . Unlike human-edited directories, search engines use algorithms run by web crawlers to maintain real-time information. Content not indexed by these engines is the deep web .
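Real search engines are enormously more elaborate, but the core idea of indexing crawled pages and answering queries against that index can be sketched with a toy inverted index. The page texts, URLs, and function names below are invented for illustration, and there is no ranking at all.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs of pages containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


PAGES = {
    "http://example.org/moon": "Total lunar eclipse tonight",
    "http://example.org/sun": "Solar eclipse guide",
}

index = build_index(PAGES)
lunar = search(index, "lunar eclipse")
```

Anything a crawler never reached simply has no entry in `index`, which is the toy version of the deep web.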
Archie , the first search engine, launched in 1990. It indexed File Transfer Protocol (FTP) sites. [87] [88] More advanced engines like Yahoo! (1995) and Google (1998) followed. [89] [90]
Deep web
• Main article: Deep web
The deep web, invisible web, or hidden web: these are the parts of the World Wide Web not indexed by standard search engines. [91] [92] [93] The opposite is the surface web. [94] Computer scientist Michael K. Bergman coined “deep web” in 2001. [95]
Deep web content is hidden behind HTTP forms [96] [97] and includes common services like web mail , online banking , and paid content behind paywalls , such as video on demand .
You can access deep web content with a direct URL or IP address , but it often requires a password or other authentication.
Caching
A web cache is a server that stores recently accessed web pages to speed up responses for repeated requests. Browsers also have browser caches for local storage. Browsers can request only changed data. Pages can specify expiration times to control caching, especially for sensitive data like in online banking . Website designers often group resources like CSS and JavaScript into single files for efficient caching. Enterprise firewalls can cache web resources for multiple users. Some search engines cache frequently accessed sites.
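Expiration-based caching can be sketched in a few lines. The class below is a simplified model, not how any particular browser or proxy works: `fetch` stands in for the network request and is assumed to return the body together with a lifetime in seconds, where zero means "do not cache" (as a bank page might demand).

```python
import time


class TinyWebCache:
    """Keep fetched pages until the lifetime they came with runs out."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch    # callable: url -> (body, max_age_seconds)
        self._clock = clock    # injectable so expiry can be tested
        self._store = {}       # url -> (body, expires_at)

    def get(self, url):
        now = self._clock()
        cached = self._store.get(url)
        if cached and now < cached[1]:
            return cached[0], "hit"
        body, max_age = self._fetch(url)
        if max_age > 0:
            self._store[url] = (body, now + max_age)
        else:
            self._store.pop(url, None)  # never keep uncacheable content
        return body, "miss"


fetched = []
fake_now = [0.0]


def fake_fetch(url):
    """Hypothetical origin server: style sheets live 60 s, everything else 0 s."""
    fetched.append(url)
    return "content of " + url, (60 if url.endswith(".css") else 0)


cache = TinyWebCache(fake_fetch, clock=lambda: fake_now[0])
```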
Security
The Web is a playground for criminals. Malware, identity theft, fraud, espionage: it’s all here. [98] Web-based vulnerabilities now surpass traditional computer security issues. [99] [100] Google estimates about 10% of web pages contain malicious code. [101] Most attacks happen on legitimate sites, often hosted in the US, China, and Russia, according to Sophos. [102] SQL injection is the most common malware threat. [103] HTML and URIs opened the door to attacks like cross-site scripting (XSS) with the rise of JavaScript, [104] further amplified by Web 2.0 and Ajax. [105] In 2007, an estimated 70% of websites were vulnerable to XSS. [106] Phishing is another major threat; global losses were estimated at $1.5 billion in 2012. [107] Covert redirects and open redirects are common phishing tactics.
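Two of the attacks named above have well-known basic countermeasures, sketched here with Python's standard library: parameterized queries against SQL injection and output escaping against XSS. The `users` table, the helper names, and the sample inputs are invented for the example, and neither measure is a complete defence on its own.

```python
import sqlite3
from html import escape


def find_user(conn, name):
    """Look up a user; the ? placeholder sends `name` as data, never as SQL."""
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def render_comment(text):
    """Escape user text so the browser shows it instead of running it."""
    return f"<p>{escape(text)}</p>"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)", [(1, "alice"), (2, "bob")])

honest = find_user(conn, "alice")
attack = find_user(conn, "' OR '1'='1")  # classic injection attempt
safe_html = render_comment("<script>alert(1)</script>")
```

Had the query been built by pasting `name` into the SQL string, the second lookup would have matched every row.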
Solutions vary. Security companies like McAfee offer compliance tools. Finjan Holdings advocates real-time inspection of code. [98] Some see security as a business opportunity, not a cost. [109] Others call for “ubiquitous, always-on digital rights management ” infrastructure. [110] Jonathan Zittrain believes shared user responsibility for security is better than locking down the Internet. [111]
Privacy
• Main article: Internet privacy
Every time you request a page, the server sees your IP address. Servers log these. Browsers usually keep a history and cache content locally. If your connection isn’t encrypted with HTTPS, your requests travel in plain text, visible to anyone in between. A virtual private network (VPN) can encrypt traffic and mask your IP, reducing tracking.
When you supply personal information (name, address, email), websites can link your traffic to you. Using HTTP cookies, logins, or other trackers, they can connect current and past visits. This allows organizations to build detailed profiles: interests, profession, demographics. These profiles are valuable to marketers and advertisers. Depending on terms of service and local laws, this data might be sold or shared. For most, it means targeted ads or spam. For some, it can lead to unwanted attention based on niche interests. Law enforcement and intelligence agencies can also track individuals.
Social networking sites often encourage real names and details, believing it enhances engagement. However, uploaded photos or careless posts can be linked to individuals, sometimes with regret. Employers, schools, and family might see things you didn’t intend them to. Cyberbullies can exploit personal information for harassment or stalking . While privacy settings exist, they can be complex. Photos and videos pose particular issues, especially with advancing facial recognition technology . Removing content from the Web is nearly impossible due to caching and mirroring.
Standards
• Main article: Web standards
Web standards encompass specifications that govern the Internet and the Web, affecting interoperability, accessibility, and usability.
They include:
- Recommendations from the World Wide Web Consortium (W3C). [113]
- The “Living Standard” from the Web Hypertext Application Technology Working Group (WHATWG).
- Request for Comments (RFC) documents from the Internet Engineering Task Force (IETF). [114]
- Standards from the International Organization for Standardization (ISO). [115]
- Standards from Ecma International . [116]
- The Unicode Standard and reports from the Unicode Consortium . [117]
- Registries maintained by the Internet Assigned Numbers Authority (IANA). [118]
These standards evolve; they aren’t static. They’re developed by standards organizations, not single entities. It’s important to distinguish draft specifications from finalized ones.
Accessibility
• Main article: Web accessibility
Alternative methods and formats exist to make the Web accessible to people with disabilities: visual, auditory, physical, cognitive, etc. These features also help those with temporary impairments or ageing users. [120] The W3C stresses that universal access is crucial for equal opportunity. [121] Tim Berners-Lee himself said, “The power of the Web is in its universality. Access by everyone regardless of disability is an essential aspect.” [120] Many countries mandate web accessibility. [122] The W3C’s Web Accessibility Initiative provides guidelines for creators and developers to ensure the Web is usable by all, including those using assistive technology. [120] [123]
Internationalisation
• Main article: Internationalisation and localization
The W3C’s Internationalisation Activity ensures web technologies work across languages, scripts, and cultures. [124] Unicode became dominant around 2007, surpassing ASCII. [125] While RFC 3986 originally limited URIs to a subset of US-ASCII, RFC 3987 allows any Universal Character Set character, enabling IRIs in any language. [126]
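To see how a non-ASCII IRI can still travel as an ASCII-only URI, here is a simplified conversion sketch. The function name `iri_to_uri` and the example address are assumptions, userinfo in the address is ignored, and Python's built-in `idna` codec follows the older IDNA 2003 rules, so treat this as an approximation of the mapping rather than a conforming implementation.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri):
    """Convert an IRI to an ASCII-only URI (simplified sketch)."""
    parts = urlsplit(iri)
    # Host names are mapped to Punycode labels ("xn--...").
    host = parts.hostname.encode("idna").decode("ascii")
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    # Other components are UTF-8 encoded, then percent-escaped.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


converted = iri_to_uri("http://bücher.example/café")
unchanged = iri_to_uri("http://example.org/home.html")
```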
See also
- Decentralized web
- Electronic publishing
- Electronic literature
- Gopher (protocol), an early alternative to the WWW
- Internet metaphors
- Internet security
- Lists of websites
- Minitel, a predecessor of the WWW
- Streaming media
- Web 1.0
- Web 2.0
- Web 3.0
- Web3
- Web3D
- Web development tools
- Web literacy