World Wide Web

World Wide Web
The Web's logo designed by Robert Cailliau
Inventor	Tim Berners-Lee, Robert Cailliau
Company	CERN
Availability	Worldwide

The World Wide Web (abbreviated as WWW or W3, commonly known as the Web) is a system of interlinked hypertext documents accessed via the Internet. With a web browser, one can view web pages that may contain text, images, videos, and other multimedia, and navigate between them via hyperlinks.

Sir Tim Berners-Lee, a British engineer and computer scientist who was then an employee of CERN and is now Director of the World Wide Web Consortium (W3C), wrote a proposal in March 1989 for what would eventually become the World Wide Web, drawing on concepts from his earlier hypertext systems such as ENQUIRE. At CERN, a European research organisation near Geneva situated on Swiss and French soil, Berners-Lee and Belgian computer scientist Robert Cailliau proposed in 1990 to use hypertext "to link and access information of various kinds as a web of nodes in which the user can browse at will", and they publicly introduced the project in December of the same year.

History

The NeXT Computer used by Berners-Lee. The handwritten label declares, "This machine is a server. DO NOT POWER IT DOWN!!"

In the May 1970 issue of Popular Science magazine, Arthur C. Clarke predicted that satellites would someday "bring the accumulated knowledge of the world to your fingertips" using a console that would combine the functionality of the photocopier, telephone, television and a small computer, allowing data transfer and video conferencing around the globe.

In March 1989, Tim Berners-Lee wrote a proposal that referenced ENQUIRE, a database and software project he had built in 1980, and described a more elaborate information management system.

With help from Robert Cailliau, he published a more formal proposal (on 12 November 1990) to build a "Hypertext project" called "WorldWideWeb" (one word, also "W3") as a "web" of "hypertext documents" to be viewed by "browsers" using a client–server architecture. This proposal estimated that a read-only web would be developed within three months and that it would take six months to achieve "the creation of new links and new material by readers, [so that] authorship becomes universal" as well as "the automatic notification of a reader when new material of interest to him/her has become available." While the read-only goal was met, accessible authorship of web content took longer to mature, with the wiki concept, blogs, Web 2.0 and RSS/Atom.

The proposal was modeled after the Dynatext SGML reader by Electronic Book Technology, a spin-off from the Institute for Research in Information and Scholarship at Brown University. The Dynatext system, licensed by CERN, was technically advanced and was a key player in the extension of SGML ISO 8879:1986 to Hypermedia within HyTime, but it was considered too expensive and had an inappropriate licensing policy for use in the general high energy physics community, namely a fee for each document and each document alteration.

The CERN datacenter in 2010 housing some WWW servers

A NeXT Computer was used by Berners-Lee as the world's first web server and also to write the first web browser, WorldWideWeb, in 1990. By Christmas 1990, Berners-Lee had built all the tools necessary for a working Web: the first web browser (which was a web editor as well); the first web server; and the first web pages, which described the project itself. On 6 August 1991, he posted a short summary of the World Wide Web project on the alt.hypertext newsgroup. This date also marked the debut of the Web as a publicly available service on the Internet. Many news media have reported that the first photo on the Web was uploaded by Berners-Lee in 1992, an image of the CERN house band Les Horribles Cernettes taken by Silvano de Gennaro; de Gennaro has disclaimed this story, writing that media were "totally distorting our words for the sake of cheap sensationalism." The first server outside Europe was set up at the Stanford Linear Accelerator Center (SLAC) in Palo Alto, California, to host the SPIRES-HEP database. Accounts differ substantially as to the date of this event. The World Wide Web Consortium says December 1992, whereas SLAC itself claims 1991. The 1991 date is supported by a W3C document titled A Little History of the World Wide Web.

The crucial underlying concept of hypertext originated with older projects from the 1960s, such as the Hypertext Editing System (HES) at Brown University, Ted Nelson's Project Xanadu, and Douglas Engelbart's oN-Line System (NLS). Both Nelson and Engelbart were in turn inspired by Vannevar Bush's microfilm-based "memex", which was described in the 1945 essay "As We May Think".

Berners-Lee's breakthrough was to marry hypertext to the Internet. In his book Weaving The Web, he explains that he had repeatedly suggested that a marriage between the two technologies was possible to members of both technical communities, but when no one took up his invitation, he finally tackled the project himself. In the process, he developed three essential technologies:

a system of globally unique identifiers for resources on the Web and elsewhere, the universal document identifier (UDI), later known as uniform resource locator (URL) and uniform resource identifier (URI);
the publishing language HyperText Markup Language (HTML);
the Hypertext Transfer Protocol (HTTP).

The World Wide Web had a number of differences from other hypertext systems that were then available. The Web required only unidirectional links rather than bidirectional ones. This made it possible for someone to link to another resource without action by the owner of that resource. It also significantly reduced the difficulty of implementing web servers and browsers (in comparison to earlier systems), but in turn presented the chronic problem of link rot. Unlike predecessors such as HyperCard, the World Wide Web was non-proprietary, making it possible to develop servers and clients independently and to add extensions without licensing restrictions. On 30 April 1993, CERN announced that the World Wide Web would be free to anyone, with no fees due. Coming two months after the announcement that the server implementation of the Gopher protocol was no longer free to use, this produced a rapid shift away from Gopher and towards the Web. An early popular web browser was ViolaWWW for Unix and the X Window System.

Robert Cailliau, Jean-François Abramatic of IBM, and Tim Berners-Lee at the 10th anniversary of the World Wide Web Consortium.

Scholars generally agree that a turning point for the World Wide Web came with the introduction of the Mosaic web browser in 1993, a graphical browser developed by a team at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign (NCSA-UIUC), led by Marc Andreessen. Funding for Mosaic came from the U.S. High-Performance Computing and Communications Initiative and the High Performance Computing and Communication Act of 1991, one of several computing developments initiated by U.S. Senator Al Gore. Prior to the release of Mosaic, graphics were not commonly mixed with text in web pages and the Web was less popular than older protocols in use over the Internet, such as Gopher and Wide Area Information Servers (WAIS). Mosaic's graphical user interface allowed the Web to become, by far, the most popular Internet protocol.

The World Wide Web Consortium (W3C) was founded by Tim Berners-Lee after he left the European Organization for Nuclear Research (CERN) in October 1994. It was founded at the Massachusetts Institute of Technology Laboratory for Computer Science (MIT/LCS) with support from the Defense Advanced Research Projects Agency (DARPA), which had pioneered the Internet; a year later, a second site was founded at INRIA (a French national computer research lab) with support from the European Commission DG InfSo; and in 1996, a third continental site was created in Japan at Keio University. By the end of 1994, while the total number of websites was still minute compared to present standards, quite a number of notable websites were already active, many of which are the precursors or inspiration for today's most popular services.

Connected by the existing Internet, other websites were created around the world, adding international standards for domain names and HTML. Since then, Berners-Lee has played an active role in guiding the development of web standards (such as the markup languages in which web pages are composed), and in recent years has advocated his vision of a Semantic Web. The World Wide Web enabled the spread of information over the Internet through an easy-to-use and flexible format. It thus played an important role in popularizing use of the Internet. Although the two terms are sometimes conflated in popular use, World Wide Web is not synonymous with Internet. The Web is a collection of documents and both client and server software using Internet protocols such as TCP/IP and HTTP. Tim Berners-Lee was knighted in 2004 by Queen Elizabeth II for his contribution to the World Wide Web.

Function

The terms Internet and World Wide Web are often used in everyday speech without much distinction. However, the Internet and the World Wide Web are not the same. The Internet is a global system of interconnected computer networks. In contrast, the Web is one of the services that runs on the Internet. It is a collection of text documents and other resources, linked by hyperlinks and URLs, usually accessed by web browsers from web servers. In short, the Web can be thought of as an application "running" on the Internet.

Viewing a web page on the World Wide Web normally begins either by typing the URL of the page into a web browser or by following a hyperlink to that page or resource. The web browser then initiates a series of communication messages, behind the scenes, in order to fetch and display it. As an example, consider accessing a page with the URL http://example.org/wiki/World_Wide_Web.
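
As a rough illustration, the parts of that URL which matter for the steps that follow can be separated with Python's standard urllib.parse module. This is only a sketch: the helper name describe_url and the table of default ports are assumptions made for the example, not part of any browser.

```python
from urllib.parse import urlsplit

# Hypothetical helper: split a URL into the pieces a browser needs first.
def describe_url(url: str) -> dict:
    parts = urlsplit(url)
    # Well-known default ports, used when the URL does not name one.
    default_ports = {"http": 80, "https": 443}
    return {
        "scheme": parts.scheme,    # which protocol to speak
        "host": parts.hostname,    # the server name to look up in DNS
        "port": parts.port or default_ports.get(parts.scheme),
        "path": parts.path or "/", # the resource to ask the server for
    }

info = describe_url("http://example.org/wiki/World_Wide_Web")
print(info)
```

The host is what gets resolved to an IP address, and the path is what appears in the request sent to the server.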

First, the browser resolves the server-name portion of the URL (example.org) into an Internet Protocol address using the globally distributed database known as the Domain Name System (DNS); this lookup returns an IP address such as 208.80.152.2. The browser then requests the resource by sending an HTTP request across the Internet to the computer at that particular address. It makes the request to a particular application port in the underlying Internet Protocol Suite so that the computer receiving the request can distinguish an HTTP request from other network protocols it may be servicing such as e-mail delivery; the HTTP protocol normally uses port 80. The content of the HTTP request can be as simple as the two lines of text

GET /wiki/World_Wide_Web HTTP/1.1
Host: example.org
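
A minimal sketch of how a program might assemble and send these two lines, using only Python's standard socket module. The function names are hypothetical. On the wire each line ends with a carriage return and line feed, and an empty line marks the end of the request headers. The second function needs network access and is shown only to connect the DNS lookup, the port and the request; it is not a complete HTTP client.

```python
import socket

def build_get_request(host: str, path: str) -> bytes:
    """Return the two-line GET request, terminated by an empty line."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "",  # the empty line that ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def fetch_status_line(host: str, path: str, port: int = 80,
                      timeout: float = 5.0) -> str:
    """Send the request and return the first line of the reply.

    Assumes the status line arrives in the first chunk received.
    """
    request = build_get_request(host, path)
    # create_connection resolves the host name and opens a TCP connection.
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(request)
        reply = conn.recv(4096)
    return reply.split(b"\r\n", 1)[0].decode("iso-8859-1")

request_bytes = build_get_request("example.org", "/wiki/World_Wide_Web")
print(request_bytes.decode("ascii"))
```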

The computer receiving the HTTP request delivers it to web server software listening for requests on port 80. If the web server can fulfill the request it sends an HTTP response back to the browser indicating success, which can be as simple as

HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8

followed by the content of the requested page. The Hypertext Markup Language for a basic web page looks like

<html>
  <head>
    <title>Example.org – The World Wide Web</title>
  </head>
  <body>
    <p>The World Wide Web, abbreviated as WWW and commonly known ...</p>
  </body>
</html>
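
How a program might read the markup of such a basic page can be sketched with Python's standard html.parser module. The class and function names below are hypothetical, and the page text is an assumed reconstruction of a basic page with one title and one paragraph. A real browser does far more, including building a document tree, applying styles and laying out the text.

```python
from html.parser import HTMLParser

# Assumed sample page: a title and a single paragraph.
PAGE = """<html>
  <head>
    <title>Example.org – The World Wide Web</title>
  </head>
  <body>
    <p>The World Wide Web, abbreviated as WWW and commonly known ...</p>
  </body>
</html>"""

class PageOutline(HTMLParser):
    """Collect the title and paragraph text from a simple page."""

    def __init__(self):
        super().__init__()
        self._current = None   # tag whose text is being collected
        self.title = ""
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "p"):
            self._current = tag
            if tag == "p":
                self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "p":
            self.paragraphs[-1] += data

def outline(html_text):
    parser = PageOutline()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip(), [p.strip() for p in parser.paragraphs]

title, paragraphs = outline(PAGE)
print(title)
print(paragraphs)
```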

<p>The web browser parses the HTML, interpreting the markup (<tt>&lt;title&gt;</tt>, <tt>&lt;p&gt;</tt> for paragraph, and such) that surrounds the words in order to draw the text on the screen.</p>
<p>Many web pages use HTML to reference the URLs of other resources such as images, other embedded media, scripts that affect page behaviour, and Cascading Style Sheets that affect page layout. The browser will make additional HTTP requests to the web server for these other Internet media types. As it receives their content from the web server, the browser progressively renders the page onto the screen as specified by its HTML and these additional resources.</p>
<h3> <span class="mw-headline" id="Linking">Linking</span></h3>
<p>Most web pages contain hyperlinks to other related pages and perhaps to downloadable files, source documents, definitions and other web resources. In the underlying HTML, a hyperlink looks like</p>
<pre>&lt;a href="http://example.org/"&gt;Example.org, a free encyclopedia&lt;/a&gt;</pre>
<div class="thumb tright"> <div class="thumbinner" style="width:222px;"><a class="image" href="../../images/1787/178711.png.htm"><img alt="" class="thumbimage" height="158" src="../../images/1787/178711.png" width="220" /></a><div class="thumbcaption"> Graphic representation of a minute fraction of the WWW, demonstrating hyperlinks</div> </div> </div>
<p>Such a collection of useful, related resources, interconnected via hypertext links, is dubbed a <i>web</i> of information.
Publication on the Internet created what <a href="../../wp/t/Tim_Berners-Lee.htm" title="Tim Berners-Lee">Tim Berners-Lee</a> first called the <i>WorldWideWeb</i> (in its original CamelCase, which was subsequently discarded) in November 1990.</p>
<p>The hyperlink structure of the WWW is described by the webgraph: the nodes of the webgraph correspond to the web pages (or URLs) and the directed edges between them to the hyperlinks.</p>
<p>Over time, many web resources pointed to by hyperlinks disappear, relocate, or are replaced with different content. This makes hyperlinks obsolete, a phenomenon referred to in some circles as link rot; the hyperlinks affected by it are often called dead links. The ephemeral nature of the Web has prompted many efforts to archive web sites. The Internet Archive, active since 1996, is the best known of such efforts.</p>
<h3> <span class="mw-headline" id="Dynamic_updates_of_web_pages">Dynamic updates of web pages</span></h3>
<p>JavaScript is a scripting language that was initially developed in 1995 by Brendan Eich, then of Netscape, for use within web pages. The standardised version is ECMAScript. To make web pages more interactive, some web applications also use JavaScript techniques such as Ajax (asynchronous JavaScript and XML). Client-side script delivered with the page can make additional HTTP requests to the server, either in response to user actions such as mouse movements or clicks, or based on elapsed time. The server's responses are used to modify the current page rather than creating a new page with each response, so the server needs only to provide limited, incremental information. Multiple Ajax requests can be handled at the same time, and users can interact with the page while data is being retrieved.
Web pages may also regularly poll the server to check whether new information is available.</p>
<h3> <span class="mw-headline" id="WWW_prefix">WWW prefix</span></h3>
<p>Many domain names used for the World Wide Web begin with <i>www</i> because of the long-standing practice of naming Internet hosts (servers) according to the services they provide. The hostname for a web server is often <i>www</i>, in the same way that it may be <i>ftp</i> for an FTP server, and <i>news</i> or <i>nntp</i> for a USENET news server. These host names appear as Domain Name System (DNS) subdomain names, as in <tt>www.example.com</tt>. The use of 'www' as a subdomain name is not required by any technical or policy standard and many web sites do not use it; indeed, the first ever web server was called <tt>nxoc01.cern.ch</tt>. According to Paolo Palazzi, who worked at CERN along with Tim Berners-Lee, the popular use of the 'www' subdomain was accidental: the World Wide Web project page was intended to be published at www.cern.ch while info.cern.ch was intended to be the CERN home page; however, the DNS records were never switched, and the practice of prepending 'www' to an institution's website domain name was subsequently copied. Many established websites still use 'www', or they invent other subdomain names such as 'www2', 'secure', etc. Many such web servers are set up so that both the domain root (e.g., example.com) and the <i>www</i> subdomain (e.g., www.example.com) refer to the same site; others require one form or the other, or they may map to different web sites.</p>
<p>The use of a subdomain name is useful for load balancing incoming web traffic by creating a CNAME record that points to a cluster of web servers.
Since, currently, only a subdomain can be used in a CNAME, the same result cannot be achieved by using the bare domain root.</p>
<p>When a user submits an incomplete domain name to a web browser in its address bar input field, some web browsers automatically try adding the prefix "www" to the beginning of it and possibly ".com", ".org" and ".net" at the end, depending on what might be missing. For example, entering 'microsoft' may be transformed to <i>http://www.microsoft.com/</i> and 'openoffice' to <i>http://www.openoffice.org</i>. This feature started appearing in early versions of Mozilla <a href="../../wp/f/Firefox.htm" title="Firefox">Firefox</a>, when it still had the working title 'Firebird' in early 2003, following an earlier practice in browsers such as Lynx. It is reported that Microsoft was granted a US patent for the same idea in 2008, but only for mobile devices.</p>
<p>In English, <i>www</i> is usually read as <i>double-u double-u double-u</i>. Some users pronounce it <i>dub-dub-dub</i>, particularly in New Zealand. Stephen Fry, in his "Podgrammes" series of podcasts, pronounces it <i>wuh wuh wuh</i>. The English writer <a href="../../wp/d/Douglas_Adams.htm" title="Douglas Adams">Douglas Adams</a> once quipped in The Independent on Sunday (1999): "The World Wide Web is the only thing I know of whose shortened form takes three times longer to say than what it's short for". In Mandarin Chinese, <i>World Wide Web</i> is commonly translated via a phono-semantic matching to <i>wàn wéi wǎng</i> (<span lang="zh" xml:lang="zh">万维网</span>), which satisfies <i>www</i> and literally means "myriad dimensional net", a translation that very appropriately reflects the design concept and proliferation of the World Wide Web.
Tim Berners-Lee's web-space states that <i>World Wide Web</i> is officially spelled as three separate words, each capitalised, with no intervening hyphens.</p>
<p>Use of the www prefix is declining as Web 2.0 web applications seek to brand their domain names and make them easily pronounceable. As the mobile web grows in popularity, services like <a class="mw-redirect" href="../../wp/g/Gmail.htm" title="Google Mail">Gmail</a>.com, MySpace.com, Facebook.com, Bebo.com and Twitter.com are most often discussed without adding www to the domain (or, indeed, the .com).</p>
<h3> <span class="mw-headline" id="Scheme_specifiers:_http_and_https">Scheme specifiers: http and https</span></h3>
<p>The scheme specifier <i>http://</i> or <i>https://</i> at the start of a Web URI refers to Hypertext Transfer Protocol or HTTP Secure respectively. Unlike <i>www</i>, which has no specific purpose, these specify the communication protocol to be used for the request and response. The HTTP protocol is fundamental to the operation of the World Wide Web, and the added encryption layer in HTTPS is essential when confidential information such as passwords or banking information is to be exchanged over the public Internet. Web browsers usually also prepend <i>http://</i> to addresses if it is omitted.</p>
<h2> <span class="mw-headline" id="Web_servers">Web servers</span></h2>
<p>The primary function of a web server is to deliver web pages on request to clients. This means delivery of HTML documents and any additional content that may be included by a document, such as images, style sheets and scripts.</p>
<h2> <span class="mw-headline" id="Privacy">Privacy</span></h2>
<p>Every time a web page is requested from a web server the server can identify, and usually it logs, the IP address from which the request arrived. Equally, unless set not to do so, most web browsers record the web pages that have been requested and viewed in a <i>history</i> feature, and usually cache much of the content locally.
Unless HTTPS encryption is used, web requests and responses travel in plain text across the Internet and they can be viewed, recorded and cached by intermediate systems.</p>
<p>When a web page asks for, and the user supplies, personally identifiable information such as their real name, address, e-mail address, etc., then a connection can be made between the current web traffic and that individual. If the website uses <a href="../../wp/h/HTTP_cookie.htm" title="HTTP cookie">HTTP cookies</a>, username and password authentication, or other tracking techniques, then it will be able to relate other web visits, before and after, to the identifiable information provided. In this way it is possible for a web-based organisation to develop and build a profile of the individual people who use its site or sites. It may be able to build a record for an individual that includes information about their leisure activities, their shopping interests, their profession, and other aspects of their demographic profile. These profiles are obviously of potential interest to marketeers, advertisers and others. Depending on the website's terms and conditions and the local laws that apply, information from these profiles may be sold, shared, or passed to other organisations without the user being informed. For many ordinary people, this means little more than some unexpected e-mails in their in-box, or some uncannily relevant advertising on a future web page. For others, it can mean that time spent indulging an unusual interest can result in a deluge of further targeted marketing that may be unwelcome. Law enforcement, counter terrorism and espionage agencies can also identify, target and track individuals based on what appear to be their interests or proclivities on the web.</p>
<p>Social networking sites make a point of trying to get the user to truthfully expose their real names, interests and locations.
This makes the social networking experience more realistic and therefore engaging for all their users. On the other hand, photographs uploaded and unguarded statements made will be identified with the individual, who may regret some decisions to publish these data. Employers, schools, parents and other relatives may be influenced by aspects of social networking profiles that the posting individual did not intend for these audiences. On-line bullies may make use of personal information to harass or stalk users. Modern social networking websites allow fine-grained control of the privacy settings for each individual posting, but these can be complex and not easy to find or use, especially for beginners.</p>
<p>Photographs and videos posted onto websites have caused particular problems, as they can add a person's face to an on-line profile. With modern and potential facial recognition technology, it may then be possible to relate that face with other, previously anonymous, images, events and scenarios that have been imaged elsewhere. Because of image caching, mirroring and copying, it is difficult to remove an image from the World Wide Web.</p>
<h2> <span class="mw-headline" id="Intellectual_property">Intellectual property</span></h2>
<p>The intellectual property rights for any creative work initially rest with its creator. Web users who want to publish their work onto the World Wide Web, however, need to be aware of the details of the way they do it. If artwork, photographs, writings, poems, or technical innovations are published by their creator onto a privately owned web server, then they may choose the copyright and other conditions freely themselves. This is unusual though; more commonly work is uploaded to web sites and servers that are owned by other organizations.
It depends upon the terms and conditions of the site or service provider to what extent the original owner automatically signs over rights to their work by the choice of destination and by the act of uploading.</p>
<p>Many users of the web erroneously assume that everything they may find on-line is freely available to them as if it were in the public domain. This is almost never the case, unless the web site publishing the work clearly states that it is. On the other hand, content owners are aware of this widespread belief, and expect that sooner or later almost everything that is published will probably be used in some capacity somewhere without their permission. Many publishers therefore embed visible or invisible digital watermarks in their media files, sometimes charging users to receive unmarked copies for legitimate use. Digital rights management includes forms of access control technology that further limit the use of digital content even after it has been bought or downloaded.</p>
<h2> <span class="mw-headline" id="Security">Security</span></h2>
<p>The Web has become criminals' preferred pathway for spreading malware. Cybercrime carried out on the Web can include identity theft, fraud, espionage and intelligence gathering. Web-based vulnerabilities now outnumber traditional computer security concerns, and as measured by <a href="../../wp/g/Google.htm" title="Google">Google</a>, about one in ten web pages may contain malicious code. Most Web-based attacks take place on legitimate websites, and most, as measured by Sophos, are hosted in the United States, China and Russia. The most common of all malware threats is SQL injection attacks against websites. Through HTML and URIs the Web was vulnerable to attacks like cross-site scripting (XSS) that came with the introduction of JavaScript and were exacerbated to some degree by Web 2.0 and Ajax web design that favors the use of scripts.
Today, by one estimate, 70% of all websites are open to XSS attacks on their users.</p>
<p>Proposed solutions vary to extremes. Large security vendors like McAfee already design governance and compliance suites to meet post-9/11 regulations, and some, like Finjan, have recommended active real-time inspection of code and all content regardless of its source. Some have argued that for enterprises to see security as a business opportunity rather than a cost centre, "ubiquitous, always-on digital rights management" enforced in the infrastructure by a handful of organizations must replace the hundreds of companies that today secure data and networks. Jonathan Zittrain has said users sharing responsibility for computing safety is far preferable to locking down the Internet.</p>
<h2> <span class="mw-headline" id="Standards">Standards</span></h2>
<p>Many formal standards and other technical specifications and software define the operation of different aspects of the World Wide Web, the Internet, and computer information exchange. Many of the documents are the work of the World Wide Web Consortium (W3C), headed by Berners-Lee, but some are produced by the Internet Engineering Task Force (IETF) and other organizations.</p>
<p>Usually, when web standards are discussed, the following publications are seen as foundational:</p>
<ul> <li>Recommendations for <a class="mw-redirect" href="../../wp/m/Markup_language.htm" title="Markup languages">markup languages</a>, especially HTML and XHTML, from the W3C.
These define the structure and interpretation of hypertext documents.</li>
<li>Recommendations for stylesheets, especially CSS, from the W3C.</li>
<li>Standards for ECMAScript (usually in the form of JavaScript), from Ecma International.</li>
<li>Recommendations for the Document Object Model, from the W3C.</li> </ul>
<p>Additional publications provide definitions of other essential technologies for the World Wide Web, including, but not limited to, the following:</p>
<ul> <li><i>Uniform Resource Identifier</i> (URI), which is a universal system for referencing resources on the Internet, such as hypertext documents and images. URIs, often called URLs, are defined by the IETF's RFC 3986 / STD 66: <i>Uniform Resource Identifier (URI): Generic Syntax</i>, as well as its predecessors and numerous URI scheme-defining RFCs;</li>
<li><i>HyperText Transfer Protocol (HTTP)</i>, especially as defined by RFC 2616: <i>HTTP/1.1</i> and RFC 2617: <i>HTTP Authentication</i>, which specify how the browser and server authenticate each other.</li> </ul>
<h2> <span class="mw-headline" id="Accessibility">Accessibility</span></h2>
<p>There are methods available for accessing the web in alternative mediums and formats, so as to enable use by individuals with disabilities. These disabilities may be visual, auditory, physical, speech related, cognitive, neurological, or some combination thereof. Accessibility features also help others with temporary disabilities like a broken arm, or the aging population as their abilities change. The Web is used for receiving information as well as providing information and interacting with society. The World Wide Web Consortium claims it is essential that the Web be accessible in order to provide equal access and equal opportunity to people with disabilities. Tim Berners-Lee once noted, "The power of the Web is in its universality. Access by everyone regardless of disability is an essential aspect."
Many countries regulate web accessibility as a requirement for websites. International cooperation in the W3C Web Accessibility Initiative led to simple guidelines that web content authors as well as software developers can use to make the Web accessible to persons who may or may not be using assistive technology.</p>
<h2> <span class="mw-headline" id="Internationalization">Internationalization</span></h2>
<p>The W3C Internationalization Activity assures that web technology will work in all languages, scripts, and cultures. Beginning in 2004 or 2005, Unicode gained ground and eventually in December 2007 surpassed both <a href="../../wp/a/ASCII.htm" title="ASCII">ASCII</a> and Western European as the Web's most frequently used character encoding. Originally RFC 3986 allowed resources to be identified by URI in a subset of US-ASCII. RFC 3987 allows more characters (any character in the Universal Character Set), and now a resource can be identified by IRI in any language.</p>
<h2> <span class="mw-headline" id="Statistics">Statistics</span></h2>
<p>Between 2005 and 2010, the number of Web users doubled, and was expected to surpass two billion in 2010. Early studies in 1998 and 1999 estimating the size of the web using capture/recapture methods showed that much of the web was not indexed by search engines and the web was much larger than expected. According to a 2001 study, there were a massive number, over 550 billion, of documents on the Web, mostly in the invisible Web, or Deep Web. A 2002 survey of 2,024 million Web pages determined that by far the most Web content was in the English language: 56.4%; next were pages in German (7.7%), French (5.6%), and Japanese (4.9%). A more recent study, which used Web searches in 75 different languages to sample the Web, determined that there were over 11.5 billion Web pages in the publicly indexable Web as of the end of January 2005. As of March 2009, the indexable web contains at least 25.21 billion pages.
On 25 July 2008, Google software engineers Jesse Alpert and Nissan Hajaj announced that Google Search had discovered one trillion unique URLs. As of May 2009, over 109.5 million domains were in operation. Of these, 74% were commercial or other sites operating in the <code>.com</code> generic top-level domain.</p> <p>Statistics measuring a website's popularity are usually based either on the number of page views or on the associated server 'hits' (file requests) that it receives.</p> <h2> <span class="mw-headline" id="Speed_issues">Speed issues</span></h2> <p>Frustration over congestion issues in the Internet infrastructure and the high latency that results in slow browsing has led to a pejorative name for the World Wide Web: the <i>World Wide Wait</i>. How to speed up the Internet is the subject of ongoing discussion over the use of peering and QoS technologies. Other solutions to reduce the congestion can be found at W3C. Guidelines for Web response times are:</p> <ul> <li>0.1 second (one tenth of a second). Ideal response time. The user does not sense any interruption.</li> <li>1 second. Highest acceptable response time. Download times above 1 second interrupt the user experience.</li> <li>10 seconds. Unacceptable response time. The user experience is interrupted and the user is likely to leave the site or system.</li> </ul> <h2> <span class="mw-headline" id="Caching">Caching</span></h2> <p>If a user revisits a Web page after only a short interval, the page data may not need to be re-obtained from the source Web server. Almost all web browsers cache recently obtained data, usually on the local hard drive. HTTP requests sent by a browser will usually ask only for data that has changed since the last download. If the locally cached data are still current, they will be reused. Caching helps reduce the amount of Web traffic on the Internet.
The decision about expiration is made independently for each downloaded file, whether image, stylesheet, JavaScript, HTML, or other web resource. Thus even on sites with highly dynamic content, many of the basic resources need to be refreshed only occasionally. Web site designers find it worthwhile to collate resources such as CSS data and JavaScript into a few site-wide files so that they can be cached efficiently. This helps reduce page download times and lowers demands on the Web server.</p> <p>There are other components of the Internet that can cache Web content. Corporate and academic firewalls often cache Web resources requested by one user for the benefit of all. (See also caching proxy server.) Some search engines also store cached content from websites. Apart from the facilities built into Web servers that can determine when files have been updated and so need to be re-sent, designers of dynamically generated Web pages can control the HTTP headers sent back to requesting users, so that transient or sensitive pages are not cached. Internet banking and news sites frequently use this facility.
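</p> <p>The header control described above can be illustrated with a small Python sketch. It is not taken from any particular Web server: the function names <code>make_etag</code> and <code>build_response</code>, the one-hour <code>max-age</code>, the truncated hash used as a validator, and the sample pages are all assumptions made for illustration. It also handles only the simplest case, in which the browser sends back exactly one validator in <code>If-None-Match</code>. The header names themselves (<code>Cache-Control</code>, <code>ETag</code>, <code>If-None-Match</code>) and the 304 status code are standard HTTP.</p>

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a validator from the response body (hypothetical helper)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def build_response(body: bytes, request_headers: dict, sensitive: bool = False):
    """Return (status, headers, payload) for a GET request.

    sensitive=True marks a transient or private page that must not be stored.
    """
    if sensitive:
        # no-store asks browsers and shared caches not to keep any copy.
        return 200, {"Cache-Control": "no-store"}, body

    etag = make_etag(body)
    # Assumed policy: the copy may be reused for one hour without asking again.
    headers = {"Cache-Control": "max-age=3600", "ETag": etag}

    # Conditional request: the browser sends back the validator it cached.
    if request_headers.get("If-None-Match") == etag:
        # The cached copy is still current, so no body is sent.
        return 304, headers, b""
    return 200, headers, body


# Hypothetical pages, used only for illustration.
stylesheet = b"body { margin: 0 }"
first = build_response(stylesheet, {})
revisit = build_response(stylesheet, {"If-None-Match": first[1]["ETag"]})
statement = build_response(b"balance: 42", {}, sensitive=True)
print(first[0], revisit[0], statement[1]["Cache-Control"])  # prints: 200 304 no-store
```

<p>In this sketch the first visit receives the full stylesheet, the revisit receives an empty 304 response because the validator still matches, and the sensitive page is always sent in full with an instruction not to store it.</p> <p>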
Data requested with an HTTP 'GET' is likely to be cached if other conditions are met; data obtained in response to a 'POST' is assumed to depend on the data that was POSTed and so is not cached.</p> </div> <div class="printfooter"> Retrieved from " http://en.wikipedia.org/w/index.php?title=World_Wide_Web&oldid=544124550"</div>  <div class="visualClear"> </div> </div> </div> </div> <div class="visualClear"> </div> <div id="sosebar"> <div class="center"> Wikipedia for Schools is 
a selection taken from the original English-language Wikipedia by the child sponsorship charity <a rel="author" href="../../wp/s/Soschildrensvillages.htm">SOS Children</a>. It was created as a <a href="../../wp/w/Wikipedia_For_Schools.htm">checked and child-friendly teaching resource</a> for use in schools in the developing world and beyond. Sources and authors can be found at www.wikipedia.org. See also our <a href="../../disclaimer.htm"><strong>Disclaimer</strong></a>. These articles are available under the <a href="../../wp/w/Wikipedia%253AText_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License.htm"> Creative Commons Attribution Share-Alike Version 3.0 Unported Licence</a>. This article was sourced from http://en.wikipedia.org/?oldid=544124550 . </div> </div> </div> </body> </html>