INFO 321 Server Technologies II, Weeks 5-6
Apache
Apache is synonymous with a web server app, but the Apache HTTP Server is just one project of the ten-year-old Apache Software Foundation (ASF).
There are dozens of Foundation projects.
They state “We consider ourselves not simply a group of projects sharing a server, but rather a community of developers and users.”
Material from http://httpd.apache.org/ and notes by Dr. Randy Kaplan
Overview
This set of notes is divided into these sections:
Web Server functionality
Choosing a web server
Installing Apache
Running Apache
Virtual Hosting
Authentication
Indexing
Alias and Redirect
Proxying
Web Server functionality Weeks 5-6 4
Web Server protocols
The main purpose of a web server is to handle HTTP and related protocols:
DNS
FTP
HTTPS
Gopher, Telnet, etc. are also possible.
For more info on these protocols, see the chapter 2 notes for INFO 330.
Web Server protocols DNS uses UDP as its transport layer protocol Connectionless, unreliable The other protocols use TCP for transport Connection oriented between host computers Reliable All protocols work by passing text messages back and forth Weeks 5-6 6
Web Server Wish List Run fast Handle lots of requests with minimal hardware Support multitasking Deal with more than one request at a time Need to maintain workload without shutting the server down Authenticate requestors Weeks 5-6 7
Web Server Wish List Respond to errors in the messages it gets, and tell what is going on Negotiate a style and language of response with the requestor Support a variety of formats Run as a proxy server Be secure Weeks 5-6 8
What Does a Web Server Do?
Translate a URL into a file name or a program name.
If a file: return the file over the Internet.
If a program: run the program, and send the output back over the Internet.
URL = Uniform Resource Locator. It has three parts: <scheme>://<host>/<path>
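The three-part URL structure can be sketched in shell. This is only an illustration of the <scheme>://<host>/<path> split, not how Apache itself parses requests; split_url is a hypothetical helper name, and the sketch assumes a simple URL with no port number, query string, or user information.

```shell
# split_url (hypothetical helper): print scheme, host and path of a URL
# of the form <scheme>://<host>/<path>, separated by spaces.
split_url() {
    url=$1
    scheme=${url%%://*}            # text before the first "://"
    rest=${url#*://}               # text after the first "://"
    host=${rest%%/*}               # text up to the first "/"
    case $rest in
        */*) path=/${rest#*/} ;;   # everything from the first "/" onward
        *)   path=/ ;;             # no path given: assume the root
    esac
    printf '%s %s %s\n' "$scheme" "$host" "$path"
}

split_url "http://httpd.apache.org/download.cgi"
```

The server's job then starts from the last field: it maps the path onto a file or a program under the site directory.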
How Does Apache Work? Runs under a suitable multitasking operating system Binary is called httpd under Unix Binary is called apache.exe under Win32 Each copy of httpd or apache.exe has its attention directed at a web site For our purposes, the web site is a directory Weeks 5-6 10
Apache and TCP/IP
A computer has a connection to the outside world, called an interface.
A service on that interface is identified by a port number; an IP address together with a port number forms a socket.
The server can tell requests for different services apart because the four-byte (32-bit) IPv4 address that leads a request to its interface is followed by a two-byte (16-bit) port number.
Apache and TCP/IP Requests arrive on an interface for a number of different services offered by the server using different protocols Network News Transfer Protocol (NNTP) Simple Mail Transfer Protocol (SMTP) Domain Name Service (DNS) HTTP (WWW) Weeks 5-6 12
Apache and TCP/IP Different services attach to different ports NNTP: port number 119 SMTP: port number 25 DNS: port number 53 HTTP: port number 80 Weeks 5-6 13
Apache and TCP/IP UNIX/Linux Port numbers below 1024 can only be used by the superuser (root) Prevents other users from running programs masquerading as standard services Win32 Under Win32 there is currently no security directly related to port numbers and no superuser Weeks 5-6 14
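As a small illustration of the below-1024 rule, the sketch below classifies the port numbers from the previous slides. needs_root is a hypothetical helper name, and port 8080 is added here only as an example of a high-numbered port; it does not come from the notes.

```shell
# needs_root (hypothetical helper): succeeds if the port number is below
# 1024, the range the notes say only the superuser may use on UNIX/Linux.
needs_root() {
    [ "$1" -lt 1024 ]
}

# 25 = SMTP, 53 = DNS, 80 = HTTP, 119 = NNTP; 8080 is an assumed example.
for port in 25 53 80 119 8080; do
    if needs_root "$port"; then
        echo "port $port: reserved for root"
    else
        echo "port $port: usable by an ordinary user"
    fi
done
```

This is why a test server run from an ordinary account is usually pointed at a high port rather than port 80.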
How Does Apache Work? Idling state – Listens to the IP addresses specified in its config files (important foreshadowing…) When a request appears – Apache receives it and analyzes the headers Applies the rules in the config file Takes the appropriate action Weeks 5-6 15
How HTTP Clients Work When a URL (beginning http://) is sent to a browser, The browser reads ‘http:’ and determines it should be using the HTTP protocol to communicate with web servers A name server (DNS) is contacted to translate the host name in a URL to an IP address Weeks 5-6 16
Apache and Domain Servers
It is the role of DNS (the Domain Name System) to translate a human-readable (and memorable) host name into a computer’s telephone number (its IP address).
DNS Errors Suppose Apache is given a URL which does not have a trailing / Apache will add a trailing / and try to access the URL again (called redirection) Then use DNS to resolve the IP address Weeks 5-6 18
Handling Multiple Web Sites The utility ifconfig binds IP addresses to physical interfaces (e.g. Ethernet ports) ifconfig also allows binding multiple IP addresses to a single interface A client can switch from one IP address to another while maintaining service This is known as IP Aliasing Weeks 5-6 19
Choosing a web server Weeks 5-6 20
Why choose Apache? Apache has been the dominant web server app since 1996 Open source enables its source code to be examined by thousands of eyes Substantially more reliable Apache is extensible Apache is freeware Weeks 5-6 21
Other choices Other web server apps include Microsoft IIS or PWS Google GWS Lighttpd Zeus ZWS nginx Sun (includes Netscape and Netsite variants) Weeks 5-6 22
Apache market share
Apache has been the leading web server since March 1996, but is losing ground.
According to Netcraft surveys:
In November 2005, Apache supported 71 percent of domains, more than 50 percentage points ahead of Microsoft IIS (20.2 percent) (N = 74.6 million).
By June 2009, Apache had 47.12%, versus 24.80% for Windows (IIS and PWS), of the 238 million domains reporting.
Apache as in Indian? “The name 'Apache' was chosen from respect for the Native American Indian tribe of Apache (Indé), well-known for their superior skills in warfare strategy and their inexhaustible endurance.” (Apache FAQ) Weeks 5-6 24
Apache version & platforms Apache is on version 2.2.17 (released Oct 19, 2010) and changes slowly Most Linux distributions are a little behind the current release Old releases (2.0.x and 1.3.x) are maintained Apache runs on 32-bit Windows flavors, UNIX/Linux, and even NetWare (!) Weeks 5-6 25
Installing Apache Weeks 5-6 26
Apache prereqs
To install Apache, you need:
An Internet connection (it helps)
Disk space: 50 MB to install, about 10 MB to run, depending on options
An ANSI-C compiler, such as the GNU C compiler (GCC) from the Free Software Foundation (FSF)
The Windows version can be obtained in .exe form
Apache prereqs
Accurate time keeping, such as the ntpdate or xntpd programs. Some parts of HTTP are based on time of day, so some form of NTP support is needed.
Perl5 is needed for a few options.
The utilities apr and apr-util need to be version 1.2. Upgrade them separately if needed, but they are included with the Apache source code.
Overview – Apache install
Download
  $ lynx http://httpd.apache.org/download.cgi
Extract
  $ gzip -d httpd-NN.tar.gz
  $ tar xvf httpd-NN.tar
  $ cd httpd-NN
Configure
  $ ./configure --prefix=PREFIX
Overview – Apache install
Compile
  $ make
Install
  $ make install
Customize
  $ vi PREFIX/conf/httpd.conf
Test
  $ PREFIX/bin/apachectl -k start
Overview – Apache install NN must be replaced with the current version number (e.g. 2.2.17) PREFIX must be replaced with the file system path under which the server should be installed If PREFIX is not specified, it defaults to /usr/local/apache2 Weeks 5-6 31
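To make the NN and PREFIX substitution concrete, here is a dry-run sketch that only prints the commands from the overview with both placeholders filled in; it executes none of them. install_steps is a hypothetical helper name, and the default PREFIX is the one given in the notes.

```shell
# install_steps (hypothetical helper): print, without running them, the
# install commands for version $1 under prefix $2.
install_steps() {
    nn=$1
    prefix=${2:-/usr/local/apache2}   # default PREFIX per the notes
    cat <<EOF
gzip -d httpd-$nn.tar.gz
tar xvf httpd-$nn.tar
cd httpd-$nn
./configure --prefix=$prefix
make
make install
$prefix/bin/apachectl -k start
EOF
}

install_steps 2.2.17
```

Passing a second argument, for example install_steps 2.2.17 /opt/apache (an assumed path), shows how every later step follows the chosen PREFIX.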
Download
Most UNIX/Linux users will want to download Apache and compile it locally.
After download, use PGP to verify the download’s integrity, e.g.
  % pgp -ka KEYS
  % pgp apache_1.3.24.tar.gz.asc
This verifies against the MD5 or PGP message digest ASCII file.
Extract
This set of steps decompresses the tarball, extracts the tarball, and changes to the source code directory:
  $ gzip -d httpd-NN.tar.gz
  $ tar xvf httpd-NN.tar
  $ cd httpd-NN
Notice this is using the tar command we saw in the Backup section.
Configure Now things get messy! The basic configure script, if you’re using the default PREFIX, can be run using $ ./configure The configure script allows you to select which features are active on your host You can also change where specific files are installed, for example Weeks 5-6 34
Apache architecture Apache is a modular server This implies that only the most basic functionality is included in the ‘core’ server Even core functionality can be disabled Extended features are available through modules which can be loaded into Apache Weeks 5-6 35
Apache architecture By default, a base set of modules is included in the server at compile-time If the server is compiled to use dynamically loaded modules, then modules can be compiled separately and added at any time using the LoadModule directive Otherwise, Apache must be recompiled to add or remove modules Weeks 5-6 36
Some types of module status Base A module having "Base" status is compiled and loaded into the server by default Extension A module with "Extension" status is not normally compiled and loaded into the server; to enable the module and its functionality, you need to change the server build configuration files and re-compile Apache External Modules which are not included with the base Apache distribution ("third-party modules") may use the "External" status Weeks 5-6 37
Apache architecture Apache terminology note: Features are implemented by modules, which are installed or not with your copy of Apache Once installed, they can be enabled or disabled to allow them to run or not Dozens of modules are enabled by default, so you’d have to explicitly disable them The most dangerous one is --disable-http Weeks 5-6 38
Apache architecture
Likewise, many modules are disabled by default, so you have to enable them explicitly.
For example, --enable-ssl enables support for SSL/TLS provided by mod_ssl.
Be very careful: misspelled features are ignored, without an error message! For example, --enable-sssl will do nothing.
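Since a misspelled option is silently ignored, one possible safeguard is to check each option against the text printed by ./configure --help before running the real build. The sketch below is an assumption about workflow, not something the notes prescribe. flag_listed is a hypothetical helper name, and sample_help is a made-up stand-in for real help output so the example runs without the Apache source tree.

```shell
# flag_listed (hypothetical helper): read help text on standard input and
# succeed only if the given option appears in it as a whole word.
flag_listed() {
    grep -q -F -w -e "$1"
}

# Made-up stand-in for the output of ./configure --help.
sample_help='  --enable-ssl            SSL/TLS support (mod_ssl)
  --disable-http          disable HTTP protocol handling'

for flag in --enable-ssl --enable-sssl; do
    if printf '%s\n' "$sample_help" | flag_listed "$flag"; then
        echo "$flag: listed in the help text"
    else
        echo "$flag: not listed, check the spelling"
    fi
done
```

With the real source tree the check would read ./configure --help | flag_listed --enable-ssl. Help text often shows only one of the --enable/--disable forms for a feature, so a "not listed" result is a prompt to look closer, not proof of a typo.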
Configure script vs. file KEY POINT: Apache has a configure script which enables modules ./configure And a configuration file (or several) which contain directives PREFIX/conf/httpd.conf Both are very important and powerful tools, but are completely separate! Weeks 5-6 40
Configure
The general syntax for enabling and disabling is:
--disable-FEATURE
  Do not include FEATURE; this is the same as --enable-FEATURE=no
--enable-FEATURE[=ARG]
  Include FEATURE; the default value for ARG is yes
Configure
Less often used enabling options include:
--enable-MODULE=shared
  The corresponding module will be built as a DSO (dynamic shared object) module; it will be enabled if you use the --enable-mods-shared option
--enable-MODULE=static
  By default, enabled modules are linked statically; you can force this explicitly
Packages The configure script can invoke packages, which are typically third party features --with-PACKAGE[=ARG] Use the package PACKAGE; the default value for ARG is yes Often these tell where to find specific libraries or databases Weeks 5-6 43
Environment variables The configure script can also set environment variables These mostly describe what C compiler or flags to use, or the location of compile libraries Weeks 5-6 44
./configure summary So the Apache configure script controls which modules are enabled or not When an ISP tells you they support SSL, Perl, etc., they are implying which modules they installed (if they’re using Apache) Weeks 5-6 45
Build and Install $ make $ make install These are the traditional Unix commands to build and install an app They’ll take a while, especially make, since it includes compiling all the source code Weeks 5-6 46
Customize
The file PREFIX/conf/httpd.conf is a customization focal point for Apache.
Apache is configured by placing directives in plain text configuration files.
Apache configuration files contain one directive per line.
httpd.conf is the main file, but other config files can be linked from it via an Include directive.
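Because configuration files hold one directive per line, the active directives can be pulled out with a simple filter. The sketch below is only an illustration: list_directives is a hypothetical helper name, and the file it reads is a tiny made-up stand-in for PREFIX/conf/httpd.conf (ServerRoot, Listen and Include are real directives, but the values are assumed). It treats lines starting with # as comments and does not handle directives continued across lines.

```shell
# list_directives (hypothetical helper): print the directive lines of a
# config file, skipping blank lines and lines that begin with "#".
list_directives() {
    grep -v -E '^[[:space:]]*(#|$)' "$1"
}

# Tiny made-up stand-in for PREFIX/conf/httpd.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Main server configuration (sample values only)
ServerRoot "/usr/local/apache2"
Listen 80

Include conf/extra/httpd-vhosts.conf
EOF

list_directives "$conf"
rm -f "$conf"
```

The Include line in the sample shows how the main file can pull in a second configuration file.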
Apache configuration
The webmaster’s main control over Apache is through the config file.
The webmaster has 412 directives at their disposal.
We’ll get to this soon… No, not all of them :)
Apache directory structure
First steps: in Apache, what exactly is a “web site”?
A web site is a directory somewhere on the server.
Every Apache web site directory contains at least three (and maybe a fourth) subdirectories.
Apache directory structure
Regardless of OS, a site directory has:
conf
  Contains the important configuration file httpd.conf
htdocs
  Contains the HTML documents, images, data and other files to be served up to the site’s clients
These directories and subdirectories, the web space, are accessible to anyone on the Web.