So what does this article have that the billion others on Google don’t? Optimisations.
Most tutorials do not include optimisations, especially compiler optimisations, which increase program efficiency greatly. Note that you won’t see the effect of these optimisations unless your server handles over 100 requests per second.
Again, the compiler optimisations I use are considered very bad/buggy according to articles on the Internet — but who cares? We want results!
The optimisations used here are highly dependent on the CPU, RAM and CPU cache of the system for which the application is being compiled. gcc -O3 increases the binary size. This requires a large CPU cache (at least larger than what is required by a binary compiled with -O2). If the cache is small, the application's code is repeatedly moved between the CPU cache and DRAM, which slows the application down.
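To judge whether the cache caveat applies to your machine, you can look up its cache sizes before choosing an optimisation level. The following is only a minimal sketch: it assumes a Linux system with glibc, whose getconf knows the LEVEL*_CACHE_SIZE variables; elsewhere (and in some virtual machines) they may be missing or reported as 0, which the sketch prints as 0 KiB.

```shell
#!/bin/sh
# Print the CPU cache sizes reported by getconf, in KiB.
# Assumption: the LEVEL*_CACHE_SIZE variables are a glibc extension; on other
# systems they may be unknown, in which case 0 is printed.
cache_kib() {
    # $1 is a getconf variable name such as LEVEL2_CACHE_SIZE
    bytes=$(getconf "$1" 2>/dev/null) || bytes=0
    case $bytes in
        ''|*[!0-9]*) bytes=0 ;;   # empty or non-numeric output means "unknown"
    esac
    echo $((bytes / 1024))
}

for level in LEVEL1_ICACHE_SIZE LEVEL1_DCACHE_SIZE LEVEL2_CACHE_SIZE LEVEL3_CACHE_SIZE; do
    echo "$level: $(cache_kib "$level") KiB"
done
```

Comparing these numbers with the size of the httpd binary built with -O2 and with -O3 gives a rough idea of whether the larger build is likely to hurt.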
We won’t be using any of the default/official packages for Apache, MySQL or PHP. Those are built for generic processors, and do not perform as well as binaries compiled specifically for your processor.
In this article, we will not cover the installation of Linux itself, since there are many flavours that are used on servers; your server may have one preinstalled. If not, I assume that you know how to deal with installing Linux and any extra packages, when required.
Installation of Apache (httpd)
As mentioned earlier, there are two things that can be tuned in httpd: compiler optimisation and atomic operations. As far as I know, only httpd in the LAMP package has this atomic operations feature, which is used for light-weight thread synchronisation (which must be supported by the CPU, and is the case for most newer processors).
Download the httpd source tarball from the official website. Next, check the integrity of the downloaded file against the checksum published on Apache’s official mirror, to make sure the code doesn’t contain modifications not known to the Apache developers.
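The integrity check can be scripted. The sketch below assumes that a SHA-256 digest is available for the tarball and that sha256sum (GNU coreutils) is installed; verify_sha256 is a helper written for this illustration, and the file names in the usage comment are placeholders, not real release names.

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED_HEX
# Returns 0 when the SHA-256 digest of FILE equals EXPECTED_HEX, 1 otherwise.
# Assumption: sha256sum (GNU coreutils) is installed.
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <file>"; keep only the digest field
    actual=$(sha256sum "$file" 2>/dev/null | awk '{print $1}')
    if [ -n "$actual" ] && [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}

# Usage (placeholder names):
#   verify_sha256 httpd-X.Y.Z.tar.bz2 "$(awk '{print $1}' httpd-X.Y.Z.tar.bz2.sha256)"
```

If the function reports a mismatch, download the tarball again from another mirror rather than building from it.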
Compilation settings in the environment
The compilation method for almost every C program distributed for the Linux family is to use GNU autotools. Yes, there are some variations (in our case, the latest release of MySQL uses cmake), but we’ll deal with that in the article on installing MySQL. Usually, cmake is used for applications written in C++, but it is not restricted to these. The configure script generated when developers use GNU autotools recognises certain environment variables whose values are passed to the relevant tools: the compiler, the linker, etc.
The CFLAGS environment variable can contain extra options that need to be passed to the C compiler. We set this as follows:
$ export CFLAGS="-O3 -march=native -mtune=native"
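It can be useful to see what "native" actually resolves to on the build machine. The sketch below assumes GCC; -Q --help=target asks gcc to list the target options it would use with the given flags. Keep in mind that a binary built with -march=native may not run on a machine with a different CPU.

```shell
#!/bin/sh
# Same flags as above; exported so that ./configure and make inherit them.
CFLAGS="-O3 -march=native -mtune=native"
export CFLAGS

# Show what "native" resolves to on this machine (GCC only).
if command -v gcc >/dev/null 2>&1; then
    # $CFLAGS is deliberately unquoted so that it splits into separate options
    gcc $CFLAGS -Q --help=target 2>/dev/null | grep -E -- '-(march|mtune)=' || true
else
    echo "gcc not found; skipping the -march=native check" >&2
fi
```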
Configuration
$ ./configure --enable-mods-shared="all ssl cache proxy authn_alias mem_cache file_cache charset_lite dav_lock disk_cache" --enable-nonportable-atomics=yes
Here, we run the configure script that’s distributed with the source tree in the tarball, to configure the source tree. The --enable-mods-shared=... parameter is to compile the listed modules as Dynamic Shared Objects (DSOs), so that they can be enabled and disabled at will, after installation, by simply adding/removing a line in the configuration file.
The --enable-mods-shared=all doesn’t actually do this for all modules; some are left out, and I have named those here.
In the above command, I have omitted two modules, ldap and authnz_ldap, which enable LDAP-based authentication. You should add the --with-ldap flag to the command in case you add these to the module list.
Once the configure script has run successfully (after eliminating any errors), proceed to compilation.
Compilation
$ make
If you have a multi-core processor (dual-core or more) you can speed up the compilation by running it in parallel; instead of just running make, run make -jN, where N is the number of cores you have.
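Rather than counting cores by hand, N can be looked up at build time. This is a minimal sketch: nproc comes with GNU coreutils, getconf _NPROCESSORS_ONLN is used as a fallback, and build_httpd is a hypothetical helper name.

```shell
#!/bin/sh
# Work out the number of online CPU cores to use as N in "make -jN".
JOBS=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "Building with: make -j$JOBS"

# Hypothetical helper: call it from the top of the httpd source tree,
# after ./configure has completed.
build_httpd() {
    make -j"$JOBS"
}
```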
Installation
The last step, installation (copying the compiled binaries to their respective locations), is again made easy by the make tool:
$ sudo make install
or,
$ su -c 'make install'
Installing files to their default locations, that is, /usr/local/apache2, will require root permissions. If you have sudo (installed on every Ubuntu machine) you can use that, or if not, you will have to use su -c, in which case you need to know the root password.
You can issue the above command without sudo or su -c in case you supplied a path to the configure script (--prefix=PATH) that is writeable by your account.
So, now that you have installed Apache, let’s move on to tweaking its configuration.
Apache configuration files
Apache uses a specific configuration file syntax, commonly found in many Linux applications.
The syntax includes certain directives, like EnableSendFile On, and some sections that resemble a kind of mark-up language, like what follows:
<Directory /path/to/directory>
    # Directory configuration
</Directory>
The # denotes a comment, like in shell scripts.
If you configured with --prefix=/usr/local/apache2 (which is the default, if you don’t specify this option), Apache places your configuration files in /usr/local/apache2/conf. If you passed any custom path prefix to the configure script, or a custom location for the configuration directory (--sysconfdir), then your configuration files will be in the conf subdirectory of your custom prefix path, or in the custom location you’ve given. We will assume the default location.
Apart from the main configuration files in /usr/local/apache2/conf, it is possible to have directory-specific configuration placed inside the directories themselves. These do not contain a large amount of configuration, but are quite useful in case you want to change one single setting for one directory, when it makes no sense to add a <Directory> section to the main configuration file and restart/reload Apache. These files are named .htaccess, and it is possible to allow or disallow their use.
Here, I’ll discuss only the main aspects of the configuration file — the important performance- and security-related options, along with certain basic ones. For all other options, you can find comprehensive documentation at the official Apache documentation website.
Directives, their usage and effects
Listen
This directive tells Apache which ports and IP addresses to listen on for requests. Many people miss this during configuration of virtual hosts, and particularly SSL hosts. Unnecessarily adding Listen directives will cause Apache to listen on those ports, causing a security threat.
The syntax is:
Listen [<ip-address>:]<port>
For example:
Listen 80
Listen 11.22.33.44:80
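To spot Listen directives that were added and then forgotten, you can list every active one across the configuration files. A minimal sketch, assuming the default configuration directory used in this article; list_listen is a helper name made up for the illustration.

```shell
#!/bin/sh
# list_listen CONF_DIR
# Print every active (uncommented) Listen directive found under CONF_DIR,
# prefixed with the file name and line number it came from.
# Directive names are case-insensitive in Apache, hence -i.
list_listen() {
    grep -rniE '^[[:space:]]*Listen[[:space:]]' "$1"
}

# Usage:
#   list_listen /usr/local/apache2/conf
```

Any port in the output that no virtual host actually serves is a candidate for removal.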
User and Group
These two control which user and group Apache runs as. This is very important for the security of the server. If you tell Apache to run as the root user and group, you are giving the Web server the power of the systems administrator! It makes no sense to do that, unless you are a developer and are testing something. Never do this.
The ideal value for user and group is www:www. With this, you can control which files Apache can write to, which files it has read-only access to, and so on, without ever needing to do a chmod 777! Example:
User www
Group www
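If a document tree has already been through a few rounds of chmod 777, it helps to find what was left world-writable before tightening permissions for the www user. This is only a sketch; world_writable is a hypothetical helper, and the path in the usage comment assumes the default prefix.

```shell
#!/bin/sh
# world_writable DIR
# List files and directories under DIR that any user can write to
# (the situation a blanket "chmod 777" creates). Symbolic links are
# skipped because their own permission bits are not meaningful.
world_writable() {
    find "$1" ! -type l -perm -0002
}

# Usage:
#   world_writable /usr/local/apache2/htdocs
```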
Timeout
This is the time period that Apache should wait before dropping a request due to a very long send or receive operation.
For a more precise definition (taken from the Apache manual), Timeout controls the following:
- The total time Apache takes to receive a GET request.
- The amount of time between the receipt of TCP packets on a POST or PUT request.
- The amount of time between ACKs on transmissions of TCP packets, in responses.
The default value for Timeout is 300 seconds (specified as Timeout 300). At the time of writing, there’s no way to configure separate timeouts for each element mentioned above.
KeepAlive
This option specifies if Apache should allow multiple requests per connection (i.e., persistent connections, when KeepAlive On is specified). This should be kept on for performance reasons; otherwise, every new request, even one of several from the same client, needs a new connection, which has a lot of overhead.
As a simple analogy, consider a group of 100 people waiting to enter through the door of some institution. If the watchman closes and opens the door once for each person, it would take much more time and effort, compared to letting all 100 people enter at one time.
MaxKeepAliveRequests
This option specifies the maximum number of requests that will be permitted on a single connection (if enabled using KeepAlive), before Apache closes the connection, so that the next request from the same client needs a new one. This number is better left high, for performance reasons. The default value is 100; a value of 0 means “unlimited”. For example, you could specify MaxKeepAliveRequests 1000.
KeepAliveTimeout
This option specifies the maximum time (in seconds) that Apache should wait for a subsequent request from the same client, on an existing persistent connection, before freeing up the process for another client. Continuing with our simple analogy, this is the maximum time the watchman should keep the gate open, waiting for the arrival of another person who also wishes to enter.
This value should be between 5 and 15. Setting it to less than 5 makes no sense, since a browser could easily request another file from the server within 5 seconds; setting it higher than 15 seconds would unnecessarily tie up an Apache process, which could be used in handling another client.
Example:
KeepAliveTimeout 5
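The three persistent-connection directives are usually set together. The sketch below writes them to a separate file that httpd.conf can pull in with an Include line; write_keepalive_conf and keepalive.conf are names invented for this illustration, and the values are simply the examples used in this article.

```shell
#!/bin/sh
# write_keepalive_conf FILE
# Write the persistent-connection settings discussed above to FILE.
# The values are this article's examples, not universal recommendations.
write_keepalive_conf() {
    cat > "$1" <<'EOF'
KeepAlive On
MaxKeepAliveRequests 1000
KeepAliveTimeout 5
EOF
}

# Usage (hypothetical file name):
#   write_keepalive_conf /usr/local/apache2/conf/keepalive.conf
# then add "Include conf/keepalive.conf" to httpd.conf and check the syntax
# with "apachectl configtest" before restarting.
```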
AccessFileName
This option specifies the name of the file that must be read, if present in a directory, to apply directory-specific options. As mentioned earlier, the name is usually .htaccess (i.e., AccessFileName .htaccess).
There are performance considerations with the use of these configuration files. If you have set any of the options permitted by the AllowOverride directive (see below) in the relevant <Directory> section, Apache will check for .htaccess files in every directory from the top-most one specified in <Directory> down to the last directory (in which the requested file resides).
This causes a lot of overhead, and is not recommended for a server receiving a large number of requests per second. A workaround, if you really want the .htaccess feature (as is the case with shared hosts), is to have a specific <Directory> section for the directory where all your documents reside, instead of enabling that option in the parent directory.
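To get a feel for the overhead, the lookups for a single request can be enumerated. This is an illustrative sketch of the behaviour described above, not Apache's own code: htaccess_candidates is a made-up helper, and it assumes the requested file lies below the base directory and that neither path has a trailing slash.

```shell
#!/bin/sh
# htaccess_candidates BASE REQUESTED_FILE
# Print the .htaccess paths that would be looked up for REQUESTED_FILE when
# overrides are enabled from BASE downwards: one per directory level.
htaccess_candidates() {
    base=$1
    dir=$(dirname "$2")
    rel=${dir#"$base"}          # directory of the file, relative to BASE
    current=$base
    echo "$current/.htaccess"
    # Split the relative path on "/" and descend one level at a time
    old_ifs=$IFS
    IFS=/
    for part in $rel; do
        [ -n "$part" ] || continue
        current=$current/$part
        echo "$current/.htaccess"
    done
    IFS=$old_ifs
}

# Usage:
#   htaccess_candidates /usr/local/apache2/htdocs /usr/local/apache2/htdocs/a/b/index.html
```

Each printed path is a file system check on every request for that file, which is why enabling overrides as deep in the tree as possible keeps the list short.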
AllowOverride
This directive lets you permit certain configuration options in files of the name given in AccessFileName, according to the values specified in this directive. It can appear only in <Directory> sections specified without regular expressions (not containing ~), and nowhere else.
The values it takes, and their effects, are given in the following table, which is taken from the official Apache documentation.
AllowOverride directive values
Directive value | Effect |
AuthConfig | Allows directives related to authentication and authorisation |
FileInfo | Allows directives controlling document types |
Indexes | Allows directives controlling directory indexing, i.e., listing of files when a directory is requested |
Limit | Permits the use of Allow, Deny and Order directives |
Options[=Option,…] | Allows directives controlling specific directory features |
HostnameLookups
Enabling this option gives you hostnames in the access log, instead of just IP addresses. This is a big performance hit, since every time a client initiates a connection, Apache will send a request to the DNS server to convert the IP address of the client to a hostname.
Disable this (HostnameLookups Off). If you want hostnames in the log file, use the logresolve utility provided with Apache instead, to resolve IP addresses when you are reading the log.
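logresolve works as a filter: it reads a log on standard input and writes the resolved version to standard output. A minimal sketch, assuming the default installation prefix; resolve_log is a hypothetical wrapper, and the LOGRESOLVE variable only exists so that the path can be overridden.

```shell
#!/bin/sh
# resolve_log ACCESS_LOG
# Print ACCESS_LOG with the leading IP address of each line replaced by its
# hostname, using Apache's logresolve.
# Assumption: logresolve lives in the default prefix's bin directory.
LOGRESOLVE=${LOGRESOLVE:-/usr/local/apache2/bin/logresolve}

resolve_log() {
    "$LOGRESOLVE" < "$1"
}

# Usage:
#   resolve_log /usr/local/apache2/logs/access_log > access_log.resolved
```

Because the lookups happen offline, they cost nothing while Apache is serving requests.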
Configuration sections
As I told you earlier, the configuration file contains certain sections that resemble HTML markup. Here, I list and explain them.
IfModule
This checks if a particular module is loaded into memory (enabled) or not. It basically tells Apache to parse the configuration inside the section only if the module is loaded; else, to skip it. For example:
<IfModule fcgid_module>
    AddHandler fcgid-script .php
</IfModule>
<IfModule !fcgid_module>
    <IfModule fastcgi_module>
        AddHandler fastcgi-script .php
    </IfModule>
</IfModule>
IfModule sections can be nested, as seen above. The bang in <IfModule !module-identifier> is used to negate, i.e., to include the configuration section if the module is not loaded.
IfDefine
Checks if a particular parameter was defined while starting Apache, and is usually used to load modules according to the startup command, eliminating the need to modify configuration files every now and then. For example:
<IfDefine Rewrite>
    LoadModule rewrite_module modules/mod_rewrite.so
</IfDefine>
In this example, the rewrite module will be loaded if Apache was launched with the command:
apachectl -D Rewrite -k start
… or:
httpd -D Rewrite -k start
Directory, Files, FilesMatch, Location, and LocationMatch
These directives are used to control configuration related to the particular elements — Directory, Files, and Location. The difference between them is that Directory will match a physical directory but not a symlink (symbolic link). Files will also match symlinks.
Location has no limitations — it will match a file, location, symlink, alias, etc.
The regular-expression variants of these have ‘Match’ appended to the directive name. FilesMatch is the regular-expression variant of Files, and so on. It is possible to use simple shell wildcards like * and ? in the non regular-expression variants, as follows:
<Files "*.html">
    # configuration
</Files>
There is no need to use the regular-expression variants to match shell wildcards, since those variants have more processing overheads, and slow down Apache.
The most important directive in relation to this is the Options directive. It controls what features should be enabled, disabled, permitted or not permitted for all the elements matched by the enclosing section (files or directories, as applicable). Read the Apache official documentation for more information on this.
In the next part, we’ll cover MySQL.
I would not recommend compiling Apache on your own. Don’t bother about Optimisations. You could always double or quadruple performance by clustering Apache servers.
The problem I see with your method is maintenance. A web server needs to be secure. Let the distribution vendor handle compiling Apache and all the dependencies (modules, php, etc etc). How many of the busy sysadmins have time to recompile all the apache and dependencies with each vuln or patch released?
There is a distro called Gentoo which takes care of that. I use Gentoo on all my servers. Also, adding more servers means increasing cost.