enabling mod_rewrite on apache.

on a stock apache installation, mod_rewrite doesn’t work out of the box.
i’ve determined that in all the cases i’ve experienced, it’s because AllowOverride All is not specified (by default, it’s AllowOverride None), so rewrite rules in .htaccess files are simply ignored.
here are other troubleshooting steps to consider (credit to jdMorgan from webmasterworld.com):

  • LoadModule rewrite_module modules/mod_rewrite.so
  • AddModule mod_rewrite.c (apache 1.3 only)
  • AllowOverride FileInfo Options -or-
  • AllowOverride All
  • Options +FollowSymLinks -or-
  • Options +SymLinksIfOwnerMatch -or-
  • Options All
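
the first two checks in the list can be scripted. here’s a minimal sketch, assuming your directives live in a single config file; check_rewrite_config is a hypothetical helper name of mine, and the path in the usage line is an assumption that varies by distribution.

```shell
# check_rewrite_config is a hypothetical helper, not an apache tool.
# It greps ONE config file for the two most common culprits:
# a missing LoadModule line and an AllowOverride None.
check_rewrite_config() {
    conf="$1"
    if ! grep -Eiq '^[[:space:]]*LoadModule[[:space:]]+rewrite_module' "$conf"; then
        echo "no LoadModule line for rewrite_module"
    fi
    if grep -Eiq '^[[:space:]]*AllowOverride[[:space:]]+None' "$conf"; then
        echo "AllowOverride None is set"
    fi
}

# usage (assumed path, adjust for your system):
# check_rewrite_config /etc/httpd/conf/httpd.conf
```

it only looks at the file you hand it, so directives pulled in from included files won’t be seen.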

    paranoid iptables: block that IP range for good.

    as long as your iptables rules are saved regularly, this command is pretty useful for those IPs that just seem to linger and never go away. i have this problem with IPs in korea.
    as such, i’ve implemented the following “paranoid” iptables rule which i consider pretty helpful to keep them out for good:
    # iptables -t nat -I PREROUTING 1 -s 222.122.0.0/16 -j DROP
    simply put, this bans the entire 222.122.x.x subnet on the NAT table and prevents any packets from coming in.
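
in CIDR notation, the whole 222.122.x.x range is 222.122.0.0/16. here’s a small sketch that builds such a rule for any /16 and prints it instead of running it. block_slash16 is a hypothetical helper of mine, and the sketch assumes the filter table’s INPUT chain rather than the nat table, since, as far as i know, some newer iptables releases refuse DROP targets in nat.

```shell
# block_slash16 is a hypothetical helper: it only PRINTS the rule,
# so you can review it before pasting it into a root shell.
# Assumption: filter table INPUT chain, not the nat table used above.
block_slash16() {
    prefix="$1"   # first two octets, e.g. 222.122
    case "$prefix" in
        ''|*[!0-9.]*)
            echo "usage: block_slash16 A.B (for example 222.122)" >&2
            return 1
            ;;
    esac
    printf 'iptables -I INPUT 1 -s %s.0.0/16 -j DROP\n' "$prefix"
}

block_slash16 222.122
```

printing first is deliberate: a typo in a DROP rule is an easy way to lock yourself out.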

    port utilization checkup.

    i run nmap on localhost on a nightly basis and compare the results (which are emailed to me) against the previous night’s. this way, i can tell if something happened at a certain time if a new port mysteriously opens itself.
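
the nightly comparison can be sketched roughly like this, assuming each night’s scan was saved as plain nmap output (lines such as "22/tcp open ssh"). new_open_ports and the file names are hypothetical; this is not the exact script i run.

```shell
# new_open_ports OLD NEW
# Prints ports that are open in NEW but were not open in OLD.
# Assumes normal nmap output, e.g. "6010/tcp open x11".
new_open_ports() {
    old_list=$(mktemp)
    new_list=$(mktemp)
    awk '$1 ~ /^[0-9]+\/(tcp|udp)$/ && $2 == "open" { print $1 }' "$1" | sort > "$old_list"
    awk '$1 ~ /^[0-9]+\/(tcp|udp)$/ && $2 == "open" { print $1 }' "$2" | sort > "$new_list"
    # comm -13 keeps only the lines unique to the second file
    comm -13 "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}

# usage (hypothetical file names):
# new_open_ports scan-yesterday.txt scan-today.txt
```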
    today, i found port 6010 open. i investigated what was using it by running the following useful commands, which i am posting here for reference:
    # /usr/sbin/lsof -i TCP:6010
    sshd 21176 user 9u IPv4 13084094 TCP localhost:x11-ssh-offset (LISTEN)

    my guess is he was using X11 forwarding over ssh, which opens an additional port.
    i further broke this down by looking into the following:
    # /sbin/fuser -name tcp 6010
    here: 6010
    6010/tcp: 24345

    this indicated that process ID (pid) 24345 was doing something funny.
    so i looked into the pid:
    # /usr/sbin/lsof -p 24345
    sshd 24345 user cwd DIR 8,5 4096 2 /
    sshd 24345 user rtd DIR 8,5 4096 2 /
    sshd 24345 user txt REG 8,5 309200 20922628 /usr/sbin/sshd
    sshd 24345 user mem REG 8,5 941024 23234362 /lib/libcrypto.so.0.9.7a
    sshd 24345 user mem REG 8,5 14542 23234382 /lib/libutil-2.3.4.so
    sshd 24345 user mem REG 8,5 63624 3069543 /usr/lib/libz.so.
    sshd 24345 user mem REG 8,5 56328 23232671 /lib/libselinux.so.1

    point being: i now knew the source of the open port, and it was harmless.
    on the other hand, if it had been something to worry about, i’d have killed the process using kill -9 24345 and then figured out the entry point to the server in order to better secure it.

    qmail error resolution: sorry, although i’m listed as a best-preference mx for that host, it isn’t in my control/locals file.

    today, i had to reenable a domain through plesk. once the guy’s site was up and running, he said that he couldn’t receive email. i sent him a test email and got the following bounce message:
    Sorry. Although I’m listed as a best-preference MX or A for that host, it isn’t in my control/locals file, so I don’t treat it as local. (#5.4.6)
    how come? i honestly never saw this problem before.
    well, qmail/plesk stores the hostname in a file located in /var/qmail/control/rcpthosts. i checked and it was there. so what gives?
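    my guess is that plesk did things too quickly, or not well enough. i ended up having to restart qmail, and after that he began receiving his messages again. (that fits: qmail-send only reads control/locals and control/virtualdomains at startup or when it receives a HUP signal, so a freshly added domain can go unnoticed until then.)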
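
before restarting anything, it can help to confirm that the domain really is in the control files. a minimal sketch, assuming one entry per line, optionally followed by a colon and more text (the virtualdomains layout); domain_listed is a hypothetical helper name.

```shell
# domain_listed DOMAIN FILE
# Exits 0 if DOMAIN appears as an entry in FILE, 1 otherwise.
# Only the part before the first colon is compared, so the same
# check works for rcpthosts, locals and virtualdomains style lines.
domain_listed() {
    awk -F: -v d="$1" 'tolower($1) == tolower(d) { found = 1 } END { exit !found }' "$2"
}

# usage (domain.com is a placeholder):
# for f in rcpthosts locals virtualdomains; do
#     domain_listed domain.com "/var/qmail/control/$f" && echo "listed in $f"
# done
```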

    qmail: 7 day mail queues? too long.

    i’ve been taking a proactive stance in checking the mail queue in my office, since if it gets cluttered with newsletters or unnecessary stuff (including the occasional password phishing from code vulnerabilities in contact forms), it ends up slowing down other emails significantly.
    by default, the qmail queue is 7 days long (604800 seconds). to check that, you can run the following:
    # qmail-showctl | grep queue
    queuelifetime: (Default.) Message lifetime in the queue is 604800 seconds.

    (side point: there’s a lot of cool stuff you can see there related to the qmail setup if you don’t only grep for the queue.)
    in my opinion, 7 days is just way too long. sometimes i’m checking the queue and an email is mailed to a wrong address… and the email just sits there while the mailserver repeatedly attempts to send the message to this nonexistent address. (for example, if you’re looking to email someguy@aol.com and you accidentally addressed it with the domain aol.org, you’ll be waiting a long time for a bounceback, which might cause frustration and anger because you thought you sent it to the right guy to begin with.)
    everything on linux can be tweaked, and it’s relatively easy to do at times. in this particular case, what is needed is a newly created file, /var/qmail/control/queuelifetime, which contains a single line: the number of seconds that you want the queue to last. in my case, i made it 172800 seconds (2 full days; a single day is 86400), so these emails get returned to sender informing them that they should get the right address or try later.
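
the arithmetic and the file creation can be sketched as follows. days_to_seconds and set_queuelifetime are hypothetical helper names; the only thing qmail cares about is that the file contains the number of seconds.

```shell
# days_to_seconds N: one day is 86400 seconds.
days_to_seconds() {
    echo $(( $1 * 86400 ))
}

# set_queuelifetime DAYS FILE: write the lifetime in seconds to FILE.
set_queuelifetime() {
    days_to_seconds "$1" > "$2"
}

days_to_seconds 2

# usage, as root:
# set_queuelifetime 2 /var/qmail/control/queuelifetime
```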
    once you’ve created this file and restarted qmail, you can verify that the new queue lifetime is in effect by running the following:
    # /var/qmail/bin/qmail-showctl | grep queue
    queuelifetime: Message lifetime in the queue is 172800 seconds.

    note how it doesn’t say “Default” anymore like the previous execution of the same command did.
    to force qmail to retry those old emails right away, just run qmHandle -a, and you’ll notice that the queue (qmHandle -l) gets a lot shorter.
    if you don’t have qmHandle, you can get it on sourceforge; just click here. it’s not part of the regular qmail distribution. more information on qmHandle can be found in this blog entry.

    robots.txt and spidering.

    when you have content that is not for public consumption, it’s better to be safe than sorry and prevent the search engines from crawling (or spidering) the page and learning your link structure. for example, in a development environment, you hardly want the site indexed as if it were public when it’s not ready yet.
    enter robots.txt. this file is extremely important; search engines look for it to determine whether the site may be entered into their search cache or whether you want to keep it private.
    the basic robots.txt file works like this: you stick the file in the root of your website (e.g. the public_html or httpdocs folder). it won’t work if it’s located anywhere else or in a subdirectory of the site.
    the crux of the robots.txt is the User-agent and Disallow directives. if you don’t want any search engine bots to spider any files on your site, the basic file looks like this:
    User-agent: *
    Disallow: /

    however, if you don’t want the search engines to crawl a specific folder, e.g. www.yoursite.com/private, you would create the file like so:
    User-agent: *
    Disallow: /private/

    if you don’t want google to spider a specific folder called /newsletters/, then you would use the following:
    User-agent: googlebot
    Disallow: /newsletters/

    there are hundreds of bots that you’d need to consider, but the main ones are probably google (googlebot), yahoo (yahoo-slurp), and msn (msnbot).
    you can also target multiple user-agents in a robots.txt file that looks like this:
    User-agent: *
    Disallow: /

    User-agent: googlebot
    Disallow: /cgi-bin/
    Disallow: /private/
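
if you maintain these files for a lot of sites, a record like the ones above is easy to generate. a minimal sketch; make_robots is a hypothetical helper name, and it prints a single record for one user-agent.

```shell
# make_robots AGENT PATH...
# Prints one robots.txt record: a User-agent line followed by one
# Disallow line per path given.
make_robots() {
    agent="$1"
    shift
    printf 'User-agent: %s\n' "$agent"
    for path in "$@"; do
        printf 'Disallow: %s\n' "$path"
    done
}

make_robots googlebot /cgi-bin/ /private/
```

for a file with several records, call it once per user-agent and print an empty line between the calls.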

    there’s a great reference on user agents on wikipedia. another great resource is this robots.txt file generator.
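    where keeping unfinished or private pages out of search results is concerned, a robots.txt file makes a huge difference. keep in mind, though, that it is only a request to well-behaved crawlers and is itself publicly readable, so it is no substitute for real access control.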

    showing and understanding mysql processes in detail.

    i’ve learned a little trick on how to determine how your mysql server is running and where to pinpoint problems in the event of a heavy load. this is useful in determining how you might want to proceed in terms of mysql optimization.
    # mysql -u [adminuser] -p
    mysql> show processlist;

    granted, on a server with heavy volume, you might see hundreds of rows and it will scroll off the screen. here are the key elements to the processlist table: Id, User, Host, db, Command, Time, State, Info, where:
    Id is the connection identifier
    User is the mysql user who issued the statement
    Host is the hostname of the client issuing the statement. this will be localhost in almost all cases unless you are executing commands on a remote server.
    db is the database being used for the particular mysql statement or query.
    Command can be one of many different commands issued in the particular query. the most common occurrence on a webserver is “Sleep,” which means that the particular database connection is waiting for new directions or a new statement.
    Time is the number of seconds the connection has been in its current state
    State is an action, event, or state of the specific mysql command and can be one of hundreds of different values.
    Info will show the actual statement being run in that instance
    another useful command is:
    mysql> show full processlist;
    which is equivalent to running this from the shell:
    mysqladmin -u [adminuser] -p -v processlist
    this shows my specific query as:
    | 4342233 | adminusername | localhost | NULL | Query | 0 | NULL | show full processlist |

    or you can display each field in a row format (vertical format), like so, simply by appending \G to the end of the query:
    mysql> show full processlist\G
    this list is very likely preferable in the event that your data scrolls off the screen and you want to find out the specific field name of a value in your database.
    ******** 55. row ********
    Id: 4342233
    User: adminusername
    Host: localhost
    db: NULL
    Command: Query
    Time: 0
    State: NULL
    Info: show full processlist

    you can also check how many mysql queries a user has open by running the following command:
    mysqladmin -u [adminuser] -p pr | awk -F\| '{print $3}' | awk -F_ '{print $1}' | sort | uniq -c | sort -n
    to see which database has the most active connections, run the following:
    mysqladmin -u [adminuser] -p pr | awk -F\| '{print $5}' | sort | uniq -c | sort -n
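
the two one-liners above share the same shape, so they can be folded into one function. a minimal sketch, assuming the usual table output of mysqladmin processlist, where splitting on | makes field 3 the User column and field 5 the db column; count_by_column is a hypothetical helper name.

```shell
# count_by_column N
# Reads a processlist table on stdin and counts how often each value
# appears in pipe-separated field N (3 = User, 5 = db).
# The border lines and the header row are skipped.
count_by_column() {
    awk -F'|' -v c="$1" 'NF > 2 && $2 !~ /Id/ {
        v = $c
        gsub(/^ +| +$/, "", v)   # trim the padding around the value
        if (v != "") print v
    }' | sort | uniq -c | sort -rn
}

# usage:
# mysqladmin -u [adminuser] -p pr | count_by_column 3
# mysqladmin -u [adminuser] -p pr | count_by_column 5
```

the busiest user or database ends up at the top of the output.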
    oh, and since it’s useful… here are some recommended settings for /etc/my.cnf:

    slave_net_timeout = 50
    delayed_insert_timeout = 50

    further tuning might include the following, which is good for machines with plesk:

    key_buffer = 128M
    max_allowed_packet = 1M
    table_cache = 512
    sort_buffer_size = 2M
    read_buffer_size = 2M
    read_rnd_buffer_size = 8M
    myisam_sort_buffer_size = 64M
    thread_cache_size = 8
    query_cache_size = 64M
    thread_concurrency = 8

    the above will help you optimize your mysql server as well, but the configuration isn’t for everyone: the right buffer sizes depend on how much memory the machine has and on its workload.

    protecting against ddos attacks.

    what is a ddos attack, you ask? a distributed denial of service (ddos) attack is when multiple computers flood your server with thousands of connections with the goal of bringing it down for a good chunk of time.
    a lot of people fall victim to these attacks daily.
    they don’t have to.
    (d)dos-deflate is an open-source tool that helps mitigate denial of service attacks by temporarily banning offending IPs. you can download it here.
    by default, its configuration is stored in /usr/local/ddos/ddos.conf.
    i’ve personally tweaked the system to ban the IP for a little longer than the default 600 seconds, and of course, don’t forget to change the email address so that the warnings go to you. (you wouldn’t want your IP being blocked accidentally and have your email warnings go to a possibly unchecked email address!)
    you can also whitelist IP addresses by adding them, line by line, to /usr/local/ddos/ignore.ip.list.
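
the general idea behind tools of this kind is to count open connections per remote IP and act on the ones above a threshold. here’s a rough sketch of just the counting step, assuming netstat -ntu style input where the fifth column is the foreign address; connections_per_ip is a hypothetical helper and not necessarily how (d)dos-deflate does it internally.

```shell
# connections_per_ip
# Reads "netstat -ntu" style output on stdin and prints a count of
# connections per remote IP, busiest first.
connections_per_ip() {
    awk '$1 ~ /^(tcp|udp)/ {
        ip = $5
        sub(/:[0-9]+$/, "", ip)   # strip the trailing :port
        print ip
    }' | sort | uniq -c | sort -rn
}

# usage:
# netstat -ntu | connections_per_ip | head
```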

    resolving canonical issues with plesk.

    in the world of SEO (search engine optimization), there is an unwritten rule (well, it will be written sooner or later) that you can’t have duplicate content from the same site in google’s search results. this means that http://www.domain.com and http://domain.com should not both be indexed by search engines. you must choose one or the other or you may face a penalty.
    there’s an easy solution for this using vhosts in plesk. the only not-so-user-friendly part about this is that you have to do it for every domain you are worried about, and with 100+ domains, you’ll be making 100+ vhost files (or 200+ if you have SSL support as well).
    in any event, this is how it’s done.
    navigate on your plesk server to your domain’s conf directory. on some machines, it’s
    # cd /var/www/vhosts/domain.com/conf
    i prefer going through this shortcut:
    # cd /home/httpd/vhosts/domain.com/conf
    regardless, the two paths are symbolically linked, or at least they should be in certain setups.
    create the file vhost.conf
    # vi vhost.conf
    add the following to the vhost.conf file
    RewriteEngine On
    RewriteCond %{HTTP_HOST} !^www\. [NC]
    RewriteRule ^(.*)$ http://www.%{HTTP_HOST}$1 [QSA,R=301,L]

    for domains with SSL support, you will need to create a file called vhost_ssl.conf as well.
    # vi vhost_ssl.conf
    RewriteEngine On
    RewriteCond %{HTTP_HOST} !^www\. [NC]
    RewriteRule ^(.*)$ https://www.%{HTTP_HOST}$1 [QSA,R=301,L]

    that’s it! now, run this plesk command to process your update.
    # /usr/local/psa/admin/bin/websrvmng -av
    load your page in your preferred web browser as http://domain.com. it will automatically redirect to http://www.domain.com and will be reflected in search engines with the www prefix only.
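
with 100+ domains, creating the files by hand gets old quickly. here’s a minimal sketch of a loop that drops the same vhost.conf into every domain’s conf directory and leaves existing files alone. write_www_redirects is a hypothetical helper name, and the vhosts root you pass in is whichever of the two paths above applies to your server.

```shell
# write_www_redirects VHOSTS_ROOT
# Creates VHOSTS_ROOT/<domain>/conf/vhost.conf with the www redirect
# for every domain that does not already have a vhost.conf.
write_www_redirects() {
    vhosts_root="$1"
    for conf_dir in "$vhosts_root"/*/conf; do
        [ -d "$conf_dir" ] || continue
        target="$conf_dir/vhost.conf"
        if [ -e "$target" ]; then
            echo "skipping $target (already exists)" >&2
            continue
        fi
        cat > "$target" <<'EOF'
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}$1 [QSA,R=301,L]
EOF
    done
}

# usage, as root:
# write_www_redirects /var/www/vhosts
```

you’d still need to run the websrvmng command afterwards so plesk picks up the new files, and the SSL variant would need the same treatment with vhost_ssl.conf and https.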

    rkhunter … doesn’t support redhat ES 4 (nahant update 3)?

    actually, it does. but version 1.28 (the latest version as of this writing) doesn’t recognize it.
    if you’re running rkhunter and get the following message:
    Determining OS… Unknown
    Warning: This operating system is not fully supported!
    Warning: Cannot find md5_not_known
    All MD5 checks will be skipped!

    you can get rkhunter to acknowledge your OS by doing the following:
    # cd /usr/local/rkhunter/lib/rkhunter/db
    # pico os.dat

    (i’m still a fan of vi, but i’m trying to be tolerant) 🙂
    in this file, look for line 189. add this line immediately below it:
    190:Red Hat Enterprise Linux ES release 4 (Nahant Update 3):/usr/bin/md5sum:/bin
    save the file and then run rkhunter -c once again.
    no errors!
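
if you hit the same thing on a different update level, the os.dat entry can be built from the release string. a minimal sketch; os_dat_line is a hypothetical helper, the last two fields are copied unchanged from the line above, and i’m assuming the description should match what your release file (e.g. /etc/redhat-release) says.

```shell
# os_dat_line NUMBER "RELEASE STRING"
# Prints an os.dat entry in the same layout as the line added above.
# The md5sum path and /bin are copied as-is from that line.
os_dat_line() {
    printf '%s:%s:/usr/bin/md5sum:/bin\n' "$1" "$2"
}

os_dat_line 190 'Red Hat Enterprise Linux ES release 4 (Nahant Update 3)'

# usage on a live box:
# os_dat_line 190 "$(cat /etc/redhat-release)"
```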