One of the most common files found on web hosting servers is also one of the least understood, especially among novice web hosting clients and website administrators. That file, called “.htaccess,” controls any number of important settings, from the 404 Error page displayed to site users to the password protection applied to certain files or directories on the web server. Despite its broad functionality, and the per-directory settings that many hosting customers can only set up through an .htaccess file, a large number of people simply don’t understand its syntax or the functions it permits. They’ve never had to master this file, or they’ve never appreciated the security settings and permissions-based directory access that .htaccess is most closely associated with.
Website administrators who want a greater degree of control over their access permissions, directory access, passwords, error pages, and search engine optimization have few better tools in their arsenal than the standard .htaccess file. This article explains the “how” and “why” of the .htaccess file, and it aims to make the functionality of this important document clear to users of all experience levels.
What is an .htaccess file?
Generally speaking, an .htaccess file is a per-directory server configuration file used to define directory or file access rules, URL rewrite rules, or custom error pages. It is most commonly associated with the display of a 404 Error message when an end user attempts to visit a URL that does not exist. Beyond that, the .htaccess file is commonly used to improve search engine optimization. Using 301 redirects to guide search engines toward a piece of content’s new location or URL is one of the most important jobs of the .htaccess file and the brief lines of code that it contains.
The .htaccess file is also closely associated with the modern permalink structure that dominates content management systems like WordPress. Before URL rewriting became popular, pages and posts were typically found at links that identified the content by a numeric ID in the query string rather than by readable words.
When the .htaccess file is used to “rewrite” the URL, the content management system can build links from things like the post’s date, category, title, and other information, and the rewrite rules route those links back to the right content. The result is a series of readable permalinks in which words, rather than numbers, form the URL.
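As a rough illustration of this kind of rewriting, the block below is the rule set that WordPress commonly writes into .htaccess when word-based permalinks are enabled on a site installed at the root of its domain. The two URLs in the opening comments are hypothetical examples, not links from a real site.

```apache
# Hypothetical "before" link:  http://domain.com/?p=123
# Hypothetical "after" link:   http://domain.com/2012/05/sample-post/

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# Requests for index.php itself are passed through unchanged.
RewriteRule ^index\.php$ - [L]
# Only rewrite when the request does not match a real file or directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Hand everything else to index.php, which looks up the post by its permalink.
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```

The rules only take effect if the mod_rewrite module is loaded, which is why they are wrapped in an IfModule test.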
Search engines and human readers generally respond better to the second form of URL, and that’s why it has become so popular in recent years. It’s also why URL rewriting has become one of the most common uses of the .htaccess file, even among website owners and content management administrators who otherwise have no knowledge of the file itself. Most content management platforms are able to create and modify the .htaccess document on their own, reducing or eliminating the need for administrator intervention.
The .htaccess file is most closely associated with web hosting packages that are based on the Apache web server. It is common to both Apache 1.x and Apache 2.x installations, and most of its functions and directive syntax carry over between these two generations of the web’s most popular open source server software.
Do I have an .htaccess file, or am I allowed to have one?
The first thing that most people notice about an .htaccess document is that its filename starts with a period rather than with a typical word followed by a file extension. This is done intentionally, as files or folders that begin with a period are treated as “hidden” by Unix-like operating systems and are left out of ordinary directory listings. Hiding the file does not protect it by itself; instead, Apache’s default configuration refuses to serve any file whose name begins with “.ht”, which means that the .htaccess file cannot be read by ordinary visitors to a website and mined for information that would help a malicious user.
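For reference, the protection of .htaccess from visitors usually comes from a block like the following in the main server configuration, not from the leading period alone. This is a sketch in Apache 2.4 syntax; the exact block on a given host is an assumption and may differ (Apache 2.2 and earlier express the same rule with Order and Deny directives).

```apache
# Main server configuration (for example httpd.conf), not .htaccess itself.
# Refuse to serve any file whose name starts with ".ht",
# which covers both .htaccess and .htpasswd.
<Files ".ht*">
    Require all denied
</Files>
```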
This way of naming the file does have a side effect that sometimes causes confusion among users, however. Because the file is hidden, many FTP clients on Mac or Windows desktops do not list it unless an option to show hidden files is turned on. This leads a lot of newer developers to assume that they do not have an .htaccess file at all, and they’ll begin creating a new file to establish their rewrites, error pages, and directory permissions. In many cases, though, web hosts create a standard .htaccess file by default and place it on the server to be modified by the administrator.
The existing file can typically be found by logging into the Plesk Panel or cPanel backend environment and browsing to the .htaccess file via the web-based file browser. This file browser can show hidden directories and files (in some control panels this is an option that must be switched on), and it will display the .htaccess file in the “public_html” directory even when web browser and FTP application directory listings tend to hide it. From there, the file can be edited using the web-based file browser’s built-in text editor, and saving it writes the changes directly to the server.
In some cases, web hosts might not permit their customers to create or modify an .htaccess file at all. This is often the case on free web hosts or lower-priced operations that are more restrictive on permissions and capabilities. Often, they’ll enable the .htaccess file at a higher price point in order to encourage the adoption of more expensive web hosting plans. To determine whether this is the case, use the aforementioned web-based file browser to look for an .htaccess document in the “public_html” directory. If one does not exist, attempt to create it and upload it to the server. An error message from the server indicates that the user does not have permission to create this file. The lack of any such error means only that the host did not create its own .htaccess file by default, and that the customer is free to create and maintain one. Even then, it is worth testing a simple directive, because a host can also configure Apache to ignore .htaccess files altogether.
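Whether Apache honors an .htaccess file at all is controlled by the AllowOverride directive in the main server configuration, which only the host can edit on shared plans. The sketch below shows the two extremes; the directory paths are hypothetical placeholders.

```apache
# Main server configuration, not .htaccess. Paths are hypothetical.

# Restrictive plan: .htaccess files in this tree are ignored entirely.
<Directory "/home/free-customer/public_html">
    AllowOverride None
</Directory>

# Permissive plan: every directive type that is legal in .htaccess is honored.
<Directory "/home/paid-customer/public_html">
    AllowOverride All
</Directory>
```

Hosts can also pick a middle ground, such as AllowOverride AuthConfig FileInfo, which permits password protection, redirects, rewrites, and custom error documents while leaving other settings locked.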
When is it appropriate to invoke the functionality of an .htaccess file?
The .htaccess file is best used to configure error pages (like a custom 404 page), redirects (like the 301 permanent redirect), URL rewrites, or directory permissions. This is appropriate when pointing to new locations of old content, creating an error page for content that no longer exists, requiring a password to access certain files or directories on the server, or mapping readable permalinks onto complex numerical URLs in software like WordPress. Here are some typical uses for the .htaccess file, demonstrated and explained:
1. Redirecting an Old Link to a New One Via a 301 Redirect
One of the biggest problems that websites encounter when moving content is that their search engine ranking can drop dramatically right after the move. This happens when the old URLs are not pointed at the content’s new location, so the search engine is never informed that a move took place. Instead, the search engine encounters a 404 Error, which essentially tells it that it has arrived at a dead end. That dead end will eventually lead the search engine to remove the link from its search results, and the website in question will have to rebuild its reputation as if it were starting from nothing.
To prevent this, search engine optimization professionals recommend using a 301 redirect as part of the .htaccess file. Strictly speaking, a 301 is an HTTP status code rather than an error; it is known as “Moved Permanently,” or a “permanent redirect,” and it points the search engine toward the new location of a file or a specific block of content. In addition, it informs the search engine that the content has moved on a permanent basis and will not be moving back. This prompts the search engine robot to update the link in its search results, rather than remove it entirely and force the website to start over and rebuild its ranking. Here is what a typical redirect looks like:
Redirect 301 /old-permalink-here/ http://domain.com/archive/new-permalink-here/
The above code should be pretty straightforward but, for those who are confused, it breaks down easily. The first part of the code, “Redirect 301,” establishes the type of redirection that will occur when an end user or a search engine robot visits the old location of the content. The first argument is the old link to that content, written as a path beginning with a slash, because the Redirect directive expects a URL path on the current site there and not a full URL. The second argument is the full URL of the updated location. While this example stays on the same domain name (domain.com), the 301 redirect can also be used to send users and search engines to an entirely new domain name.
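The Redirect directive also accepts a keyword in place of the numeric status, and it matches by path prefix, so a single line can cover a whole directory or a whole site. The lines below are a sketch of three separate alternatives, not a set meant to be combined in one file; new-domain.com and the /blog/ and /articles/ paths are hypothetical placeholders.

```apache
# Alternative 1: "permanent" is equivalent to the numeric status 301.
Redirect permanent /old-permalink-here/ http://domain.com/archive/new-permalink-here/

# Alternative 2: prefix match. A request for /blog/some-post/ is sent to
# http://new-domain.com/articles/some-post/
Redirect 301 /blog/ http://new-domain.com/articles/

# Alternative 3: whole-site move. Every path on the old domain maps to
# the same path on the new domain.
Redirect 301 / http://new-domain.com/
```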
2. Protecting Files or Directories with .htaccess and .htpasswd
Any user who has ever stumbled upon a directory or file and been greeted with a password prompt has seen the .htaccess and .htpasswd files in action. These two files work hand-in-hand to require a valid username and password when accessing a file; if no valid credentials are provided, the server responds with a 401 “Unauthorized” error and the visitor is turned away. This is a convenient way to protect sensitive files, administration areas, or login screens from malicious hackers and other users who might have less-reputable ideas about what they want to do at a given website.
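The first step in this process is to create an .htpasswd file and place it on the server. It can sit in the directory where the protected file is located or, if it’s protecting a given directory, inside that protected directory, although a location outside the publicly served folders is safer where the host allows it. The content of the file is simple, as each line contains a username and a password separated by a colon, in the form username:password. On most servers the password must be stored in hashed form rather than as plain text, and the htpasswd utility that ships with Apache can generate such a line.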
That single line of text goes into the .htpasswd file, and that file is then uploaded to the server. If more than one username and password combination should grant access to the file or directory, they can be listed on separate lines with the same format.
To create the authentication prompt that requires one of the valid username and password combinations to be entered, site administrators need to add the following series of lines to their .htaccess file, pointing to the password document and assigning it to a given directory or file:
# Absolute server path to the password file
AuthUserFile /server/path/of/.htpasswd
AuthType Basic
AuthName "The Title of the Protected Page Goes Here"
# Require a login only for this one file
<Files "login.php">
Require valid-user
</Files>
The code above is also pretty easy to follow, as are most implementations of .htaccess permissions and error messages. All that needs to be done is to point to the relevant .htpasswd file, specify the file or directory being protected, and then give the protected area a title. That title is passed to the browser, which generates the login prompt automatically when it encounters this restriction and may display the title within that prompt.
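To protect an entire directory instead of a single file, the same directives can be used without the Files wrapper, in an .htaccess file placed inside that directory. The sketch below reuses the placeholder password-file path from above; the realm name and the usernames alice and bob are hypothetical.

```apache
# .htaccess inside the directory to protect; applies to everything beneath it.
AuthType Basic
AuthName "Members Area"
AuthUserFile /server/path/of/.htpasswd

# Any user listed in the password file may log in:
Require valid-user

# Alternative: replace the line above with this one to admit only named users.
# Require user alice bob
```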
3. Printing a 404 Error for Missing Pages or Directories
Websites, like the technology that hosts them, are constantly evolving. Sometimes, that means entire sections or directories eventually go missing because they are no longer required or their content is deemed outdated and not worthy of remaining in the public eye. No matter the reason for this change in the site’s structure, it should be accommodated by the .htaccess file with a custom error page and message. This lets users know that they’ve reached an outdated link, and it can point them toward content that still exists. For search engine robots, the 404 status tells them to stop indexing the missing page or directory and spend their efforts on parts of the site that remain available.
To specify a customized 404 Error document to be seen by end users and search engine robots when they stumble across an outdated link, simply place the following line into the .htaccess file:
ErrorDocument 404 /path/to/the/404-error-page.htm
Simply customize the path to the 404 Error page’s location, written as a URL path starting from the root of the website rather than as a location on the server’s disk, and then save the .htaccess file to the server. That’s all that is required; the next time an end user follows an invalid link, an informative error page will result.
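The same directive works for other status codes and also accepts a short text message in place of a page. The lines below are a sketch; the file names under /errors/ are hypothetical.

```apache
# Custom pages for other common errors (URL paths from the site root).
ErrorDocument 403 /errors/forbidden.htm
ErrorDocument 500 /errors/server-error.htm

# Alternative to a 404 page: a quoted string is sent as the response body.
ErrorDocument 404 "Sorry, that page could not be found."

# Avoid a full URL for the 404 document: Apache would answer with a redirect,
# so browsers and search engines would not receive the 404 status.
# ErrorDocument 404 http://domain.com/404-error-page.htm
```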
An Important Part of Website Design and Operation
The .htaccess file is an essential part of creating friendly URLs, optimizing a website’s search engine performance and position, and notifying users or search engines of outdated links. It should be leveraged to best position a website for better rankings, search results, and revenues, and this is what makes learning about the file so essential for novice website operators. With a little knowledge and a bit of trial and error, the file can be the difference between a website that performs well and one that tends to underperform consistently.