Googling for information on the World Wide Web is such a common activity these days that it is hard to imagine that just a few years ago this verb did not even exist. Search engines are now an integral part of our lifestyle, but this was not always the case. Historically, systems for finding information were driven by data organization and classification performed by humans. Such systems are not entirely obsolete: libraries still keep their books ordered by categories, author names, and so forth. Yahoo! itself started as a manually maintained directory of web sites, organized into categories. Those were the good old days.
Today, the data of the World Wide Web is enormous and rapidly changing; it cannot be confined in the rigid structure of the library. The format of the information is extremely varied, and the individual bits of data coming from blogs, articles, web services of all kinds, picture galleries, and so on form an almost infinitely complex virtual organism. In this environment, making information findable necessitates something more than the traditional structures of data organization or classification.
Introducing the ad-hoc query and the modern search engine. This functionality reduces the aforementioned need for organization and classification, and since its inception it has become quite pervasive. Google’s popular email service, GMail, features a search capability that permits a user to find emails that contain a particular set of keywords. Microsoft Windows Vista now integrates an instant search feature as part of the operating system, helping you quickly find information within any email, Word document, or database on your hard drive from the Start menu, regardless of the underlying file format. But, by far, the most popular use of this functionality is in the World Wide Web search engine.
Although this book addresses search engine optimization primarily from the perspective of a web site’s architecture, you, the web site developer, may also appreciate this handy reference of basic factors that contribute to site ranking. This chapter discusses some of the fundamentals of search engine optimization.
If you are a search engine marketing veteran, feel free to skip to Chapter 3. However, because this chapter is relatively short, it may still be worth a skim. It can also be useful to refer back to it, because our intent is to provide a brief guide about what does matter and what probably does not. This will serve to illuminate some of the recommendations we make later with regard to web site architecture.
This chapter contains, in a nutshell:
A short introduction to the fundamentals of SEO.
A list of the most important search engine ranking factors.
Discussion of search engine penalties, and how you can avoid them.
Using web analytics to assist in measuring the performance of your web site.
Using research tools to gather market data.
Resources and tools for the search engine marketer and web developer.
“Click me!” If the ideal URL could speak, its speech would resemble the communication of an experienced salesman. It would grab your attention with relevant keywords and a call to action, and it would persuasively argue that you should choose it instead of the other one. Other URLs on the page would pale in comparison.
URLs are more visible than many realize, and are a contributing factor in CTR. They are often cited directly in copy, and they occupy approximately 20% of the real estate in a given search engine result page. Apart from “looking enticing” to humans, URLs must be friendly to search engines. URLs function as the “addresses” of all content in a web site. If confused by them, a search engine spider may not reach some of your content in the first place. This would clearly reduce search engine friendliness.
Creating search engine friendly URLs becomes challenging and requires more forethought when developing a dynamic web site. A dynamic web site with poorly architected URLs presents numerous problems for a search engine. On the other hand, search engine friendly URLs containing relevant keywords may both increase search engine rankings and prompt users to click them.
This chapter discusses how well-crafted URLs can make the difference between highly ranked web pages and pages at the bottom of the search results. It then illustrates how to generate optimized URLs for dynamic web sites using the Apache mod_rewrite module in coordination with application code written in PHP. Lastly, this chapter considers some common caveats and addresses how to avoid them.
By the end of this chapter you will acquire the skills that will enable you to employ search engine friendly URLs in a dynamic PHP-based web site. More specifically, in the rest of this chapter you will:
Understand the differences between static URLs and dynamic URLs.
Understand the benefits of URL rewriting.
Use mod_rewrite and regular expressions to implement URL rewriting.
Follow exercises to practice rewriting numeric and keyword-rich URLs.
Create a PHP “link factory” library to help you keep the URLs in your site consistent.
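As a rough illustration of the “link factory” idea, the sketch below builds keyword-rich URLs in one place and parses them back. The function names, the /products/<keywords>-p<id>.html scheme, and the product.php script mentioned in the comment are assumptions made for this example, not the book's actual library.

```php
<?php
// Illustrative "link factory" sketch. The function names and the
// /products/<keywords>-p<id>.html URL scheme are assumptions for this example.

// Turn free text into a lowercase, dash-separated URL fragment.
function slugify(string $text): string
{
    $text = strtolower($text);
    // Collapse every run of characters other than a-z and 0-9 into one dash.
    $text = preg_replace('/[^a-z0-9]+/', '-', $text);
    return trim($text, '-');
}

// Build the keyword-rich URL for a product in one central place.
function make_product_url(string $name, int $id): string
{
    return '/products/' . slugify($name) . '-p' . $id . '.html';
}

// Recover the numeric ID from such a URL, or null if it does not match.
// A mod_rewrite rule along these lines could perform the same mapping in Apache:
//   RewriteRule ^products/.*-p([0-9]+)\.html$ /product.php?id=$1 [L]
function parse_product_id(string $path): ?int
{
    if (preg_match('#^/products/.*-p([0-9]+)\.html$#', $path, $m)) {
        return (int) $m[1];
    }
    return null;
}
```

Routing every link through a helper such as make_product_url() keeps a single URL form for each page, so a change to the scheme only has to be made in one place.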
One of the perks of PHP is that it abstracts away many low-level implementation details from the web developer. It does such a great job, in fact, that you can typically build complex web applications without understanding much at all about the protocol web servers use to speak to the world, HTTP (Hypertext Transfer Protocol).
Though most of the time this ignorance is bliss, it is sometimes not so with regard to search engine optimization. Using the protocol improperly has the potential to wreak havoc for search engine rankings. On the other hand, knowing how to use it effectively can be of great help to the very same end.
HTTP status codes are a small but critical part of this protocol. They provide information regarding the state of an HTTP request. You can use them, for example, to indicate that the requested information should be retrieved from a different location henceforth. In modern search engines, doing so may also result in a transference of link equity to that new location. This example alone highlights the importance of knowing how to use these codes.
In this chapter you will:
Learn about the HTTP status codes that are pertinent to the search engine marketer.
Understand how to use the redirection status codes properly, how to signal deleted pages, and how to avoid indexing errors.
Learn how to implement redirection using PHP and mod_rewrite.
Follow step-by-step exercises to implement automatic URL correction and canonicalization.
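A minimal sketch of the redirection idea follows, assuming the application already knows the canonical path for the requested page. The function names are hypothetical; header() is the standard PHP function, and its third argument sets the HTTP response code.

```php
<?php
// Hypothetical helpers for URL canonicalization; names are assumptions.

// Return the path to redirect to, or null when the request is already canonical.
function redirect_target(string $requestedPath, string $canonicalPath): ?string
{
    return $requestedPath === $canonicalPath ? null : $canonicalPath;
}

// Send a 301 (moved permanently) response and stop the script.
// Must be called before any output has been sent.
function send_permanent_redirect(string $location): void
{
    header('Location: ' . $location, true, 301);
    exit;
}
```

A page script might call redirect_target() with the path it received and pass any non-null result to send_permanent_redirect(), so that misspelled or outdated URLs converge on a single address.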
We humans often find it frustrating to listen to people repeat themselves. Likewise, search engines are “frustrated” by web sites that do the same. This problem is called duplicate content, which is defined as web content that is either exactly duplicated or substantially similar to content located at different URLs. Duplicate content clearly does not contain anything original.
This is important to realize. Originality is an important factor in the human perception of value, and search engines factor such human sentiments into their algorithms. Seeing several pages of duplicated content would not please the user. Accordingly, search engines employ sophisticated algorithms that detect such content and filter it out from search engine results.
Indexing and processing duplicate content also wastes the storage and computation time of a search engine in the first place. Aaron Wall of http://www.seobook.com/ states that “if pages are too similar, then Google [or other search engines] may assume that they offer little value or are of poor content quality.” A web site may not get spidered as often or as comprehensively as a result. And though it is an issue of contention in the search engine marketing community as to whether there is an explicit penalty applied by the various search engines, everyone agrees that duplicate content can be harmful.
Knowing this, it would be wise to eliminate as much duplicate content as possible from a web site. This chapter documents the most common causes of duplicate content as a result of web site architecture. It then proposes methods to eliminate or remove it from a search engine’s view. You will:
Understand the potential negative effects of duplicate content.
Examine the most common types of duplicate content.
Learn how to exclude duplicate content using robots.txt and meta tags.
Use PHP code to properly implement an affiliate program.
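The two exclusion mechanisms named in the list above can be sketched as follows. The helper names and the example path are assumptions for illustration; the output formats are the standard robots meta tag and the robots.txt User-agent/Disallow syntax.

```php
<?php
// Hypothetical helpers; function names are assumptions for this sketch.

// Build a robots meta tag, e.g. to keep a duplicate page out of the index
// while still letting spiders follow its links.
function robots_meta_tag(bool $index, bool $follow): string
{
    $content = ($index ? 'index' : 'noindex') . ', ' . ($follow ? 'follow' : 'nofollow');
    return '<meta name="robots" content="' . $content . '" />';
}

// Build the text of a robots.txt file that excludes the given paths
// for all user agents.
function robots_txt(array $disallowedPaths): string
{
    $lines = ['User-agent: *'];
    foreach ($disallowedPaths as $path) {
        $lines[] = 'Disallow: ' . $path;
    }
    return implode("\n", $lines) . "\n";
}
```

For instance, a print-friendly copy of an article could carry robots_meta_tag(false, true) in its head section, or the whole directory holding such copies could be listed via robots_txt().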
A common question asked by search engine marketers is “how much duplicate content is too much?” There is no good answer to that question, as you may have predicted. It is best to simply take the conservative approach of eliminating as much of it as possible.
In a perfect world, a web site’s presentation details would not affect its search engine rankings any more than they affect a human visitor’s perception of value (his “rankings”). Relevant content is what users are after, and the goal of a search engine is to provide it. In this perfect world, web pages that contain the same information would rank similarly regardless of the on-page technologies used in their composition.
Unfortunately, in many cases, quite the opposite is true. Using Flash or AJAX to present information, for example, may render much of your web site invisible to search engines. Likewise, using JavaScript-based links for navigation may bring about the same unfortunate result.
The good news, however, is that applying a deep understanding of these presentation concerns will yield an advantage for you over other web sites that exhibit more naiveté. This chapter explores these concerns. It provides solutions and outlines best practices for web site content presentation.
By the end of this chapter you will acquire knowledge that will enable you to use on-page technologies effectively without detriment to search engine rankings. This chapter will teach you how to:
Implement SE-friendly JavaScript site functionality.
Generate crawlable images and graphical text using two techniques.
Improve the search engine-friendliness of your HTML.
Analyze when and how to use AJAX and Flash in your web site.
You’ve just added some great new content to your web site. Now what? Of course, your current visitors will appreciate the content. They may even tell a few friends about it. But there are technologies that you can leverage to facilitate and encourage them to do some free marketing for you.
This chapter explores web feeds and social bookmarking, two technologies that web site visitors can use to access and promote content that they enjoy. Encouraging visitors to do so is a vital part of viral marketing. This chapter discusses various ways to accomplish this, and walks you through three exercises where you:
Create your own RSS feeds.
Syndicate RSS and Atom feeds.
Add social bookmarking icons to your pages and feeds.
It may sound quite obvious, but system administrators (those who manage the computers that host your web site, for example) must be acutely aware of computer security concerns. When a particular piece of software is reported to be vulnerable to hackers, they should find out quickly because it is their priority to do so. Then they should patch or mitigate the security risk on the servers for which they are responsible as soon as possible. Consequently, it may not surprise you that some of the best system administrators used to be hackers, or are at least very aware of what hacking entails.
Why is this relevant? Although it is totally unfair to compare “black hat” search engine marketers to hackers on an ethical plane, the analogy is useful. The “white hat” search engine marketer (that is, a search engine marketer who follows all the rules) must be aware of how a “black hat” operates.
Understanding black hat techniques can help a webmaster protect his or her web sites. Nobody, after all, wants to be caught with his pants down advertising “cheap Viagra.” This chapter shows you how to avoid such problems. You will:
Learn about black hat SEO.
Learn about the importance of properly escaping input data.
Learn how to automatically add the nofollow attribute to comment links.
Sanitize input data by removing unwanted tags and attributes.
Request human input to protect against scripts adding comments automatically.
Protect against redirect attacks.
There is quite a bit to go through, so we’d better get started!
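Two of the defenses listed above, escaping input data and adding the nofollow attribute to comment links, might be sketched like this. The function names are hypothetical, and the regular expression is a deliberate simplification that assumes reasonably well-formed anchor tags; it is not a substitute for a full HTML sanitizer.

```php
<?php
// Hypothetical helpers; names are assumptions for this sketch.

// Escape visitor-supplied text so it is displayed literally instead of
// being interpreted as HTML.
function escape_comment_text(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Add rel="nofollow" to every <a ...> tag that has no rel attribute yet.
// Tags that already carry a rel attribute are left untouched.
function add_nofollow(string $html): string
{
    return preg_replace('/<a\s+(?![^>]*\brel=)/i', '<a rel="nofollow" ', $html);
}
```

Applied to comment bodies before they are stored or rendered, helpers like these reduce the reward for posting spam links, because the links no longer pass on any ranking benefit.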
A sitemap provides an easy way for both humans and search engines to reference pages of your web site from one central location. Usually, the sitemap enumerates all, or at least the important, pages of a site. This is beneficial for humans in that it can be a navigational aid, and for search engines, because it may help a web site get spidered more quickly and comprehensively.
In this chapter you learn about:
The two types of sitemaps: traditional sitemaps and search engine sitemaps.
The Google XML sitemaps standard.
The Yahoo! plaintext sitemaps standard.
The new sitemaps.org standard soon to be implemented by all search engines.
You’ll implement PHP code that generates both Google and Yahoo! search engine sitemaps programmatically. But first, this chapter starts at the beginning and talks about traditional sitemaps.
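To give a feel for what programmatic sitemap generation involves, here is a minimal sketch. The function names are assumptions, and the XML variant uses the sitemaps.org 0.9 namespace rather than reproducing the chapter's own code; the plaintext variant simply lists one URL per line.

```php
<?php
// Hypothetical sitemap builders; names are assumptions for this sketch.

// Build an XML sitemap using the sitemaps.org 0.9 namespace.
function build_xml_sitemap(array $urls): string
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($urls as $url) {
        // URLs must be entity-escaped inside XML (for example, & becomes &amp;).
        $xml .= '  <url><loc>' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '</loc></url>' . "\n";
    }
    return $xml . '</urlset>' . "\n";
}

// Build a plaintext sitemap: one URL per line.
function build_text_sitemap(array $urls): string
{
    return implode("\n", $urls) . "\n";
}
```

In practice the list of URLs would come from the same database queries that drive the site's navigation, so the sitemap stays current without manual editing.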
Link bait is any content or feature within a web site that is designed to bait viewers to place links to it from other web sites. Matt Cutts defines link bait as “something interesting enough to catch people’s attention.” Typically, users on bulletin boards, newsgroups, social bookmarking sites, or blogs will place a link to a web site in some copy that further entices a fellow member or visitor to click. It is an extremely powerful form of marketing because it is viral in nature, and links like these are exactly what a search engine algorithm wants to see: votes for a particular web site.
Soliciting links via link-exchanging is less effective for improving rankings than it once was, as discussed in Chapter 2. Link bait creation is one of the newer popularized concepts in link building. In the article at http://www.seomoz.org/blogdetail.php?ID=703, Rand Fishkin of SEOmoz states “… I’d guess that if Matt (from Google) or Tim (from Yahoo!) had their druthers, this would be the type of tactic they’d prefer were used to achieve rankings.” It is frequently, with a lot of luck and some skill, an economical and ethical way to get links to a web site; it is universally considered to be a white hat search engine optimization technique.
This chapter introduces the link bait concept, then shows an example of what we term “interactive link bait,” which is an application that garners links naturally and virally.
Note
As discussed in Chapter 2, building links is a crucial part of any search engine optimization campaign. In general, a site that earns links over time will be seen as valuable by a search engine. Link bait, in reality, is not a new concept. People have been linking to things that they like since the inception of the World Wide Web. It is just a concise term that describes an extremely effective technique: “provide something useful or interesting in order to entice people to link to your web site.”
Cloaking is defined as the practice of providing different content to a search engine spider than is provided to a human user. It is an extremely controversial technique in the realm of search engine optimization. And like most things controversial, cloaking can be used for both good and evil. This chapter discusses it in depth, along with the controversy surrounding its use. Geo-targeting is a similar practice, but it provides different content to both spiders and users on the basis of their respective geographical regions; its use is far less controversial. Both practices are typically implemented using a technology called IP delivery.
In this chapter, you:
Learn the fundamentals of cloaking, geo-targeting, and IP delivery.
Implement IP delivery–based cloaking and geo-targeting in step-by-step exercises.
Incidentally, the authors of this book are from two different countries. Jaimie is from the United States and speaks English, along with some Hebrew and Spanish. Cristian is from Romania and speaks Romanian, English, and some French. Why does this matter? There are concerns both from a language angle, as well as some interesting technical caveats when one decides to target foreign users with search engine marketing. This section reviews some of the most pertinent factors in foreign search engine optimization.
As far as this book is concerned, “foreign” refers to anything other than the United States because this book is published in the United States. We consider the UK to be foreign as well; and UK English is a different language dialect, at least academically.
The Internet is a globalized economy. Web sites can be hosted anywhere and contain anything that the author would like. Users are free to peruse pages or order items from any country. Regardless, for the most part, a user residing in the United States would like to see widgets from the United States. And a user in Romania would like to see widgets from Romania. It is also likely that a user in England would prefer to see products from England, not the United States, regardless of the language being substantially the same. There are some exceptions, but in general, to enhance user experience, a search engine may give preferential treatment to web sites from the same region and in the same language as the user.
This chapter deals with a few common technical issues that relate to SEO efforts:
Unreliable hosting or DNS
Changing hosting providers
Cross-linking
Split testing
Broken links (and how to detect them)
You’ve come a long way in learning how to properly construct a web site with regard to search engine optimization. Now it is time to demonstrate and tie together what you have learned. This chapter demonstrates an e-commerce store called “Cookie Ogre’s Warehouse.” This store sells all sorts of cookies and pastries. You implement what relates to search engine optimization, but the store will not have a functional shopping cart or checkout process.
In this chapter you:
Develop a set of requirements for a simple product catalog
Implement the product catalog using search engine–friendly methods
You’ll notice that the site you’re building in this chapter is very basic, and highlights only the most important SEO-related principles taught in this book. The simplicity is necessary for the purposes of this demonstration, because a complex implementation could easily be extended throughout an entire book itself.
Tip
To learn how to build a real-world search engine–optimized product catalog from scratch, and learn how to design its database and architectural foundations to allow for future growth, see Cristian’s Beginning PHP and MySQL E-Commerce: From Novice to Professional, 2nd edition (Apress, 2007).
Although we recommend otherwise, many web sites are initially built without any regard for search engines. Consequently, they often have a myriad of architectural problems. These problems comprise the primary focus of this book. Unfortunately, it is impossible to exhaustively and generally cover the solutions to all web site architectural problems in one short chapter. But thankfully, there is quite a bit of common ground involved.
Likewise, there are many feature enhancements that web sites may benefit from. Some only apply to blogs or forums, whereas others apply generally to all sites. Here, too, there is quite a bit of common ground involved. Furthermore, many such enhancements are easy to implement, and may even offer instant results.
This chapter aims to be a useful list of common fixes and enhancements that many web sites would benefit from. This list comprises two general kinds of fixes or enhancements:
Items 1 through 9 in the checklist can be performed without disturbing site architecture. These items are worthwhile for most web sites and should be tasked without concern for detrimental effects.
Items 10 through 15 come with caveats when implemented and should therefore be completed with caution or not at all.
This chapter is not intended to be used alone. Rather, it is a sort of “alternative navigation” scheme that one with a preexisting web site can use to quickly surf some of the core content of this book. Appropriate references to the various chapters in this book are included with a brief description. Eventually, we hope that you read the book from cover to cover. But until then, dive in to some information that you can use right away!
WordPress is a very feature-rich and extensible blogging application that, with a bit of tweaking, can be search engine–friendly. It is entirely written in PHP, and it can be modified and extended in the same language. Duplicating even its core functionality in a custom application would take a lot of time, so why reinvent the wheel? Furthermore, many plugins are readily available that extend and enhance its functionality.
Because the blog has been mentioned as a vehicle for search engine marketing so many times in this book, it seems fitting to document the process of setting up a blog with WordPress. You will also install quite a few WordPress plugins on the way.
Please note that this is not a comprehensive WordPress tutorial: although we present step-by-step installation and configuration instructions for the specific topics that we cover, we assume that you will do your own additional research regarding additional customization and other plugins you may require.
Note that we encountered a few problems with some of the presented plugins during our tests in certain server configurations.
At the time of this writing, WordPress 2.0 is the current generally available release. WordPress 2.1 is on the way (in beta), and certain of these directions and plugins will be obsolete or in error for version 2.1. Please visit http://www.seoegghead.com/seo-with-php-updates.html for information regarding updated procedures for WordPress 2.1.
In this chapter, you learn how to:
Install WordPress 2.0
Turn on permalinks
Prevent comment spam with the Akismet plugin
Add social bookmarking icons with the Sociable plugin
Implement “Email a friend” functionality with the WP-Email plugin
Add “chicklets” with the Chicklet Creator plugin
Generate a traditional sitemap with the Sitemap Generator plugin
Generate a Google sitemap with the Google Sitemap plugin
Create a Digg button plugin
Create a “Pagerfix” plugin, to add links to individual pages in the blog’s pagination
Add a robots.txt file to your blog and exclude some content that should not be indexed
Make the blog your home page and redirect /blog to / (if desired)
Many of these are optional, but you’ll want to implement at least some of them. This chapter tackles them one by one.
This appendix examines some basic aspects of constructing regular expressions. One reason for working through the simple regular expressions presented in this chapter is to illuminate the regular expressions used in Chapter 3 and further extend your knowledge of regular expressions.
The following exercises use OpenOffice.org Writer, a free document editor that makes it easy to apply regular expressions to text, and verify that they do what you expected. You can download this tool from http://www.openoffice.org.
This appendix has been “borrowed” from Beginning Regular Expressions (Wiley Publishing, Inc., 2005) by Andrew Watt. We recommend this book for further (and more comprehensive) reference into the world of regular expressions.
The examples used are necessarily simple, but by using regular expressions to match fairly simple text patterns, you should become increasingly familiar and comfortable with the use of foundational regular expression constructs that can be used to form part of more complex regular expressions. Other chapters explore additional regular expression constructs and address progressively more complex problems.
One of the issues this chapter explores in some detail is the situation where you want to match occurrences of characters other than those characters simply occurring once.
This chapter looks at the following:
How to match single characters
How to match optional characters
How to match characters that can occur an unbounded number of times, whether the characters of interest are optional or required
How to match characters that can occur a specified number of times
First, let’s look at the simplest situation: matching single characters.
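As a quick preview of the four cases listed above, the sketch below tries one pattern of each kind using PHP's preg_match(). The matches() wrapper and the sample patterns are assumptions chosen for this illustration; the appendix itself works through its exercises in OpenOffice.org Writer.

```php
<?php
// Hypothetical wrapper: true when the pattern matches the subject.
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

// One sample pattern for each case (all anchored to the whole string):
$singleCharacter = '/^c.t$/';     // "." stands for exactly one character
$optional        = '/^colou?r$/'; // "u?" means the u may appear zero or one time
$unbounded       = '/^go+gle$/';  // "o+" means one or more o characters
$specifiedCount  = '/^\d{3}$/';   // "\d{3}" means exactly three digits
```

For example, matches($optional, 'color') and matches($optional, 'colour') both succeed, whereas matches($specifiedCount, '12') fails because only two digits are present.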
Purchase Before purchasing this product, please be sure you have met all software and system requirements, and that you understand any limits placed upon its use.
Return Policy Wrox Chapters on Demand are non-returnable and non-refundable.
Reader Software Wrox Chapters on Demand are offered as PDFs, and they must be viewed using the Adobe Reader. If you do not have the Reader installed, it can be downloaded for free at Adobe.com.
Test Download As Wrox Chapters on Demand purchases are non-returnable, it is advisable that you test your system and software configurations with a free sample download before you place an order.
Usage Rights for a Wrox Chapter on Demand File Any Wrox Chapter on Demand product you purchase from this site will come with certain restrictions that allow Wiley to protect the copyrights of its products. After you purchase and download this title, you:
If you have any questions about these restrictions, you may contact Customer Care at (877) 762-2974 (8 a.m. - 5 p.m. EST, Monday - Friday). If you have any issues related to Technical Support, please contact us at 800-762-2974 (United States only) or 317-572-3994 (International) (8 a.m. - 8 p.m. EST, Monday - Friday).