May 11, 2006 Meeting Minutes
Topic: Search Engine Optimization
Facilitator: Sheila Campbell, USA.gov
Guest speaker: Bob Keating, GSA
Attendees: Approximately 80 people on the call
News
Sheila Campbell mentioned that this semester’s Web Manager University is going very well. Space is still available for all classes, but register soon since classes are filling up. Go to Webcontent.gov for all the details.
Search Engine Optimization
Our guest speaker was Bob Keating, Program Manager for the U.S. government’s official search engine, USA.gov Search. Bob gave a presentation about search engine optimization and talked about the latest methods for getting found on major search engines such as Google, Yahoo, and MSN.
You may download Bob’s presentation.
Search Visibility Factors
- Keywords: Title and body tags are the most important places to put keywords (words your visitors are searching for).
- URLs: Putting keywords in URLs is less important because search engines don’t use them as a major factor in giving you better rankings. That’s because, in the past, webmasters used to try to “game” the system by putting long and multiple keywords in URLs.
- Myths about Metadata: Search relevancy is based on the content on a page and link popularity – not on what the page author has to say about the content. So metadata is not the key to high rankings on commercial search engines. Applying metadata can be very time consuming, so you should only do it if it’s important for your agency (it can be useful for your own agency’s search engine and for tracking and managing content internally). Concentrate on using targeted keywords on each content page. Bob clarified that when he’s talking about “keywords” he means words in the body text of an individual content page – not the “keyword” metadata tag.
- Keywords in Links: Use keywords in links to improve the relevancy of the page to which you’re linking. Give the link text that describes the destination page as much prominence as possible.
- Popularity: this is a key factor in getting ranked in search engines. Popularity is based on the number of pages on the Web that link to you. It’s not based on any customer satisfaction ratings (like ACSI), or how “good” your site is. Search engines use different algorithms for measuring popularity. Google’s is called “Page Rank.”
- Determining Your Popularity: On Google, you can check the total number of links to your site, which can give you an idea of your relative popularity. To check, type “link: your URL” (you need a space after the colon). However, the total that Google reports is really only a sample. Google doesn’t provide the actual number since this is proprietary information and could give away information about how it determines Page Rank.
- Directories: To improve search engines’ ability to crawl your site, it’s a good idea to get listed on major Web directories. Search engines tend to start their searches there, so you want to be listed there. Bob mentioned an experiment he did with GovLoans.gov, which, at the time, was listed on the second page of search results on Google. After he listed it in a major search directory, it went to the #1 listing on Google.
- DMOZ Directory (Open Directory Project): This is one of the most important directories. The best way to get listed (or know if you’re already listed) is to submit your site to the directory and then contact the editor of the category (or categories) to which you submitted your site. If you can’t find an editor in a particular category, go to the editor at the next level up in the category hierarchy. The web address is: www.DMOZ.org
- Outbound Links: Your popularity isn’t determined by the sites you link to, so increasing the number of “outbound links” on your site won’t necessarily improve your search rankings. The main benefit of outbound links is that they help the visibility and rankings of the sites you link to. However, there is debate and controversy within the search community on this subject. Some argue that outbound links can improve your rankings because search engines may identify you as a “hub.” For example, it’s possible that search engines might consider you an important site if you link to other good sites. But no one knows for sure.
- Robots Exclusion: this is used to exclude content you don’t want people to see or that you don’t want indexed (for example, some scripts, forms, images, or pages under construction). But you need to specifically tell search crawlers which areas to stay out of. The downside is that not all search engines pay attention to Robots files. It’s more of a gentleman’s agreement. MSN and Yahoo generally do, but Google sometimes doesn’t.
- Case Study: Bob used an example from the NOAA Fisheries website to show how keyword prominence and density can affect your rankings. One problem they encountered was in using a “flyout” menu (mouseover) for their sub-navigation. This kind of navigation makes a site very difficult for search engine crawlers to crawl. They also found that they needed a link to a site map from every page so the engine can more easily and efficiently crawl their site.
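As an illustration of the Robots Exclusion item above (this was not part of Bob’s presentation), the sketch below shows how a well-behaved crawler might read a robots.txt file, using Python’s standard urllib.robotparser module. The file contents, the www.example.gov host, and the “ExampleBot” crawler name are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: keep all crawlers out of two areas.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse the rules without fetching anything

# A polite crawler asks before requesting each URL.
blocked = parser.can_fetch("ExampleBot", "http://www.example.gov/drafts/new-page.html")
allowed = parser.can_fetch("ExampleBot", "http://www.example.gov/index.html")

print("drafts page may be crawled:", blocked)
print("home page may be crawled:", allowed)
```

The check is made by the crawler, not enforced by the server, which is why the file only works for crawlers that choose to honor it.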
Basics of Search Engine Friendly Design
- Types of pages: you might have different types of pages and each one might be optimized differently. For example, an e-commerce page might be optimized for search differently than an informational page. An e-commerce site might want to submit its content to Froogle, which is Google’s shopping search engine.
- Choosing Keywords: When writing a web page and choosing keywords, use the language of your audience. Don’t use internal agency lingo; the government speaks a whole different language. Two tools to help you select the best keywords are Overture and Google AdWords. They won’t tell you the exact number of searches for a particular word. They’re tools that are geared toward marketing people who want to know which words to bid on when purchasing words on commercial search engines. If advertisers are willing to pay a certain amount for a particular keyword, that word is probably more popular than other words. So it’s a good way to gauge the popularity of a particular keyword over a similar one. For example, the tools will help you see that “real estate” is bid on more often than “real property” and that “hurricane relief” is more commonly used than “hurricane recovery.”
- Page Content: When writing page content, keep it focused and topic-oriented. Put keywords in title tags, etc. Make sure that any internal links you have to a particular page use the same key words that are in the body of the text of the destination page.
- JavaScript: Body text should be visible to the user and accessible to the crawler. Users shouldn’t have to use JavaScript to see the body text. Crawlers have a hard time reading text in JavaScript.
- Primary vs. Secondary Text: Distinguish between primary and secondary text. Search engines care more about primary text (title tags, body text, etc.) than secondary text (description tags, URL, etc.).
- Syndicated Content: Using syndicated content is a great way to pull content into your site. For example, Export.gov shares data with USDA and they each publish data on their webpages, but for different audiences. This type of approach can increase the visibility of your site and increase inbound links. Using RSS feeds is one great way to do this. Take press releases and syndicate them.
Topix.net is a news “bot” that goes around government websites and pulls government news feeds. Taking advantage of this service could help bring more traffic to your website and increase inbound links. There are other news aggregators, such as Google News. FYI, USA.gov (formerly FirstGov.gov) will soon be taking RSS feeds from across government (including audio and video/podcasts) and integrating them into a new “News” search.
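To illustrate the press-release syndication idea above, here is a minimal sketch that builds an RSS 2.0 feed with Python’s standard library. The agency name, URLs, and press release titles are placeholders invented for the example, and a real feed would likely carry more fields (publication dates, unique IDs).

```python
import xml.etree.ElementTree as ET


def build_rss(channel_title, channel_link, channel_description, releases):
    """Return an RSS 2.0 document (as a string) with one item per press release."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # title, link, and description are the required channel elements in RSS 2.0.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_description
    for release in releases:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = release["title"]
        ET.SubElement(item, "link").text = release["link"]
        ET.SubElement(item, "description").text = release["summary"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical press releases for a made-up agency site.
releases = [
    {"title": "Agency Announces Hurricane Relief Grants",
     "link": "http://www.example.gov/news/hurricane-relief-grants.html",
     "summary": "New grants are available for hurricane relief."},
    {"title": "Agency Publishes Annual Report",
     "link": "http://www.example.gov/news/annual-report.html",
     "summary": "The annual report is now online."},
]

feed_xml = build_rss("Example Agency Press Releases",
                     "http://www.example.gov/news/",
                     "Press releases from the Example Agency",
                     releases)
print(feed_xml)
```

Note that the item titles use the same audience-friendly keywords (for example, “hurricane relief”) recommended for page content.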
Site Architecture
- Text Links: These are very search-engine friendly. But the downside is that they can skew keyword density if overused on a page since crawlers tend to read text links before body text. For example, you could skew your keyword density if you use the same navigation on every page, because crawlers see those links first. One solution to this problem is to have two navigation schemes if you’re using JavaScript -- one for crawlers and one for users.
- Site Maps: A site map won’t increase your popularity, but it will enhance crawlers’ ability to access and crawl all the relevant pages on your site. It also helps you analyze your site’s information architecture. There’s no particular recommended format for a good site map. The most important thing is that it has to enable a crawler to get to your important sub-pages. And make sure you use the same keyword on your site map as on the page it links to.
- Google Site Maps: this is an additional tool to enable and assist the Google crawler in better crawling your site. You submit your URL to Google, and they will look at your navigation structure to help crawl your site better.
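The minutes don’t describe a file format for Google Site Maps. As a rough sketch, the code below assumes the XML layout of the Sitemaps protocol (the sitemaps.org 0.9 schema) and lists a few pages so a crawler can reach important sub-pages directly. The URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace of the Sitemaps protocol (assumed format for this sketch).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages on a made-up agency site.
pages = [
    "http://www.example.gov/",
    "http://www.example.gov/loans/home-loans.html",
    "http://www.example.gov/loans/student-loans.html",
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

This machine-readable file complements, rather than replaces, the human-readable site map page linked from every page.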
Design Considerations
- Redesigns: People often redesign and then worry about search afterwards. This is a mistake. If you undertake a redesign or other major change, make sure your redesign team has SEO experience. You don’t want to add too many bells and whistles that will affect SEO. You can lose page rank (popularity) as a result of a poor redesign and poor planning – which can take months to get back.
- Flash: This can hurt search visibility and is problematic. If you must use it, make sure you have a “skip link” option so crawlers can index a real page. If you use Flash, it’s important to include title and description meta tags. It’s OK to do some Flash as long as crawlers have the relevant text to link to.
- JavaScript: Crawlers don’t read text embedded in JavaScript. If you have JavaScript within a page, it can increase download time for crawlers.
- Download times: This can affect how search engines crawl your site. How long is too long? It varies. The “patience” of crawlers is different.
- Frames: Don’t do frames if you can avoid them. Frames don’t provide keywords in any context for crawlers to index. See Bob’s presentation for some possible work-arounds.
- Dynamic pages: As these have become more popular, search engines are becoming more sophisticated in learning to crawl them. But there are still considerable issues. The problem is that content is coming from a database, so the crawler is unsure of unique pages. The more parameters you have, the harder time crawlers will have differentiating among pages. The best solution is to create static HTML pages (some web content management systems will allow you to serve up static HTML pages to a crawler). You can also modify URLs so crawlers will understand what they’re looking for (Apache can do some of this).
- Session ID: Crawlers will ignore web pages with session IDs, so avoid them. There’s a way to omit them for search engines, but this is considered “cloaking,” a technique that involves serving one page to a crawler and a different one to users. This is the “shady” side of the Web that is most typically practiced by online gambling sites, adult web sites, and others. There are more ethical solutions to the issue of session IDs.
- PDFs: same rules for keywords and phrases -- those words are most important in headlines, etc. Because these documents still pose problems (including long download times), it’s best to put documents in HTML rather than PDF whenever possible. Or create an intermediate HTML page that has an abstract of your PDF. Documents greater than 100K cause the greatest problems for search crawlers -- keep under that size.
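One way to act on the dynamic-page, session ID, and PDF items above is to audit your own URLs before a crawler sees them. The sketch below is only an illustration: the parameter names treated as session IDs and the sample URLs are assumptions, and it inspects the URL text only (it does not download anything or check file sizes).

```python
from urllib.parse import urlsplit, parse_qsl

# Assumed names for session ID parameters; adjust to match your own system.
SESSION_PARAM_NAMES = {"sid", "sessionid", "phpsessid"}


def audit_url(url):
    """Report URL features that the notes above say can trouble crawlers."""
    parts = urlsplit(url)
    names = [name.lower() for name, _ in parse_qsl(parts.query, keep_blank_values=True)]
    return {
        "url": url,
        "param_count": len(names),  # more parameters: harder to tell pages apart
        "has_session_id": any(name in SESSION_PARAM_NAMES for name in names),
        "is_pdf": parts.path.lower().endswith(".pdf"),
    }


# Hypothetical URLs on a made-up agency site.
sample_urls = [
    "http://www.example.gov/loans/home-loans.html",
    "http://www.example.gov/search?type=home&state=VA&sid=abc123",
    "http://www.example.gov/reports/annual-report.pdf",
]

for report in map(audit_url, sample_urls):
    print(report)
```

Pages flagged here are candidates for static HTML versions, cleaner URLs, or an intermediate HTML abstract page, as described above.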
Managing Page Rank
- Linking Strategy: You should have a link strategy / policy since who links to you is a critical component and affects your page rank. It’s like someone’s giving you a vote. When you link to another page, you’re giving it a “vote.” [Note that “Page Rank” is Google’s particular popularity rating; other engines call their ratings something different. But “page rank” has, by default, become the generic term that most people use.]
- Passing / Leaking Page Rank (PR): You can pass PR to other pages on your site. But it’s not a one-for-one deal. For example, if a page with a page rank of 5 has 3 links to 3 subpages, that usually doesn’t mean that each of those subpages gets a page rank of 5. You have one vote and you have to split it 3 ways.
There’s a big debate now about whether you can “leak page rank” out of your site. An example would be if you link to another agency, would you be “leaking” your possible points to them? Some say yes, some no. But the practical approach is to say (for now) that both are correct. One view is that if it were true that any site that linked to another would be penalized and have PR “leaked,” no one would link to each other.
- Internal Links: If you’re linking to an internal page, is it better to take users to a “portal” page that might be more popular than to deep link? No, not really. When you deep link, you’re helping your specific page but not taking PR away from a more popular page.
- Getting Links to Your Site: The bottom line is that SEO is all about the sites that link to you. So you need to have a good marketing strategy. Linking to sites with low PR won’t affect you. It’s easy to get obsessed about SEO since there’s still a lot of mystery. But if you write good content, that will have the greatest positive impact. Don’t lose the real purpose, which is to serve your customers, get people to use your site, and help them find what they’re looking for.
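The “one vote, split 3 ways” idea in the Passing / Leaking Page Rank item can be sketched with a simplified, textbook-style version of the PageRank calculation. This is not Google’s actual (proprietary) method: the damping value of 0.85, the number of iterations, and the tiny four-page site are all assumptions chosen for illustration, and the scores are fractions that sum to 1 rather than the 0 to 10 values people usually quote.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: each page splits its vote evenly among its links.

    `links` maps every page name to the list of pages it links to.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no links spreads its vote over every page.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical site: a home page linking to 3 subpages, each linking back.
site = {
    "home": ["a", "b", "c"],
    "a": ["home"],
    "b": ["home"],
    "c": ["home"],
}

ranks = pagerank(site)
for page, score in ranks.items():
    print(page, round(score, 3))
```

In this toy site the three subpages end up with equal scores, each lower than the home page’s, because the home page’s single vote is divided among them.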
Resources
- Get involved in industry forums and read current books. Many of the resources listed in Bob’s presentation cover topics that are still relevant. Even though technology changes, some things will always be true of search engines.
Next Forum call
Time: Thursday, June 22, 2006, at 11 am EDT
Topic: TBD
(NOTE: We usually have the conference calls on the third Thursday of every month, but we moved it to the fourth Thursday to avoid a conflict with a web manager conference that week).
Last Updated: January 12, 2007
