Category: Office News

The Google Sandbox

You may have heard people speak of the Google Sandbox as a dreadful place to be avoided. But what is it, exactly, and how can you avoid going there?

Google’s goal is to return the best possible search results. They have a number of rules for webmasters that basically boil down to: don’t screw with the search. For example, Google will penalize your site if they catch you buying or selling paid links.

Suppose you have a brand new site, just registered within the last few months, and it has thousands of links to it. What’s the most likely explanation – that this site is so incredible all these people have found out about it already, or that someone is trying to manipulate the search engine results? Google will assume the second, and they’ll put the site in the sandbox, which generally means it’s not going to show up when someone does a search for the targeted keywords. Considering Google’s share of the search market, this is a Bad Thing.

While the sandbox can be triggered automatically, Google employs people who actually go out and look at sites. If you have a new site that’s been growing so quickly that it got put in the sandbox, but not so quickly that it couldn’t just be the greatest site ever built, eventually someone from Google will come out and see if the site really does have the content the links say it has. For example, if One Ear Productions was a new site and suddenly had thousands of links talking about our awesome web design, someone might come check that the site really is talking about web design and doesn’t just have some crappy computer-generated articles and a bunch of advertisements. Google isn’t going to decide whether this really is the greatest web design company ever or whether we really are offering extremely informative articles on web design;  they just want to know that this is a legitimate site and we’re not trying to jerk them around.

Cliff notes? Publish high-quality content. Don’t buy or sell links, don’t give Google a reason to think you’re doing that, and don’t put a link to your brand new site in the footer of your 10,000 page site. Otherwise, you may find yourself sitting in the sandbox all alone.

Website Usability, Part 1: Search

One issue that plagues both software and web development is adding in features that sound cool, but confuse the user. What makes perfect sense to the designer often doesn’t make quite as much sense to someone using the software or website for the first time! Let’s talk about a few things that contribute to usable websites. We’ll start with one of the simplest: search.

Placing the Search Box

These days, users expect to see a search box; this is particularly true for ecommerce sites. Don’t make them hunt for it! It should be located near the top of the page, easy to find, and obvious (users should know without thinking about it that this is your search box). This doesn’t just apply to the homepage, either; every page of the site should have the search box, in the same place. Additionally, the default should be to search the entire site, not just the section the user is in; you may opt to provide a checkbox to restrict the search to the current section. As with any rule, there are exceptions; you might, for example, choose to restrict a blog search to the contents of that blog, as we’ve done here.

Handling Search Queries

Users generally expect to see the same results regardless of how they capitalize the query; your search engine should be programmed to ignore case unless there is a good reason to do otherwise, in which case (no pun intended) you can make case-sensitivity a selectable option.

Users often don’t know exactly what term to search for; it is helpful if your search engine understands related terms. For example, a user who searches for “food” might be interested in results for “cooking”, even if the word food never actually appears on the cooking page. One option is to show users a list of related terms that they can search on, and to attempt to correct misspellings. Consider Google: when typing, it offers suggestions for what term you might want, and if you search for what it believes is a misspelled word, it offers the correct spelling.
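The two ideas above, ignoring case and matching related terms, can be sketched in a few lines of javascript. This is only an illustration: the related-terms table, the page list, and the function name searchPages are all made up for the example, and a real site search would be considerably more involved.

```javascript
// Hypothetical related-terms table; a real site would build its own.
const RELATED_TERMS = new Map([
  ["food", ["cooking", "recipes"]],
]);

// Return the pages whose text contains the query or a related term, ignoring case.
function searchPages(pages, query) {
  const term = query.trim().toLowerCase();
  const terms = [term, ...(RELATED_TERMS.get(term) || [])];
  return pages.filter((page) => {
    const text = page.text.toLowerCase();
    return terms.some((t) => text.includes(t));
  });
}

// Made-up sample pages for the demonstration.
const pages = [
  { url: "/cooking", text: "Cooking tips for busy weeknights" },
  { url: "/contact", text: "Get in touch with us" },
];

// "FOOD" never appears on the cooking page, but the related term "cooking" does.
const results = searchPages(pages, "FOOD");
console.log(results.map((page) => page.url));
```

Lowercasing both the query and the page text is what makes capitalization irrelevant; the related-terms lookup is what lets a search for “food” find the cooking page.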

Design for the Average User

Remember that most users won’t search for long-tail keywords or use boolean queries; they’ll type 2-3 words into the search box and expect to get reasonably relevant results. A good search engine will make it easy for the users to define exactly what they’re searching for, even when the users can’t find (or spell) the correct search terms.

Feel free to provide advanced search options for power users, but don’t let them clutter up the basic search and confuse your new users. As in many things, strive for simplicity; your users will appreciate it.

HTML5, Part V: Forms

When it comes to forms, one thing I hear a lot about is setting them up using PHP; HTML has basic form capability, but don’t you want them to do more?

With HTML5, forms can indeed do more, without the need for any fancy scripting. The best part is, all of the new  features degrade gracefully, which means you can use them now without spending a lot of time worrying about what they’ll do in legacy browsers.

Let’s start with the basic elements of an HTML form, which are the same in HTML 4 and 5:

<form>
<input name="name" type="type" value="value">
</form>

That’s all there is to it! Let’s take a look at each of these attributes. The name attribute identifies the input, which lets you refer to it later; this is particularly important when you’re using it to pass information to the next page. The type attribute tells the browser what type of input to use, such as a button or checkbox. Finally, the value attribute is the default value; for a button, this would be the displayed text. Each of these attributes is optional.

The nice thing is that when a browser doesn’t understand the given type, or the type is not given, it simply displays it as a text field rather than giving unpredictable behavior. This means we can use new HTML5 input types, knowing that people with older browsers will still have a way to provide input.

How about an example? Suppose I want you to choose an even number between 8 and 22, with a default of 12. I can do that using the following code:

<form> <input type="number" min="8" max="22" step="2" value="12"> </form>

If you’re using an HTML5-compliant browser such as Opera, this should show up as a spinbox that allows only the even numbers between 8 and 22, and the form will refuse to submit if the user enters an illegal value. In older browsers, it will show as a simple text field; a browser that doesn’t recognize the number type generally won’t enforce the limits either, so you should still check the value on the server. Changing the type to range applies the same limits to a slider control.
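The rule that field declares (even numbers from 8 to 22) can also be checked in script, which is handy as a safety net for browsers that render the field as plain text. A minimal sketch follows; the function name isAllowedNumber is made up for illustration, and it assumes whole-number steps like the one in the example.

```javascript
// Check a value against the rule the number field declares:
// min <= value <= max, and value is a whole number of steps above min.
// Assumes integer min, max, and step; fractional steps would need a tolerance.
function isAllowedNumber(rawValue, min, max, step) {
  const value = Number(rawValue);
  if (rawValue.trim() === "" || !Number.isFinite(value)) {
    return false; // empty or not a number at all
  }
  if (value < min || value > max) {
    return false; // outside the allowed range
  }
  return (value - min) % step === 0;
}

console.log(isAllowedNumber("12", 8, 22, 2)); // true
console.log(isAllowedNumber("13", 8, 22, 2)); // false: not a multiple of the step
console.log(isAllowedNumber("24", 8, 22, 2)); // false: above the maximum
```

The value arrives as a string because that is what a form field hands to a script.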

So why use these special types instead of just having the user type a number? For one thing, it can be optimized in various ways. If you’re viewing this page on an iPhone, you won’t get a spinbox, but your keyboard will default to numbers. A search field (another new type) may be functionally the same as a text box, but if you’re using Safari on a Mac, it’ll have rounded corners to look like the standard Mac search boxes; on both Safari and Chrome, once you start typing a little x will appear to erase the field. While browsers don’t all handle these types in the same way, telling them what type of data is expected lets them offer more appropriate data entry.

Aside from new types of input, you can also do new things with the old types. For example, one thing that drives me crazy is scripts that autofocus on an input box; when I check my email on the web I tend to have finished typing my username and be halfway through my password when the page finishes loading and moves the focus back to the username field. Now, instead of using javascript, we can stick with HTML and accomplish the same thing in a less annoying (and more consistent) way. Consider the following code:

<form> <input name="tellme" autofocus placeholder="Placeholder Text"> </form>

Depending on which browser you’re using, both, one, or none of the new attributes we’re using will take effect; whichever ones don’t will simply be ignored. Autofocus (predictably) sets the focus to this field, while placeholder text appears in light grey to tell you what you’re supposed to type:

Again, the nice thing is that this degrades gracefully, so it provides a better user experience for people with compatible browsers without annoying those people who don’t see it. (Of course, you may want to detect compatibility and use javascript to provide the same functionality so that more people can see your site as designed). Note that in the above example, we used autofocus and placeholder text together, which is kind of silly because putting focus on that window removes the placeholder text, but it will come back if you click on something else and remind you what needs to be typed there! Also, the autofocus brings this box into view as soon as you load the page, which means you have to scroll back up to get to the top of the article; not particularly good design, but it does demonstrate the power of this attribute.
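One common way to detect compatibility, as suggested in the parenthetical above, is to ask whether an input element exposes a property with the attribute’s name. Here is a minimal sketch; the function name supportsInputFeature is made up, and plain objects stand in for real elements so the example runs outside a browser.

```javascript
// Returns true when the given element object exposes the named property,
// which is how a browser signals that it understands the matching attribute.
function supportsInputFeature(inputElement, feature) {
  return feature in inputElement;
}

// In a browser you would pass a real element:
//   supportsInputFeature(document.createElement("input"), "placeholder")
// The plain objects below are stand-ins for a newer and an older browser.
const modernInput = { placeholder: "", autofocus: false };
const legacyInput = {};

console.log(supportsInputFeature(modernInput, "placeholder")); // true
console.log(supportsInputFeature(legacyInput, "placeholder")); // false
```

When the check comes back false, that is the point where a script could step in and imitate the missing behavior.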

HTML 5, Part IV: Canvas

One of the hot new elements in HTML5, canvas allows you to draw graphics, generally using javascript. Remember that the canvas tag is not supported by IE8, so you’ll need to provide fallback content, like we did in the HTML5 video section, for IE users.

<canvas id="id" width="x" height="y"> </canvas>

The width and height attributes, shown above, are both optional; unless specified they default to 300 pixels wide by 150 pixels high. You can set these using DOM properties or resize the element using CSS.
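Because a canvas only becomes useful once a script draws on it, here is a minimal sketch of that step. The function name drawSquare and the recording stand-in are made up for illustration; fillStyle and fillRect are part of the standard 2D drawing context.

```javascript
// Draw a filled square on a 2D drawing context.
function drawSquare(context, x, y, size, color) {
  context.fillStyle = color;
  context.fillRect(x, y, size, size);
}

// In a browser (the element id "id" matches the tag above):
//   const canvas = document.getElementById("id");
//   drawSquare(canvas.getContext("2d"), 10, 10, 50, "blue");

// Stand-in context that records calls, so the sketch runs outside a browser.
const recordingContext = {
  fillStyle: "",
  calls: [],
  fillRect(x, y, width, height) {
    this.calls.push({ x, y, width, height, color: this.fillStyle });
  },
};

drawSquare(recordingContext, 10, 10, 50, "blue");
console.log(recordingContext.calls.length); // 1
```

Passing the context in as a parameter keeps the drawing code separate from the page lookup, which also makes it easy to try out without a browser.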

The canvas tag has not been implemented consistently between browsers. Notice that this isn’t a self-closing tag; browsers that support canvas ignore anything between the two tags, so that is where you put replacement content for browsers that do not support it. Leaving that space blank means that older browsers simply won’t display anything. Safari does not actually require the canvas end tag, but Mozilla does; thus the above style works for both as Safari simply ignores the </canvas>.

The WHATWG page has full documentation for this tag; unlike most other HTML, it requires some scripting to be useful. Rather than fully explore the tag, I thought I’d link to a few interesting sites that use it.

The HTML5 Canvas and Audio Experiment puts on an interesting light show (with sound, if your browser supports the <audio> tag). Notice the fallback text inside the audio tag in the page source, displaying an error message for browsers that don’t support it.

ball game – click the set of two or more balls of the same color to remove them. Can you get them all?

You can even use canvas to play a simplified game of Super Mario Kart!

SEO, Part V: Pagerank Doesn’t Matter

Wait, how can I tell you that pagerank doesn’t matter? Wasn’t I just talking about the importance of getting links from high-ranking sites? Didn’t I say that pagerank is a measure of how important your site is?

The above is all true. You do need to get links from high-ranking sites, and Google does use PR as a measure of usefulness. What it doesn’t tell you, however, is how relevant the page is. Let me explain.

Suppose you do a search for “how to shampoo the dog”. What would be more useful to you: a PR0 site with detailed dog-washing instructions, or a PR6 site selling dog shampoo? Obviously the former is a lot more useful to you, because it’s relevant to your query.

Similarly, while high pagerank is nice to have, your users don’t really care about it; they just want your site to give them what they need. For your part, you just want your site to come up at the top of the SERPs (Search Engine Results Pages) so your customers can find you. Thus, you want to get links from high-ranking, relevant sites because they count as a strong vote that your page is useful, provided they’re done right.

Next question: suppose you’re the owner of that webpage on how to shampoo a dog. Which would be more useful to you: a link from a PR2 page using the anchor text (that’s the words you click on) “how to shampoo your dog” or one from a PR6 page with the anchor text “click here”? Again, the former is more helpful; while the latter is better for increasing your pagerank, the anchor text only helps you to rank well for the term “click here” (which Adobe dominates), while the former tells Google what your site is about and helps them to return relevant results.

So do you care about getting a high PR on your sites? Well, yes – more pagerank is always preferable to less pagerank, and having a high-ranking site also speaks to the usability of new pages on that site that don’t yet have their own backlinks. Just remember: relevance trumps numbers!

In fact, Google has explicitly said that the PageRank system is one of more than 200 signals they use to index and rank pages. Remember, what they’re trying to do is find relevant, high-quality content; everything else is just a means to that end.

SEO, Part IV: More on Links

In SEO Part II: Links, we discussed the importance of incoming links from high-ranking sites in the same area as yours. What other factors influence the value of incoming links?

Aside from relative popularity, some sites are considered by Google to be particularly authoritative; Wikipedia is one example, but there are others. Pages on Wikipedia can outrank pages on other sites that have a lot more useful information, due to the site’s importance. A vote from these “expert sites” is thus worth more than most. You can read more about this by Googling ‘Hilltop Algorithm’.

Other good sites to get backlinks from are those with .edu or .gov TLDs. While these pages don’t actually get a pagerank boost (at least, so says Google), they do tend to be considered more authoritative, so backlinks from them increase Google’s estimation of your site’s trustworthiness.

The value of links from web directories varies; generally you want to get links from human-created directories that are well organized and link to other good sites, while avoiding those that are automatically created. Of course, you’ll also want to check that the links aren’t using the nofollow attribute, so they pass along link juice.

As previously mentioned, it’s helpful to be one of just a few links rather than one of thousands, so as to get the largest possible share of link juice. Also, remember that Google considers content closer to the top of the page to be more important; thus, it’s good if your link shows up close to the beginning of the HTML file (which is not the same as showing up at the top of the page when displayed – see the note on CSS in that last link!)

It can be discouraging to go to a lot of effort getting new links without seeing a corresponding increase in your pagerank, but remember that, like sites, links need to “age”; a backlink that’s been around for a while is worth a lot more than a new link that could be gone tomorrow. Slowly building links over time tends to lead to a gradual increase in the pagerank of your site.  But be sure to read part five about when pagerank doesn’t matter!

SEO, Part III: What Comes First?

When Google indexes a page on your site, it assumes that whatever is at the top of the page is most important. So what’s at the top of the page? Probably your navigation bar!

Obviously, we’re more interested in having Google index the page content than the page navigation, but on the other hand, the navigation bar is important for users. How do we resolve this?

Remember when I said that your site generally shouldn’t look different to users and search engines? One exception is when you’re displaying the same content, but they read that content differently. Enter CSS!

Suppose your content is first in the html file, followed by your navigation, but your CSS style forces the navigation bar to be rendered at the top of the page. Your users see the page as you intended; Google, however, ignores the style and parses the page in the order it appears in your HTML file. Everybody wins!

To make this work, your HTML body will look something like this:

<body>
<div id="content">Your important stuff goes here</div>
<div id="navigation">Your nav bar goes here</div>
</body>

Meanwhile, your CSS style will define where each div goes in absolute terms:

#navigation {
position: absolute;
top: 10px;
/* other alignment code */
}

#content {
position: absolute;
top: 60px; /* 10px plus the height of the nav bar, assumed here to be 50px */
/* other alignment code */
}

Congratulations! You now have the layout you want, while still putting your most important content where Google wants to see it!

Next time: more on keywords.

SEO, Part II: Links

So now that we’ve mentioned a few of the things you shouldn’t do, what are some search-engine-approved ways of driving traffic to your site? We’ll focus on Google, since they have the majority of the search market in western countries, but most of what we’ll cover applies to other search engines as well. Today we’ll talk about links and their effect on how Google views your website.

Everyone knows that incoming links are key, but what people don’t always realize is that the quality of those links is also important. Let’s take a quick look at the Google pagerank algorithm.

Every web page indexed by Google has an ever-changing rank between 0 and 10 that indicates how important, useful, and trustworthy it is; of the top 100 US sites according to Alexa, the average pagerank is about 7.5, and with the exception of the adult sites, all have PR at least 6; on the entire internet, there are fewer than two dozen sites that have a PR of 10. (By the site’s PR, we mean the PR of the home page; the PR10 sites generally have multiple pages with that rank). Pagerank uses a logarithmic scale, so as the value decreases, the number of websites with that value increases exponentially. Note that the integer pagerank is only an approximation and can change whenever Google does a PR update, which generally happens at least once a quarter.

Google considers each link from a webpage (except those with the nofollow attribute) to be a “vote” for its destination; the total PR of the page is divided by the number of dofollow links. For example, if a PR6 page has 30 dofollow links, that page contributes a 0.2 “vote” to each page it links to.
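The arithmetic in that example is simple enough to write down. This sketch only mirrors the simplified model described above (a page’s PR split evenly across its dofollow links); it is not Google’s actual calculation, and the function name votePerLink is made up.

```javascript
// Simplified model from the paragraph above: a page's PR is split evenly
// across its dofollow links.
function votePerLink(pageRank, dofollowLinks) {
  if (dofollowLinks <= 0) {
    return 0; // no dofollow links, so nothing is passed along
  }
  return pageRank / dofollowLinks;
}

console.log(votePerLink(6, 30)); // 0.2
console.log(votePerLink(6, 3)); // 2
```

The second call shows why being one of a few links matters: the same PR6 page with only 3 dofollow links passes along ten times as much per link.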

People often try to abuse this system by buying hundreds or thousands of links, a practice frowned upon by Google; their official policy is that sponsored links should use the nofollow attribute. “Organic” links, however, can be very valuable. Let’s look at what does and doesn’t work.

Obviously, links from PR0 sites, particularly those that have a lot of outgoing links, aren’t worth much; ideally you want to have links from higher PR (4 or above) sites that don’t have many other links on the same page. Additionally, links are worth more if they come from a related site; if your site is about X and you get a link from a highly-rated site that is also about X, that link is worth more because Google considers the other site to be an authority on the subject. This can be a problem for ecommerce sites because related sites are likely to be your competitors and their owners won’t be interested in helping to increase your pagerank!

One thing Google watches for is the speed at which links are acquired; a site that goes from 8 incoming links to 20,000 incoming links overnight probably is paying for those links, so they’ll be discounted. Again, Google wants to see “organic” links – those that come about naturally because your site is useful to users of the referring site – and tries to reward them. Google also looks at the “neighborhood” your links are coming from; links from spammy sites tend to be discounted.

Another thing to watch for is the anchor text used for your link; a link of the form <a href="yoursite">relevant text</a> is worth a lot more than one that looks like <a href="yoursite">click here</a> (although Google does have algorithms in place to try to catch Googlebombs). Of course, if every incoming link to your site uses the same anchor text, that’s another sign Google can use to catch paid links; it’s also a wasted opportunity since you want to rank highly for multiple ways to phrase the same thing. Notice that webpages can even rank highly for terms that don’t appear on them (as in the George Bush / miserable failure Googlebomb) if enough links use that phrase; for example, searching Google for the phrase “click here” will bring up the page to download Adobe Reader, due to the many, many sites that tell their users to “click here” to download it.

More recently, Google also started looking at a site’s outgoing links; a site that mostly links to trustworthy sites such as wikipedia is likely to be regarded more highly than one that links to spammy sites. In other words, Google judges you by the company you keep!

Tomorrow: how to code your webpages to make Google happy.

HTML5, Part I: Video

I had originally planned to finish off my Web 2.0 Overview before getting into the specifics of each technology, but yesterday’s introduction to HTML5 got a lot of attention, so we’ll move it up a bit.

HTML5 supersedes HTML 4.0, XHTML 1.0, and XHTML 1.1; it provides new tags for handling many common web design elements that are currently handled with third party applications such as Flash, and standardizes elements that have never been formally documented.

What do you need to get started?
The latest versions of Chrome, Firefox, Opera, and Safari all support some but not all HTML5 features; Internet Explorer 8 has a few features and IE 9 is expected to add more. As always, when designing for Internet Explorer, be particularly careful about elements that compete with Microsoft’s own offerings – for example, IE 8 does not support the canvas tag, which would replace Microsoft’s Silverlight technology. In this series, I’ll attempt to focus on those elements that are supported by all of the major browsers, or at least mention the exceptions.

Show me the video!
Let’s start with how to embed video in a webpage. Traditionally this has been done with a third-party plugin such as QuickTime, RealPlayer, or Flash; now we have a standards-based method that should be supported on all platforms. Currently the <video> tag works in Chrome, Firefox 3.5, IE 9, Opera, and Safari 3; however, each browser supports different codecs and containers. Unfortunately, there is no one container/codec combo that works for all of these browsers, so if you want your video to show up for everyone using an HTML5 browser, you’ll need to encode it multiple times. (Converting video into the correct formats is one service a professional website designer will provide.) Fortunately, HTML5 allows you to specify several videos in one video tag, leaving it to the browser to choose which one to display. To have the video display on all browsers, you’ll need the following encodings:

  • Theora video and Vorbis audio in an Ogg container
  • H.264 video and AAC audio in an MP4 container
  • VP8 video and Vorbis audio in a WebM container

Finally, to accommodate browsers that do not support the <video> tag, you’ll want to fall back to a Flash video.

So give me the tag details, already!
When you only have one video file, the tag works pretty much as expected; however, it has a number of optional (but desirable) attributes:

<video src="videoname" width="XXX" height="YYY" controls> </video>

So what’s all this? The src part, width, and height should be fairly self-explanatory;  width and height are optional but should be used anyway. By default, the video tag does not offer the user any type of control over the video; the controls option adds a built-in set. Alternatively, you can write your own; the video element has methods play() and pause() and read/write properties volume, muted, and currentTime.
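If you do write your own controls, a play/pause button is the usual starting point. Here is a minimal sketch built on the play() and pause() methods mentioned above; it also reads the element’s standard paused property, which the text doesn’t mention. The function name togglePlayback and the stand-in object are made up so the example runs outside a browser.

```javascript
// Toggle playback on a video element using its play() and pause() methods.
function togglePlayback(video) {
  if (video.paused) {
    video.play();
  } else {
    video.pause();
  }
}

// In a browser you would wire this to a button of your own:
//   button.addEventListener("click", () => togglePlayback(document.querySelector("video")));

// Stand-in object that imitates the relevant parts of a video element.
const fakeVideo = {
  paused: true,
  play() { this.paused = false; },
  pause() { this.paused = true; },
};

togglePlayback(fakeVideo);
console.log(fakeVideo.paused); // false: the video is now playing
togglePlayback(fakeVideo);
console.log(fakeVideo.paused); // true: paused again
```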

If you expect every visitor to watch the video,  the preload option will start it downloading as soon as the page loads; setting preload=”none” will tell the browser not to load the video until it’s requested. Finally, the autoplay option does exactly what you would expect; it downloads the video and starts it playing as soon as possible.

Several videos, one tag
What if you went ahead and made three encodings of your video so everyone using an HTML5 browser can see it? In that case, your code will look something like this:

<video width="XXX" height="YYY" controls>
<source src="video.mp4" type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
<source src="video.webm" type='video/webm; codecs="vp8, vorbis"'>
<source src="video.ogv" type='video/ogg; codecs="theora, vorbis"'>
<object width="XXX" height="YYY" type="application/x-shockwave-flash" data="__FLASH__.SWF">
<param name="movie" value="__FLASH__.SWF" />
<param name="flashvars" value="image=__POSTER__.JPG&amp;file=__VIDEO__.MP4" />
</object>
</video>

Do you really need all that? Well, no…but if you don’t specify exactly how each video is encoded, the browser will find out whether it can play a video by downloading it and trying to play it, wasting your bandwidth and the user’s time. Better to put in a little extra effort beforehand! The object tags will be ignored by HTML5 browsers (as will any non-source tags inside the video tag) but will cause older browsers to display the flash movie.
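The check the browser makes against each type attribute can also be made from script, using the video element’s canPlayType() method, which returns an empty string when the browser can’t play the given type. Here is a minimal sketch; the function name firstPlayableSource and the stand-in object are made up, and the source list simply repeats the three encodings from the tag above.

```javascript
// Return the src of the first source whose type the video element says it can play,
// or null when none of them is playable.
function firstPlayableSource(video, sources) {
  for (const source of sources) {
    if (video.canPlayType(source.type) !== "") {
      return source.src;
    }
  }
  return null;
}

const sources = [
  { src: "video.mp4", type: 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"' },
  { src: "video.webm", type: 'video/webm; codecs="vp8, vorbis"' },
  { src: "video.ogv", type: 'video/ogg; codecs="theora, vorbis"' },
];

// Stand-in for a browser that only plays WebM; a real page would pass
// document.createElement("video") instead.
const webmOnlyVideo = {
  canPlayType(type) {
    return type.startsWith("video/webm") ? "probably" : "";
  },
};

console.log(firstPlayableSource(webmOnlyVideo, sources)); // video.webm
```

As with the source tags themselves, the order of the list decides which playable file wins.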

Side note: the MP4 file format should be listed first, as the iPad only notices the first source listed and will fail to play your video otherwise.

Acknowledgements:
Video for Everybody offers a more in-depth discussion of setting up your videos to be readable by everyone.

HTML5: Up and Running (due out next month from O’Reilly) contains a fascinating discussion of the history of HTML and instructions on encoding video.

Web 2.0, Part I: Overview

You’ve probably heard the term web 2.0, which refers to more interactive websites (although no single definition exists).  Rather than just browsing static webpages, users interact with and even change them; this is good for the site owner, because not only are interactive websites more likely to keep users coming back, users can actually add value to the site.

A number of tools have been developed that are particularly useful for building web 2.0 sites; today we’ll take a brief look at a few of them.

AJAX, which stands for Asynchronous Javascript and XML, allows for faster user interaction.  Simple processing is done on the client side in javascript without having to wait for the data to be routed to the server and back; in the meantime, data that does need to be transferred is formatted with XML. AJAX allows you to do things like updating parts of a page as new information becomes available, without ever reloading the entire page.

Flash, of course, is still very popular in spite of not being supported on the iPad.  Many flash applications are now written using Adobe Flex.  Flash is frequently used to add animation, video, and interactivity (particularly advertisements and games) to websites.

Much of the functionality of flash is now being provided in HTML5; while it is still technically in draft status, major browsers are already being updated to be compatible with the new standards.  HTML5 has the advantage over other technologies in that it should (soon) be supported on all systems, whereas Flash and Javascript pages will not be available to all users.

Blogs, of course, are a big part of the more interactive web; we covered WordPress last week, but there are a number of other options as well. Social networking (e.g., Facebook) is huge, and users also want new ways to consume your content (e.g., RSS). PHP has long been used for coding interactive sites, particularly in combination with MySQL databases. CSS, of course, allows you to customize the look of your content without having to do any extra work on the content itself; see the Zen Garden for a nice example of what CSS can do.

In the future, we’ll be providing a more in-depth look at each of these technologies, as well as talking a bit about when you might want to use them.