An Interview on SEOBook

September 8th, 2010 randfish Posted in Uncategorized No Comments »

Posted by randfish

Just a short post tonight.

First off, I'm honored to be interviewed by Aaron Wall. We've had our differences and maintain some divergent opinions on a few topics, but we both have an insane passion for helping make SEO professionals better at their jobs and work hard to grow the credibility of SEO as a whole.

SEOBook Interview

Second - we've got a lot of reasons to be thankful. SEOmoz was recently named the 334th fastest growing company in the US by Inc Magazine. I was named to Seattle's 40 Under 40 List (I'm guessing it's a typo) and we've recently passed 6,000 PRO subscribers (actually, we're up over 6,300 as of today).

SEOmoz's Jen Lopez as Wonder Woman

As amazing as all that is, nearly everyone at SEOmoz is thinking not about these milestones, but about one of our own - Jen Lopez - who noted on her Twitter feed that she's out battling cancer. We are all with you Jen - every last one of us, with all our hearts. And we agree: #fuckcancer



Latent Dirichlet Allocation (LDA) and Google’s Rankings are Remarkably Well Correlated

September 6th, 2010 randfish Posted in Uncategorized No Comments »

Posted by randfish

Last week at our annual mozinar, Ben Hendrickson gave a talk on a unique methodology for improving SEO. The reception was overwhelming - I've never previously been part of a professional event where thunderous applause broke out not once but multiple times in the midst of a speaker's remarks.

Ben Hendrickson speaking last fall at the Distilled/SEOmoz PRO Training London
(he'll be returning this year)

I doubt I can recreate the energy and excitement of the 320-person filled room that day, but my goal in this post is to help explain the concepts of topic modeling, vector space models as they relate to information retrieval and the work we've done on LDA (Latent Dirichlet Allocation). I'll also try to explain the relationship and potential applications to the practice of SEO.

A Request: Curiously, prior to the release of this post and our research publicly, there have been a number of negative remarks and criticisms from several folks in the search community suggesting that LDA (or topic modeling in general) is definitively not used by the search engines. We think there's a lot of evidence to suggest engines do use these, but we'd be excited to see contradicting evidence presented. If you have such work, please do publish!

The Search Rankings Pie Chart

Many of us are likely familiar with the ranking factors survey SEOmoz conducts every two years (we'll have another one next year and I expect some exciting/interesting differences). Of course, we know that this aggregation of opinion is likely missing out on many factors and may over or under-emphasize the ones it does show.

Here's an illustration I created for a presentation recently to help illustrate the major categories in the overall results:

Illustration of Ranking Factors Survey Data

This suggests that many SEOs don't ascribe much weight to on-page optimization

I myself have often felt that from all the metrics, tests and observations of Google's ranking results, the importance of on-page factors like keyword usage or TF*IDF (explained below) is fairly small. Certainly, I've not observed many results, even in low-competition spaces, where one can simply add in a few more repetitions of the keyword, maybe toss in a few synonyms or "related searches" and improve rankings. This experience, which many SEOs I've talked to share, has led me to believe that linking signals account for an overwhelming majority of how the engines order results.

But, I love to be wrong.

Some of the work we've been doing around topic modeling, specifically using a process called LDA (Latent Dirichlet Allocation), has shown some surprisingly strong results. This has made me (and I think a lot of the folks who attended Ben's talk last Tuesday) question whether it was simply a naive application of the concept of "relevancy" or "keyword usage" that gave us this biased perspective.

Why Search Engines Need Topic Modeling

Some queries are very simple - a search for "wikipedia" is non-ambiguous, straightforward and can be effectively returned by even a very basic web search engine. Other searches aren't nearly as simple. Let's look at how engines might order two results - a simple problem most of the time that can be somewhat complex depending on the situation.

Query for Batman

Query for Chief Wiggum

Query for Superman

Query for Pianist

For complex queries or when relating large quantities of results with lots of content-related signals, search engines need ways to determine the intent of a particular page. Simply because it mentions the keyword 4 or 5 times in prominent places or even mentions similar phrases/synonyms won't necessarily mean that it's truly relevant to the searcher's query.

Historically, lots of SEOs have put effort into this process, so what we're doing here isn't revolutionary, and topic models, LDA included, have been around for a long time. However, no one in the field, to our knowledge, has made a topic modeling system public or compared its output with Google rankings (to help see how potentially influential these signals might be). The work Ben presented, and the really exciting bit (IMO), is in those numbers.

Term Vector Spaces & Topic Modeling

Term vector spaces, topic modeling and cosine similarity sound like tough concepts, and when Ben first mentioned them on stage, a lot of the attendees (myself included) felt a bit lost. However, Ben (along with Will Critchlow, whose Cambridge mathematics degree came in handy) helped explain these to me, and I'll do my best to replicate that here:

Simplistic Term Vector Model

In this imaginary example, every word in the English language is related to either "cat" or "dog," the only topics available. To measure whether a word is more related to "dog," we use a vector space model that creates those relationships mathematically. The illustration above does a reasonable job showing our simplistic world. Words like "bigfoot" are perfectly in the middle with no more closeness to "cat" than to "dog." But words like "canine" and "feline" are clearly closer to one than the other and the degree of the angle in the vector model illustrates this (and gives us a number).

BTW - in an LDA vector space model, topics wouldn't have exact label associations like "dog" and "cat" but would instead be things like "the vector around the topic of dogs."

Unfortunately, I can't really visualize beyond this step, as it relies on taking the simple model above and scaling it to thousands or millions of topics, each of which would have its own dimension (and anyone who's tried knows that drawing more than 3 dimensions in a blog post is pretty hard). Using this construct, the model can compute the similarity between any word or group of words and the topics it has created. You can learn more about this from Stanford University's posting of Introduction to Information Retrieval, which has a specific section on Vector Space Models.
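To make the "angle gives us a number" idea concrete, here's a minimal sketch of cosine similarity in the imaginary two-topic world. The coordinates for each word are made up purely for illustration (they are not output from any real model), and the function name is my own.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)

# Hypothetical coordinates: (weight on the "cat" axis, weight on the "dog" axis)
CAT = (1.0, 0.0)
DOG = (0.0, 1.0)
words = {
    "feline": (0.9, 0.1),
    "canine": (0.1, 0.9),
    "bigfoot": (0.5, 0.5),
}

for word, vec in words.items():
    to_cat = cosine_similarity(vec, CAT)
    to_dog = cosine_similarity(vec, DOG)
    print(f"{word}: cat={to_cat:.3f} dog={to_dog:.3f}")
```

With these made-up numbers, "bigfoot" scores the same against both topics, while "canine" and "feline" lean clearly toward one. The same function works unchanged with thousands of dimensions, which is the part that's hard to draw.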

Correlation of our LDA Results w/ Google.com Rankings

Over the last 10 months, Ben (with help from other SEOmoz team members) has put together a topic modeling system based on a relatively simple implementation of LDA. While it's certainly challenging to do this work, we doubt we're the first SEO-focused organization to do so, though possibly the first to make it publicly available.

When we first started this research, we didn't know what kind of an input LDA/topic modeling might have on search engines. Thus, on completion, we were pretty excited (maybe even ecstatic) to see the following results:

 

Correlation Between Google.com Rankings and Various Single Metrics
Spearman Correlation of LDA, Linking IPs and TF*IDF

 

(the vertical blue bars indicate standard error in the diagram, which is relatively low thanks to the large sample set)

Using the same process we did for our release of Google vs. Bing correlation/ranking data at SMX Advanced (we posted much more detail on the process here), we've shown the Spearman correlations for a set of metrics familiar to most SEOs against some of the LDA results, including:

  • TF*IDF - the classic term weighting formula, TF*IDF measures keyword usage in a more accurate way than a more primitive metric like keyword density. In this case, we just took the TF*IDF score of the page content that appeared in Google's rankings
  • Followed IPs - this is our highest correlated single link-based metric, and shows the number of unique IP addresses hosting a website that contains a followed link to the URL. As we've shown in the past, with metrics like Page Authority (which uses machine learning to build more complex ranking models) we can do even better, but it's valuable in this context to just think and compare raw link numbers.
  • LDA Cosine - this is the score produced from the new LDA labs tool. It measures the cosine similarity of topics between a given page or content block and the topics produced by the query.
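For readers who haven't seen TF*IDF written out, here's a rough sketch of one common textbook variant (term frequency times the log of inverse document frequency). There are many variants, and this post doesn't say which one we used, so treat the exact formula, the function name and the tiny three-document corpus as assumptions for illustration only.

```python
import math

def tf_idf(term, doc_tokens, corpus):
    """One textbook TF*IDF variant: (term share of the doc) * log(N / docs containing term)."""
    if not doc_tokens:
        return 0.0
    tf = doc_tokens.count(term) / len(doc_tokens)
    df = sum(1 for doc in corpus if term in doc)
    if df == 0:
        return 0.0  # term never appears in the corpus
    idf = math.log(len(corpus) / df)
    return tf * idf

# Made-up corpus of three tiny "pages"
corpus = [
    "the rolling stones tour dates".split(),
    "the stones in the river gather moss".split(),
    "the tour of the city".split(),
]

page = corpus[0]
for term in ("the", "stones", "rolling"):
    print(term, round(tf_idf(term, page, corpus), 4))
```

The point of the IDF half is visible even in this toy: "the" appears on every page and so scores zero, while "rolling" appears on only one page and scores highest. That's what makes it a better measure than raw keyword density.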

The correlation of the LDA scores with rankings is uncanny. Certainly, it's not a perfect correlation, but that shouldn't be expected given the supposed complexity of Google's ranking algorithm and the many factors therein. But, seeing LDA scores show this dramatic result made us seriously question whether there was causation at work here (and we hope to do additional research via our ranking models to attempt to show that impact). Perhaps good links are more likely to point to pages that are more "relevant" via a topic model, or perhaps some other aspect of Google's algorithm that we don't yet understand naturally biases towards these.
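If you want to try this kind of comparison on your own data, here's a minimal sketch of a Spearman rank correlation (Pearson correlation computed on ranks, with tied values sharing an average rank). The five positions and scores below are invented for illustration, and I've negated the position so that "ranks higher" and "scores higher" point the same way; that sign convention is my assumption, not a description of our pipeline.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined when one side never varies
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))

# Made-up example: SERP positions 1-5 and a hypothetical score for each page
positions = [1, 2, 3, 4, 5]
scores = [72, 65, 40, 55, 30]
rho = spearman([-p for p in positions], scores)
print(round(rho, 3))
```

Because it only looks at ordering, Spearman doesn't care whether a score is on a 0-100 scale or counts in the thousands, which is why it's handy for comparing metrics as different as link counts and similarity percentages.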

However, given that many SEO best practices (e.g. keywords in title tags, static URLs and ) have dramatically lower correlations and the same difficulties proving causation, we suspect a lot of SEO professionals will be deeply interested in trying this approach.

The LDA Labs Tool Now Available; Some Recommendations for Testing & Use

We've just recently made the LDA Labs tool available. You can use this to input a word, phrase, chunk of text or an entire page's content (via the URL input box) along with a desired query (the keyword term/phrase you want to rank for) and the tool will give back a score that represents the cosine similarity in a percentage form (100% = perfect, 0% = no relationship).

LDA Topics Tool

When you use the tool, be aware of a few issues:

  • Scores Change Slightly with Each Run
    This is because, like a pollster interviewing 100 voters in a city to get a sense of the local electorate, we check a sample of the topics a content+query combo could fit with (checking every possibility would take an exceptionally long time). You can, therefore, expect the percentage output to fluctuate by 1-5% each time you check a page/content block against a query.
  • Scores are for English Only
    Unfortunately, because our topics are built from a corpus of English language documents, we can't currently provide scores for non-English queries.
  • LDA isn't the Whole Picture
    Remember that while the average correlation is in the 0.33 range, we shouldn't expect scores for any given set of search results to go in precisely descending order (a correlation of 1.0 would suggest that behavior).
  • The Tool Currently Runs Against Google.com in the US only
    You should be able to see the same results the tool extracts from by using a personalization-agnostic search string like http://www.google.com/xhtml?q=my+search&pws=0
  • Using Synonyms, "Related Searches" or Wonder Wheel Suggestions May Not Help
    Term vector models are more sophisticated representations of "concepts" and "topics," so while many SEOs have long recommended using synonyms or adding "related searches" as keywords on their pages and others have suggested the importance of "topically relevant content" there haven't been great ways to measure these or show their correlation with rankings. The scores you see from the tool will be based on a much less naive interpretation of the connections between words than these classic approaches.
  • Scores are Relative (20% might not be bad)
    Don't presume that getting a 15% or a 20% is always a terrible result. If the folks ranking in the top 10 all have LDA scores in the 10-20% range, you're likely doing a reasonable job. Some queries simply won't produce results that fit remarkably well with given topics (which could be a weakness of our model or a weirdness about the query itself).
  • Our Topic Models Don't Currently Use Phrases
    Right now, the topics we construct are around single word concepts. We imagine that the search engines have probably gone above and beyond this into topic modeling that leverages multi-word phrases, too, and we hope to get there someday ourselves.
  • Keyword Spamming Might Improve Your LDA Score, But Probably Not Your Rankings
    Like anything else in the SEO world, manipulatively applying the process is probably a terrible idea. Even if this tool worked perfectly to measure keyword relevance and topic modeling in Google, it would be unwise to simply stuff 50 words over and over on your page to get the highest LDA score you could. Quality content that real people actually want to find should be the goal of SEO and Google's almost certainly sophisticated enough to determine the difference between junk content that matches topic models and real content that real users will like (even if the tool's scoring can't do that).
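The pollster analogy from the first point above can be sketched in a few lines. This is not the tool's code; it's a generic illustration, with an invented population of values and a fixed random seed, of why any score estimated from a random sample moves a little on each run and why bigger samples move less.

```python
import random
import statistics

def sampled_score(population, sample_size, rng):
    """Estimate the population average from a random sample, like a poll."""
    return statistics.mean(rng.sample(population, sample_size))

rng = random.Random(42)  # fixed seed so the illustration is repeatable
population = [rng.random() for _ in range(10_000)]  # made-up values between 0 and 1

# Repeat the "poll" 200 times at two different sample sizes
small_runs = [sampled_score(population, 100, rng) for _ in range(200)]
large_runs = [sampled_score(population, 2_000, rng) for _ in range(200)]

print("spread with 100 samples:  ", round(statistics.stdev(small_runs), 4))
print("spread with 2,000 samples:", round(statistics.stdev(large_runs), 4))
```

The trade-off is the usual one: a larger sample gives a steadier number but takes longer to compute, so if a single reading looks surprising, running the check a few times and averaging is a reasonable habit.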

If you're trying to do serious SEO analysis and improvement, my suggested methodology is to build a chart something like this:

Analysis of "SEO" SERPs in Google
SERPs analysis of "SEO" in Google.com w/ Linkscape Metrics + LDA (click for larger)

Right now, you can use Keyword Difficulty's export function and then add in some of these metrics manually (though in the future, we're working towards building this type of analysis right into the web app beta).

Once you've got a chart like this, you can get a better sense of what's propping up your competitors' rankings - anchor text, domain authority, or maybe something related to topic modeling relevancy (which the LDA tool could help with).

Undoubtedly, Google's More Sophisticated than This

While the correlations are high, and the excitement around the tool both inside SEOmoz and from a lot of our members and community is equally high, this is not us "reversing the algorithm." We may have built a great tool for improving the relevancy of your pages and helping to judge whether topic modeling is another component in the rankings, but it remains to be seen if we can simply improve scores on pages and see them rise in the results.

What's exciting to us isn't that we've found a secret formula (LDA has been written about for years and vector space models have been around for decades), but that we're making a potentially valuable addition to the parts of SEO we've traditionally had little measurement around.

BTW - Thanks to Michael Cottam, who suggested the reference of research work by a number of Googlers on pLDA. There are hundreds of papers from Google and Microsoft (Bing) researchers around LDA-related topics, too, for those interested. Reading through some of these, you can see that major search engines have almost certainly built more advanced models to handle this problem. Our correlation and testing of the tool's usefulness will show whether a naive implementation can still provide value for optimizing pages.

For those who'd like to investigate more, we've made all of our raw data available here (in XLS format, though you'll need a more sophisticated model to do LDA). If you have interest in digging into this, feel free to email Ben at SEOmoz dot org.

How Do I Explain this to the Boss/Client?

The simplest method I've found is to use an analogy like:

If we want to rank well for "the rolling stones" it's probably a really good idea to use words like "Mick Jagger," "Keith Richards," and "tour dates." It's also probably not super smart to use words like "rubies," "emeralds," "gemstones," or the phrase "gathers no moss," as these might confuse search engines (and visitors) as to the topic we're covering.

This tool tries to give a best guess number about how well we're doing on this front vs. other people on the web (or sample blocks of words or content we might want to try). Hopefully, it can help us figure out when we've done something like writing about the Stones but forgetting to mention Keith Richards.

As always, we're looking forward to your feedback and results. We've already had some folks write in to us saying they used the tool to optimize the contents of some pages and have seen dramatic rankings boosts. As we know, that might not mean anything about the tool itself or the process, but it certainly has us hoping for great things.

p.s. The next step, obviously, is to produce a tool that can make recommendations on words to add or remove to help improve this score. That's certainly something we're looking into.

p.p.s. We're leaving the Labs LDA tool free for anyone to use for a while, as we'd love to hear what the community thinks of the process and want to get as broad input as possible. Future iterations may be PRO-only.



Two Quick, Simple Social Media Tips

September 6th, 2010 RobOusbey Posted in Uncategorized No Comments »

Posted by RobOusbey

Today, I want to share two pieces of advice that are particularly useful to certain types of business - and will be exceptionally quick to implement. I've also created a free download that might help some people implement one of these ideas even more quickly.

About two years ago, I made a recommendation to a client in the UK, and I've just seen it used by a hotel in the USA. If your business offers public computers with internet access - such as those in hotel lobbies, libraries, etc - this is for you:

Tip 1: Put up a sign, next to your public computers, with a call to action; typically this could be something like 'Find us on Facebook' or 'Follow us on Twitter'.

Here's such a poster in use, at the Ledgestone Hotel in Yakima. (Click the image to embiggen.)

Sadly, it doesn't look like the Ledgestone is doing much with their Twitter account; this probably disappoints people who go to their page, and so they don't end up with as many followers as they could do. Remember - getting people to your Twitter page (or Facebook, or whatever else you're asking them to do) is only the first stage - there has to be something there for them when they arrive.

The second tip is more for people who offer wi-fi - this could be all manner of hotels, conference venues, airports, aeroplanes, train stations, coffee shops, etc. For places that offer free wi-fi, this can work even better:

Tip 2: You control the first page visitors see after logging on to your wi-fi. Don't waste this with a dull message; make the page interesting, and put some calls to action on there.

People have probably logged on to do something - but many will welcome a distraction - particularly if you keep the request brief. Create a nicely styled, but simple page, and add a couple of messages on there. Some examples could include:

  • Follow us on Twitter / Like us on Facebook: you could incentivize this, for example: if you're a coffee shop, then offer a free latte to new followers
  • Sign up to our email newsletter: this will only take them a second if you make sure the form is right there on the page, and again this can be incentivized
  • Don't forget to check in on foursquare: ideal for almost any location, and this is as good a time as any to remind them to check in
  • If you're enjoying your stay, please review us: particularly useful for hotels, where online reviews can increase visibility; I'll go into a little more detail about this below.

There can be some issues with sites noticing that a lot of people from the same IP are visiting, particularly when it comes to review services. Local search expert David Mihm advised me that he's heard Yelp in particular does try to filter out multiple reviews from the same IP, and that TripAdvisor's fraud rules do include clauses that might get you into trouble (for example, offering incentives for people to write reviews is not permitted).

I'd recommend that there are two steps around this type of issue:

  1. Try to appeal for reviews only from people who already have accounts on those sites (e.g.: "If you're a Yelp member, please review us here...." or "If you have a Google account, please leave a review here...")
  2. Make this 'post-wifi-login' page available on the public internet; review sites should be able to recognize that lots of people are being referred to your page from the same URL - if it's public then they'll be able to visit that page, and should figure out what is going on.

I've built a quick free template for you to download as a starting point. You can visit the file, or download it, by clicking this link: free wifi login CTA page.

(That was created based on a template from LayoutGala; I'm not going to add any licence to it, other than use it however you want. You should change the images that are in it to be local files at the very least.)

Honestly, it doesn't take long to print off a couple of small posters (or even to publish a nice wifi login page) so I'll hope to see social-media CTAs cropping up all over the place soon. :)



LDA – Is On-Page Optimization the SEO Secret?

September 4th, 2010 Dana Lookadoo Posted in Uncategorized No Comments »

Posted by Dana Lookadoo

How do I recap the SEOmoz PRO Seminar session on Uncovering a Hidden Technique for SEO? The title is so attractive that it produces Pavlovian symptoms as we salivate at the thought of uncovering a hidden SEO treasure. Ben Hendrickson of SEOmoz presented a model which appears to show how Google may be assigning relevance to keyword terms based on context - topical relevance.

Is Latent Dirichlet Allocation (LDA) that hidden jackpot?

1st - LDA is not new nor something SEOmoz invented. The Information Retrieval model has been around for 7 or 8 years, and IR geeks have talked about it before. There are a number of resources, as well as nay saying, about LDA and Google's possible use of it.

2nd - What is new is SEOmoz's LDA Topics Tool that produces a relevancy score based off a query (search term). It enables one to play with words that may increase a page's relevancy in the eyes of Google. It shows words that help Google determine how relevant the page is to a user's search query.

Game Changer?

Kyle Stone tweeted that the LDA tool is a game changer, and many retweeted.

SEOmoz LDA tool = game changer

Is SEOmoz's LDA tool a game changer? That's yet to be seen. The goal is to report Ben's research as presented at the Mozinar and how a layman (myself) interprets such. Rand is going to do a follow-up post to explain more.

Why all the hype?

The SEO Challenge

SEOs face the continual challenge of figuring out Google's hidden ranking algorithms. How do we rank higher? Which signals are the most important? We know search engines are "learning models" that attempt to understand "context" of words. Google has said for years that webmasters should concentrate most on providing good relevant (contextual) content.

There are ways to rank higher. Is it as easy as 1, 2, 3?

  1. Create quality copy with keyword(s) on the page along with associated anchor text links.
  2. Get good links.
  3. What Ben talked about in this session.

LDA - Topic Modeling & Analysis

Latent Dirichlet Allocation, in layman's terms, translates to "topic modeling." In search geek terms, LDA is the following formula:

LDA Formula

(Did you digest that? Don't worry; Mozzers groaned and laughed at the same time. PLUS: Scientist Hendrickson delivered this session after lunch!)

LDA Simplified - Here is Ben's way of explaining topic modeling:

LDA Formula Simplified

(Okay, I was once proud that I got an A in Logic and Combinatorics - discrete math/set theory. However, that computer science class now feels like basic math compared to this formula.)

It made more sense when Rand Fishkin joined Ben on stage and when Todd Freisen moderated and deciphered during Q&A. (Manuela Sanches of Brazil was sitting next to me and said that Ben's "presentation needed subtitles!")

The objective of LDA, from my deciphering of Greek, is to understand how Google is using semantic contextual analysis combined with other signals, to define topics/concepts. It's how Google analyzes the words on a page to determine the "set" to which a word belongs - how relevant a search query is to pages in its database.

For example: How does Google assign relevance to the word "orange" on a page? They determine orange is related to the fruit set or to the color set by page context.

LDA Defined:

"Latent Dirichlet Allocation (Blei et al, 2003) is a powerful learning algorithm for automatically and jointly clustering words into "topics" and documents into mixtures of topics. It has been successfully applied to model change in scientific fields over time (Griffiths and Steyver, 2004; Hall, et al. 2008).

A topic model is, roughly, a hierarchical Bayesian model that associates with each document a probability distribution over "topics", which are in turn distributions over words."
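Here is how I picture that definition with the "orange" example, as a tiny sketch. Every number below is invented, the two topics are hand-written rather than learned, and the function names are mine; a real LDA system would learn thousands of topics from a large corpus.

```python
# Each topic is a probability distribution over words (made-up numbers).
topics = {
    "fruit": {"orange": 0.3, "juice": 0.4, "peel": 0.3},
    "color": {"orange": 0.3, "red": 0.4, "paint": 0.3},
}

# Each document is a mixture of topics: this page is mostly about fruit.
doc_mixture = {"fruit": 0.8, "color": 0.2}

def word_probability(word, doc_mixture, topics):
    """P(word | document) = sum over topics of P(topic | doc) * P(word | topic)."""
    return sum(weight * topics[t].get(word, 0.0) for t, weight in doc_mixture.items())

def topic_posterior(word, doc_mixture, topics):
    """P(topic | word, document): which 'set' the word most likely belongs to here."""
    joint = {t: weight * topics[t].get(word, 0.0) for t, weight in doc_mixture.items()}
    total = sum(joint.values())
    if total == 0:
        return {t: 0.0 for t in joint}
    return {t: p / total for t, p in joint.items()}

print(word_probability("juice", doc_mixture, topics))
print(topic_posterior("orange", doc_mixture, topics))
```

In this toy, "orange" is equally at home in both topics, so it's the rest of the page (the document's topic mixture) that tips it toward the fruit set. That is the "page context" idea in miniature.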

Bayesian - ah, a term I recognize!! Bayesian spam filtering is a method used to detect spam. It draws on a database of previously labeled messages and learns which words tend to show up in spam. It's "trained" by us when we mark an email as spam. It looks at incoming emails and calculates the probability that the content of an email is contextually spammy.
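For the curious, a bare-bones version of that idea looks something like the sketch below. It's a naive Bayes classifier with add-one smoothing, written from the general textbook description; the class name, the four training messages and the smoothing choice are all my own assumptions, and real spam filters are far more involved.

```python
import math
from collections import Counter

class NaiveBayesSpamFilter:
    """Toy word-count spam filter: 'trained' each time a message is labeled."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_probability(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed prior: how common this label is overall
            log_p = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Add-one smoothing so unseen words don't zero out the score
                count = self.word_counts[label][word]
                log_p += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = log_p
        # Turn the two log scores into a probability between 0 and 1
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])

spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win free money now", "spam")
spam_filter.train("free prize click now", "spam")
spam_filter.train("meeting notes attached", "ham")
spam_filter.train("lunch at noon tomorrow", "ham")

print(round(spam_filter.spam_probability("free money"), 3))
print(round(spam_filter.spam_probability("meeting tomorrow"), 3))
```

The connection to topic modeling is loose but real: both approaches learn from which words tend to keep company with which labels or topics, rather than from any hand-written rules about meaning.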

I found a PowerPoint presentation about Bayesian Inference Techniques by Microsoft Research from 2004 that presents the possibility of using LDA. Go to slide 54 and read:

"Can we build a general-purpose inference engine which automates these procedures?"

Microsoft has been looking at LDA models. Do search engines use it as one of their primary methods?

Ben sampled over 8 million documents with approx. 1,000 queries. He believes Google is using LDA topic modeling to determine (learn) what words mean by their associations with, relevance to, other words on the page. (Other factors are included.) Ben called the results a "co-occurrence explanation" that uses a "cosine similarity."

SEO Takeaway:

  • Results that are higher in Google SERPs, in general, have more topical content.
  • Search engines do APPEAR to apply semantic analysis when indexing a page and determining the intent of the words on the page.

Rand tweeted an explanation (in 140 x 4) as follows:

Rand's tweets explaining LDA

Dana's LDA Catwalk Metaphor for Topic Modeling:

Imagine the words on your page as walking down the fashion runway in Paris. Your keyword phrase is "dressed" in semantic accessories, words that correlate to and dress up your topic. Associated words bring meaning to and highlight the fashion model's outfit. Adjectives, modifiers and synonyms are like jewelry, hats, and shoes. The combination can transform your base layers (your target terms) from casual or conservative business attire into a sexy night-on-the-town ensemble.

Combinations and permutations of words on a page "dress" your skinny or curvy fashion model. Relevant words provide Google with an image of what she is wearing and the catwalk upon which she struts. LDA refers back to what Google already knows about these "accessories" (words) and their previous association with the topic terms related to fashion.

Enter Topical Ambiguity - I just broke the "rules" for context with the catwalk metaphor by referring to modeling in two contexts on this page:

  • I used "modeling" terms that relate to the "fashion industry" set.
  • The catwalk metaphor is irrelevant content that is off-topic for discussing "LDA topic modeling."

Google Algorithm Exposed?

Ben clearly said that LDA is an ATTEMPT to explain the SERPs. His scenario, a quote from his presentation slides, follows:

One of us needs to implement it so we can:

1) See how it applies to pages
2) See if it helps explain SERPs
One-two-three-not-it.

LDA is not LSI.

There were some tweets claiming SEOmoz was bringing back LSI or snakeoil. Ben clarified that LDA is not LSI, which deals more with keyword density. He explained that he is NOT talking about loading keywords on a page but about the relevance of the topics within the page. He said that:

"LSI doesn’t have the same bias toward simple explanations. LSI breaks down as you try to scale up the number of topics."

The LDA tool deals with context, semantic relevancy, not density - in addition to some other random factors. Example:

If SEOmoz has a page all about "SEO" and "tools," and there is another word on the page that can be explained by a word that is more related to SEO topic, then the related word would be used. Meaning, "seo tools" doesn't have to be repeated over and over, and the related word would be interpreted by Google as being relevant.

Ben, who appears to have the brain of a search engine, noted that it "appears" LDA is what Google is heading for in the near future. He said (paraphrased):

If they are not doing it, they seem to be doing something that has the same output. They are probably already using it.

Rand deciphered:

It’s a super weird coincidence if Google is not using it.

Are On-Page Signals Stronger than Links?

Are we heading toward more emphasis on on-page topic modeling? I'm not an IR geek, but I do plan to spend more energy focusing on understanding how search engines retrieve information. We are dealing with a semantic Web. LDA may indicate that good old on-page optimization sends stronger signals than links.

SEOmoz's LDA tool attempts to show how relevant content is to a chosen keyword. It computes relevance of queries.

The following shows how relevant SEOmoz's Tools page is to Aaron Wall's SEO Book Tools page.

seo tools relevance for SEOmoz & SEO Book

The score at the top is an indicator of how relevant the content on that page is according to LDA.

  • Aaron's content is 72%* relevant for the query "seo tools."
  • SEOmoz's tools page is 40%* relevant.

*NOTE: (I inserted the logos.) You can run the same pages and get different results. The results are similar in that SEO Book always scored as more topically relevant, but the percentage varies. Is this the random Monte Carlo algorithm at work? Ben?

Mozinar Question:

"How do we execute this for SEO?"

Ben's Answer:

"I don't actually do SEO. I write code."

That's up to us, the SEOs, to play and test in our Google playground.

Use the tool to decide if you can win with LDA to optimize your on-page signals.

  1. Use the LDA Topics Tool to return words that could be used on a page for a query.
  2. Then determine who is ranking for that term.
  3. Simply write content that is highly on-topic based off the findings you observe.

If you are not performing that well in the SERPs, think about classic on-page optimization. In the example above, rather than putting another instance of "seo tools" on the page, LDA shows there are better ways to tell Google that you are about that topic. The tool provides a way to measure that.

IMPORTANT: There is a threshold at which too many related words will appear as too spammy. LDA is not something to be used to game Google.

Test the LDA Tool out for yourself, and draw your own conclusions.

***
DISCLAIMER: I'm not claiming this methodology has uncovered hidden SEO treasures. Time, testing and playing around with a new SEOmoz tool while observing the SERPs will reveal the answer. In the meantime, I'm going to dress up my pages and accessorize them with relevant terms that make them dazzle so they look good climbing the Google catwalk.



Four Creative Link Building Tactics – Whiteboard Friday

September 3rd, 2010 Aaron Wheeler Posted in Uncategorized No Comments »

Posted by Aaron Wheeler

 In this week's Whiteboard Friday Rand Fishkin clues you in on four link building tactics that you likely haven't heard about. Given the importance of link building to SEO, this video should prove to be worth its (virtual) weight in gold. (I mean that in the best possible way ;-p)


Video Transcription

 

Hey, SEOmoz fans!  Welcome to another edition of Whiteboard Friday.  Today we're talking about link building and specifically four tactics that are relatively creative, not talked about a ton in the SEO sphere, that can help you get some direct links to virtually any kind of site.

Let's start with number one up here, giving testimonials.  I know this sounds a little odd.  You're thinking to yourself, "Wait, I'm a marketer.  I should be trying to get testimonials about my product, my service, my company."  But in fact, give and you shall receive.

So in this case, if you are a site owner and you have a business, you say nice things about a product that you use, products that you like, free web apps, tools on the web, blogs, resources, whatever it might be, or specific products or companies, and you email them and say, "Hey, I just wanted to let you know, I really like your service.  I enjoy using it.  If you'd like to use this as a testimonial, feel free."  You can say some nice words and then have a, "My name is Rand Fishkin and I am the CEO of SEOmoz."  When they publish that, they will take it and put it on their GoodProduct.com website, and you can see that gets embedded right into their site and it will link back over to your site.

So, it is a great way to build up a repertoire of contacts, build good relations, and do something nice for the people who are doing something nice for you.  I would definitely not do this disingenuously.  Make sure that you are actually recommending things that you would recommend to a real friend.  It will come back and bite you otherwise.  But if you do this, you can get those great links too.

The second one, design galleries.  This is an odd case because you do have to jump through some hoops.  If you can contract some of those exceptional, high quality CSS and web design folks to build a really great looking site, something that looks nothing like this horrific drawing (I don't even know why I put so many boxes and lines; I am sure there was a reason), you can get featured on sites like CSS REMIX or Drawer or CSS Gallery.  If you do a search for CSS galleries, in fact, you will find literally hundreds in the first few hundred results of places where you can get a live link pointing back from those pages just by submitting your site and having a site that looks great.

Now, what I would recommend is that before you go through the design process make sure that you visit a lot of these places and get inspired.  See what makes it.  See what is hot right now.  Those designs have the added benefit of being often very good for users.  Using CSS properly means that you're loading pages, you are keeping code and design separate.  It can often increase your rate of attracting links as well.  Linking and quality of design have a direct relationship.  As the quality of design rises, so too does the likelihood that people of all kinds, not just design galleries but of all kinds, will link to your site.  They'll find you more credible.  They'll want to show you off.  They'll want to share.  This is a great investment both for the direct links you can get and for the future.

Number three.  This is sort of an interesting one.  Thanks to sites out there like HARO, which is Help a Reporter Out, and a few others, I think PR Newswire runs one as well, you can be a press source simply by combing through databases or lists of people who say, "Hey, I am a reporter in need of a story about a business that keeps dogs in their office and what the impact of having dogs around is.  Can we interview you, show off your business?"  Those stories when they get written about, they might appear in sources as big as "The New York Times" or as small as your local newspaper, but they appear online as well.  When they do, that link will point back to your site giving you a link from a nice press resource, which is a great place to get a link.

Number four, the last one here, turning raw numbers into a data story.  I like this a lot because the idea here is that people produce a lot of interesting data about virtually every industry, but they don't always do great things with that data.  They'll produce interesting numbers or numbers that seem boring on their surface but can be used in interesting ways.  It is up to you to be creative about, hmm, okay, comScore published this, Nielsen published that, Forrester published this data research.  If I combine some of those numbers or if I think about how they play out, I can come up with a great story and maybe some cool graphics too about what that means.  I can take some of the data over time and build a story about what's happening.  I can show that data next to something like Google Trends data or Search Insights data or data from a second or third source.  When I combine those, I have great link and media bait.  The nice thing about producing this is it is not just sort of classic link bait where, "Oh, that's interesting, I want to share that." But it is interesting because when you are the reference resource for the data, everyone else who writes about the story or who wants to share it has to link back to you.

A good example of this: check out www.seomoz.org/dp/free-charts and you'll see a bunch of places where we have taken data from great folks like Eightfold Logic (which used to be Enquisite), comScore, Hitwise, Nielsen, Forrester, and we've combined them into unique and interesting ways to view that data.  We didn't even do much with it, just showed sort of, "Hey, they said that 30% of searches come from Europe and 40% come from Asia, etc., so we're going to build a pie chart of that that looks great and people can embed that."  Now when they do, they link back to SEOmoz and have the source in there.  We'll always say what the original source is too.  But by hosting this stuff and creating it, you get all these great links.

All right everyone, I hope we have helped out your link building efforts here today.  I look forward to the discussion in the comments.  We will see you again next week for another edition of Whiteboard Friday.  Take care.
Video transcription by SpeechPad.com
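
The fourth tactic in the transcript, turning someone else's numbers into a chart that people can embed, is easy to prototype. The sketch below is only an illustration, not how SEOmoz builds its free charts. The Europe and Asia shares come from Rand's example, "Rest of world" is an assumed filler for his "etc.", and every URL and name passed in is a placeholder. It works out the pie slice angles and builds copy-and-paste embed HTML that links back to the host and names the original source.

```python
from html import escape

# Europe and Asia are the figures from the transcript's example;
# "Rest of world" is an assumption standing in for the "etc."
SHARES = {"Europe": 30, "Asia": 40, "Rest of world": 30}


def slice_angles(shares):
    """Convert shares into pie-slice angles in degrees."""
    total = sum(shares.values())
    return {label: 360 * value / total for label, value in shares.items()}


def embed_snippet(image_url, host_url, host_name, source_name):
    """Build embed HTML: the chart image plus a credit linking to the host."""
    return (
        f'<img src="{escape(image_url)}" alt="Share of searches by region">\n'
        f'<p>Chart by <a href="{escape(host_url)}">{escape(host_name)}</a>. '
        f"Original data: {escape(source_name)}.</p>"
    )


angles = slice_angles(SHARES)
snippet = embed_snippet(
    "https://example.com/charts/searches-by-region.png",  # placeholder URL
    "https://example.com/free-charts",  # placeholder URL
    "Example Co",  # placeholder host name
    "Example Research",  # placeholder data source
)

print(angles)
print(snippet)
```

Anyone who pastes the snippet into a post carries the link back to the host page along with the chart, and the original data source still gets its credit.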

If you have any other advice that you think is worth sharing, please post it in the comments! This post is very much a work in progress.

