I’m going to take a look at one of the advanced SEO topics that our 2013 workshops will cover. This will actually be a series of posts, mostly aimed at advanced SEO practitioners, so if you’re just getting into the world of SEO, you might want to set this topic aside until you’ve mastered more of the basics. With that proviso, let’s get into the world of structured data.

Introducing Structured Data

Structured data has hit the world of SEO. Are the buzzwords buzzing around your ears? Schema, rich snippets, Microformats, Microdata, RDFa, Dublin Core…you may have noticed all these terms floating around the blogosphere lately. Maybe you even understand them.

The increasingly complex world of SEO has a new and important dimension that a lot of SEO guys and gals are not tracking yet, and that dimension is defined by structured data. Structured data is a general term, and you can probably figure out the point from the phrase itself. It simply refers to a way of organizing and categorizing masses of data so that sense can be made of it.

Structured Data and SEO

Think of an old-fashioned card catalog in a library (you millennials might have to look at some archival footage to know what I’m talking about). A card catalog takes a huge amount of data, in the form of books and journals, and structures an outline of it into an organized form that lets you reach anywhere into the collection and retrieve the data you need.

The Internet is a similar mass of data, and until now search engines have largely relied on a shotgun approach to indexing: index everything, then run algorithms over it to figure it out. Structured data gives the search engines a way to bring more order to this mass in a predictable way. This is not merely about search engines making their own job easier. As the following schematic illustrates, it also benefits webmasters and the visitors they attract, or hope to attract. It’s a classic example of shared interest.

The Big Picture of Structured Data

Of course, most websites are already built from structured data: most sites nowadays are database-driven, and virtually all databases hold their data in a structured format by definition.

The problem is that when the pages of the site are built, on the fly, from data in a database, typical web programming practices often discard the structure when they create the page in HTML.

For example, let’s say that a website is designed to show events. Most events carry a date. The date may be displayed on the page, but perhaps other dates appear on the page that don’t relate to the event. A search engine has the challenge of figuring out which date goes with which piece of information. This is a very simplistic example, but it could be repeated a thousand different ways.
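To make the events example concrete, here is a minimal Python sketch of how structure gets lost when a page is built from a database record. The event record, its field names, and the render_plain function are all hypothetical, invented purely for illustration; real sites will use their own templates and schemas.

```python
import re

# Hypothetical event record, as it might come out of a site's database.
# Here every value still has a label (name, start_date, location).
event = {
    "name": "Spring SEO Workshop",
    "start_date": "2013-04-12",
    "location": "Seattle, WA",
}


def render_plain(event, posted_on):
    """Render the record as ordinary HTML; the field labels are discarded."""
    return (
        "<div>\n"
        f"  <h2>{event['name']}</h2>\n"
        f"  <p>{event['start_date']} in {event['location']}</p>\n"
        f"  <p>Posted on {posted_on}</p>\n"
        "</div>"
    )


html = render_plain(event, posted_on="2013-01-15")

# A crawler scanning the markup finds two dates, with nothing in the
# HTML itself to say which one is the event date.
dates_found = re.findall(r"\d{4}-\d{2}-\d{2}", html)
print(dates_found)  # ['2013-04-12', '2013-01-15']
```

The database knew that one value was a start date and the other was not. Once the page is rendered, that knowledge is gone, and the search engine has to guess.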

Enter Schema.org

The word “schema,” like the phrase “structured data,” is a general term that has been invested with a special meaning in the world of SEO. A schema is generally a conceptual outline or hierarchy. In database design, for example, the schema is the layout of tables, columns, and relationships that organizes the data. By the way, you can probably see that a database is an example of structured data, and that the schema describes how the data is structured.

Similarly, the companies that basically control Internet search, namely Google and Bing, need a way to both structure the data of our shared online universe and describe that structure so that webmasters can take advantage of it. Of course, standards have arisen to fill this need. The best known are Dublin Core, Microformats, Microdata and RDFa (you don’t have to remember those names, by the way), but having standards and knowing which ones to use are two very different things. So the guys and gals at Bing, Yahoo, and Google got together and decided to create a standard and base it on Microdata. The result of that effort is a website and standard found at www.schema.org.

Schema.org lays out a standard set of rules and definitions that extends, or enhances, HTML. The purpose is to allow website owners to consistently describe certain types of data in a way that search engines can, also consistently, understand. By understanding the data better, search engines can offer a “richer” display in their search results. This type of display is often referred to as “rich snippets.”
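As a rough illustration of what “extending HTML” looks like, the Python sketch below renders an event with Microdata attributes (itemscope, itemtype, itemprop) and the Schema.org Event type with its name and startDate properties. The event record, the render_microdata function, and the small ItempropReader class are hypothetical helpers written for this example; the reader is a deliberately simplified stand-in for a search engine and is not a full Microdata parser.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical event record, invented for this example.
event = {"name": "Spring SEO Workshop", "start_date": "2013-04-12"}


def render_microdata(event, posted_on):
    """Render the event as HTML that keeps its labels via Microdata."""
    return (
        '<div itemscope itemtype="http://schema.org/Event">\n'
        f'  <h2 itemprop="name">{escape(event["name"])}</h2>\n'
        f'  <time itemprop="startDate" datetime="{escape(event["start_date"])}">'
        f'{escape(event["start_date"])}</time>\n'
        f'  <p>Posted on {escape(posted_on)}</p>\n'
        '</div>'
    )


class ItempropReader(HTMLParser):
    """Simplified reader: collects itemprop values, ignoring nesting."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._pending = None  # itemprop name waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        if name is None:
            return
        if "datetime" in attrs:
            # For <time>, the machine-readable value is the attribute.
            self.props[name] = attrs["datetime"]
        else:
            self._pending = name

    def handle_data(self, data):
        if self._pending and data.strip():
            self.props[self._pending] = data.strip()
            self._pending = None


html = render_microdata(event, posted_on="2013-01-15")
reader = ItempropReader()
reader.feed(html)
print(reader.props)  # {'name': 'Spring SEO Workshop', 'startDate': '2013-04-12'}
```

The “Posted on” date is still on the page for human visitors, but because it carries no itemprop, the reader never mistakes it for the event date.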

In my next blog post, I’ll describe how structured data shows up in search as rich snippets, how you can incorporate it into your pages, and what it can do for your SEO efforts. Continue to Part 2: Introduction to Rich Snippets