Microdata is a WHATWG HTML specification used to nest metadata within existing content on web pages. Search engines, web crawlers, and browsers can extract and process microdata from a web page and use it to provide a richer browsing experience for users. Search engines benefit greatly from direct access to this structured data because it allows them to understand the information on web pages and provide more relevant results. Microdata uses a supporting vocabulary to describe an item and name-value pairs to assign values to its properties. It aims to be a simpler way of annotating HTML elements with machine-readable tags than the similar approaches, RDFa and microformats.
The W3C HTML Working Group failed to find an editor for the specification and terminated its development with a 'Note' in 2013.
At a high level, microdata consists of a group of name-value pairs. The groups are called items, and each name-value pair is a property. Items and properties are represented by regular elements.
- To create an item, the itemscope attribute is used.
- To add a property to an item, the itemprop attribute is used on one of the item's descendants.
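As a minimal sketch of these two attributes, the fragment below creates one item with a single property. The property name "name" and the value "Elizabeth" are illustrative assumptions and are not tied to any particular vocabulary.

```html
<!-- itemscope turns this div into an item -->
<div itemscope>
  <!-- itemprop adds the name-value pair name = "Elizabeth" to the item -->
  <p>My name is <span itemprop="name">Elizabeth</span>.</p>
</div>
```

A microdata consumer reading this fragment would be expected to find one item whose name property has the value "Elizabeth", taken from the text content of the span.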
Google and other major search engines support the Schema.org vocabulary for structured data. This vocabulary defines a standard set of type names and property names. For example, the Schema.org MusicEvent type indicates a concert performance, with startDate and location properties to specify the concert's key details. In this case, the MusicEvent URL (http://schema.org/MusicEvent) would be the value of itemtype, and startDate and location would be itemprop names that MusicEvent defines.
Note: More about itemtype attributes can be found at http://schema.org/Thing
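The concert case can be sketched as follows. The event name, date, and venue are invented placeholders; the type and property names (MusicEvent, Place, name, startDate, location) come from the Schema.org vocabulary.

```html
<!-- itemtype names the vocabulary type that defines the properties below -->
<div itemscope itemtype="http://schema.org/MusicEvent">
  <h2 itemprop="name">Example Quartet in Concert</h2>

  <!-- For a time element, the property value is read from the datetime attribute -->
  <time itemprop="startDate" datetime="2030-05-17T19:30">17 May 2030, 7:30 pm</time>

  <!-- A property value can itself be an item: here location is a nested Place -->
  <div itemprop="location" itemscope itemtype="http://schema.org/Place">
    <span itemprop="name">Example Hall</span>
  </div>
</div>
```

Nesting an item inside the location property keeps the venue's own properties separate from those of the event.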
Microdata vocabularies provide the semantics, or meaning, of an item. Web developers can design a custom vocabulary or use vocabularies available on the web, such as the widely used Schema.org vocabulary, which provides a collection of commonly used markup types.
Commonly used vocabularies:
- Creative works: CreativeWork, Book, Movie, MusicRecording, Recipe, TVSeries
- Embedded non-text objects: AudioObject, ImageObject, VideoObject
- Health and medical types: the types under MedicalEntity
- Place, LocalBusiness, Restaurant
- Product, Offer, AggregateOffer
- Review, AggregateRating
Major search engine operators like Google, Microsoft, and Yahoo! rely on the schema.org vocabulary to improve search results. For some purposes, an ad-hoc vocabulary is adequate. For others, a vocabulary will need to be designed. Where possible, authors are encouraged to re-use existing vocabularies, as this makes content re-use easier.
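Where an ad-hoc vocabulary is adequate, an item can omit itemtype and use property names chosen by the author. The sketch below assumes hypothetical property names (nickname, favoriteToy) that belong to no published vocabulary, so only software written for this page would know what they mean.

```html
<!-- No itemtype: the property names are ad hoc and have no shared definition -->
<section itemscope>
  <h1 itemprop="nickname">Hedral</h1>
  <p>Favorite toy: <span itemprop="favoriteToy">a paper ball</span></p>
</section>
```

Switching to a shared vocabulary later would mean adding an itemtype and renaming the properties to the names that vocabulary defines.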
In some cases, search engines covering specific regions may provide locally specific extensions of microdata. For example, Yandex, a major search engine in Russia, supports microformats such as hCard (company contact information), hRecipe (food recipes), hReview (market reviews), and hProduct (product data), and provides its own format for defining terms and encyclopedic articles. This extension was made to solve transliteration problems between the Cyrillic and Latin alphabets. With these additional markup parameters for the Schema.org vocabulary, the indexing of information on Russian-language web pages became considerably more successful.
itemid – The unique, global identifier of an item.
itemprop – Used to add properties to an item. Every HTML element may have an itemprop attribute specified, where an itemprop consists of a name and value pair.
itemref – Properties that are not descendants of an element with the itemscope attribute can be associated with the item using an itemref. Itemref provides a list of element ids (not itemids) with additional properties elsewhere in the document.
itemscope – Itemscope (usually) works along with itemtype to specify that the HTML contained in a block is about a particular item. itemscope creates the item and defines the scope of the itemtype associated with it. itemtype is a valid URL of a vocabulary (such as schema.org) that describes the item and the context of its properties.
itemtype – Specifies the URL of the vocabulary that will be used to define the itemprop names (item properties) in the data structure. itemscope sets the scope within the data structure in which the vocabulary set by itemtype is active.
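The two attributes seen less often in examples, itemref and itemid, can be sketched as follows. All names and values here are assumptions for illustration: the element ids, the author details, the vocabulary URL https://vocab.example.net/book (a hypothetical vocabulary assumed to support global identifiers), and the placeholder ISBN.

```html
<!-- itemref lists element ids; properties found inside those elements join this item -->
<div itemscope itemref="author-name author-site"></div>

<p id="author-name">Written by <span itemprop="name">Jane Example</span></p>
<p id="author-site">
  <a itemprop="url" href="https://example.com/">Home page</a>
</p>

<!-- itemid gives the item a global identifier; it is used together with itemscope and itemtype -->
<dl itemscope itemtype="https://vocab.example.net/book" itemid="urn:isbn:0-000-00000-0">
  <dt>Title</dt>
  <dd itemprop="title">An Example Book</dd>
</dl>
```

In the first item, the name and url properties are not descendants of the itemscope element, so itemref is what ties them to it.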
<div itemscope itemtype="http://schema.org/SoftwareApplication">
  <span itemprop="name">Angry Birds</span> -
  REQUIRES <span itemprop="operatingSystem">ANDROID</span><br>
  <link itemprop="applicationCategory" href="http://schema.org/GameApplication"/>
  <div itemprop="aggregateRating" itemscope itemtype="http://schema.org/AggregateRating">
    RATING:
    <span itemprop="ratingValue">4.6</span> (
    <span itemprop="ratingCount">8864</span> ratings )
  </div>
  <div itemprop="offers" itemscope itemtype="http://schema.org/Offer">
    Price: $<span itemprop="price">1.00</span>
    <meta itemprop="priceCurrency" content="USD" />
  </div>
</div>
Note: A handy tool for extracting microdata structures from HTML is Google's Structured Data Testing Tool. Try it on the HTML shown above.
The Microdata DOM API was supported in Firefox from version 16 and removed in Firefox 49.