Getting Started with Adobe After Effects - Part 6: Motion Blur


Upload Image Close it
Select File

My technology blog on SQL Server, TSQL, XML, FILESTREAM and other areas of SQL Server.
Browse by Tags · View All
XML 112
TSQL 69
XQuery 69
XQuery Functions 67
XQuery Training 65
XQuery in TSQL 64
XQuery Tutorial 63
SQL Server XQuery 63
XQuery-Labs 57
BRH 38

Archive · View All
September 2008 32
August 2008 30
July 2008 21
August 2009 19
June 2009 19
May 2010 18
January 2009 15
January 2010 14
October 2008 14
June 2008 13

XQuery Lab 67 - Reading blog data exported from blogger.com in TSQL using XQuery

Dec 29 2012 5:30AM by Jacob Sebastian   

I have been helping a few people to migrate their blogs from blogger.com to beyondrelational.com.  As part of this, I had to deal with the XML export file generated by blogger.com which contains information of all the posts, comments etc. I wrote a number of TSQL queries using XQuery to read various pieces of information from the XML data file.

I thought of sharing some of those queries in the XQuery labs series so that people who may want to do something similar may find it helpful.

For the purpose of this demonstration, I am presenting a cleaned-up version of the XML document which contains only the information we are interested in. When you look at a real file, you may find it much more complicated than the example presented below. However, the TSQL query given in this lab is found to be working on a few different version of the XML file I generated from blogger.com.

Sample Data

<?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?>
<feed xmlns="http://www.w3.org/2005/Atom" 
      xmlns:gd="http://schemas.google.com/g/2005" >
  <!-- 
    This is a cleaned-up version of the original file downloaded from 
    blogger.com. For the sake of simplicity, I have removed a number of 
    elements/attributes from the original file.
  -->
  <entry>
    <id>
      tag:blogger.com,1999:blog-7291718299109269124.post-3262230131972770257
    </id>
    <published>2012-12-29T02:26:00.004-08:00</published>
    <category scheme="http://schemas.google.com/g/2005#kind" 
              term="http://schemas.google.com/blogger/2008/kind#post" />
    <category scheme="http://www.blogger.com/atom/ns#" term="XQuery" />
    <category scheme="http://www.blogger.com/atom/ns#" term="XML" />
    <title type="text">This is the second test blog</title>
    <content type="html">
      This is another test blog to demonstrate XQuery Usage
    </content>
    <link rel="alternate" type="text/html" 
          href="http://test.blogspot.com/2012/12/this-is-second-test-blog.html" 
          title="This is the second test blog" />
  </entry>
  <entry>
    <id>
      tag:blogger.com,1999:blog-7291718299109269124.post-5774636022461277037
    </id>
    <published>2012-12-29T02:26:00.000-08:00</published>
    <category scheme="http://schemas.google.com/g/2005#kind" 
              term="http://schemas.google.com/blogger/2008/kind#post" />
    <category scheme="http://www.blogger.com/atom/ns#" term="XQueryLabs" />
    <category scheme="http://www.blogger.com/atom/ns#" term="BloggerTest" />
    <title type="text">This is my first blog post</title>
    <content type="html">
      This is a test blog used to generate the BlogML...
    </content>
    <link rel="alternate" type="text/html" 
          href="http://test.blogspot.com/2012/12/this-is-my-first-blog-post.html" 
          title="This is my first blog post" />
  </entry>
  <!-- Removed the other elements for simplicity -->
</feed>

The Query

The query below attempts to load an XML file from the disk and read information related to the blog posts available in the file.

DECLARE @x XML
SELECT @x = CAST(bulkcolumn AS XML)
FROM OPENROWSET(
BULK 'C:\temp\blogmltest.xml', SINGLE_BLOB
) AS x

;WITH XMLNAMESPACES
(
	default 'http://www.w3.org/2005/Atom'
)
select 
	x.value('id[1]','NVARCHAR(1024)') AS id,
	x.value('published[1]','DATETIME') AS Published,
	x.value('title[1]','NVARCHAR(MAX)') AS Title,
	x.value('content[1]','NVARCHAR(MAX)') AS content,
	x.value(
		'(link[@rel="alternate"]/@href)[1]',
		'NVARCHAR(MAX)') AS Link,
	x.value('(category[@scheme="http://www.blogger.com/atom/ns#"]
		[1]/@term)[1]','NVARCHAR(1024)') AS Cat1,
	x.value('(category[@scheme="http://www.blogger.com/atom/ns#"]
		[2]/@term)[1]','NVARCHAR(1024)') AS Cat2,
	x.value('(category[@scheme="http://www.blogger.com/atom/ns#"]
		[3]/@term)[1]','NVARCHAR(1024)') AS Cat3,
	x.value('(category[@scheme="http://www.blogger.com/atom/ns#"]
		[4]/@term)[1]','NVARCHAR(1024)') AS Cat4,
	x.value('(category[@scheme="http://www.blogger.com/atom/ns#"]
		[5]/@term)[1]','NVARCHAR(1024)') AS Cat5
FROM @x.nodes('/feed/entry') a(x)
WHERE 
	RIGHT(
		x.value(
			'(category[@scheme="http://schemas.google.com/g/2005#kind"]/@term)[1]',
			'NVARCHAR(1024)'),
		5
	) = '#post'

/*
id        Published    Title              content              Link               Cat1  Cat2  
--------- ------------ ------------------ -------------------- ------------------ ----- ----- 
tag:blo.. Dec 29 201.. This is the seco.. This is another te.. http://test.blog.. XQu.. XML   
tag:blo.. Dec 29 201.. This is my first.. This is a test blo.. http://test.blog.. XQu.. Blo.. 
*/

Note that I am assuming that there are only 5 categories/labels in each post. This could be very much wrong. In a real-world implementation, you may need a different method to parse each category. For example, something like the following:

DECLARE @x XML
SELECT @x = CAST(bulkcolumn AS XML)
FROM OPENROWSET(
BULK 'C:\temp\blogmltest.xml', SINGLE_BLOB
) AS x

;WITH XMLNAMESPACES
(
	default 'http://www.w3.org/2005/Atom'
)
select 
	x.value('id[1]','NVARCHAR(100)') AS PostID,
	y.value('@term','NVARCHAR(100)') AS Category
FROM @x.nodes('/feed/entry') a(x)
CROSS APPLY 
	x.nodes('(category[@scheme="http://www.blogger.com/atom/ns#"])') b(y)
WHERE 
	RIGHT(
		x.value(
			'(category[@scheme="http://schemas.google.com/g/2005#kind"]/@term)[1]',
			'NVARCHAR(1024)'),
		5
	) = '#post'

/*
PostID                                         Category
---------------------------------------------- -----------
tag:blogger.com,1999:post-3262230131972770257  XQuery
tag:blogger.com,1999:post-3262230131972770257  XML
tag:blogger.com,1999:post-5774636022461277037  XQueryLabs
tag:blogger.com,1999:post-5774636022461277037  BloggerTest
*/

Note that the XML file will contain different type of content, such as comments, posts, settings, templates etc. It is important to apply the filter that you see in the WHERE clause above, to ensure that we are reading only the blog posts.

Note that I am not an expert of blogger file format. The information shared above is based on my recent experience reading a file. The approach mentioned above may not be the best you might want to do in your environment. It is also possible that the file format may change over period of time.

Happy holidays!

View All Labs: XQuery Labs - A Collection of XQuery Sample Scripts and Tutorials

Tags: 


Jacob Sebastian
1 · 100% · 32235
4
 
0
Lifesaver
 
0
Refreshed
 
0
Learned
 
0
Incorrect



Submit

1  Comments  

  • Excellant

    commented on Jan 3 2013 1:20AM
    Madhivanan
    2 · 40% · 13039

Your Comment


Sign Up or Login to post a comment.

"XQuery Lab 67 - Reading blog data exported from blogger.com in TSQL using XQuery" rated 5 out of 5 by 4 readers
XQuery Lab 67 - Reading blog data exported from blogger.com in TSQL using XQuery , 5.0 out of 5 based on 4 ratings
    Copyright © Rivera Informatic Private Ltd Contact us      Privacy Policy      Terms of use      Report Abuse      Advertising      [ZULU1097]