Copyright © 2004 by Lawrence Ragan Communications, Inc. and reprinted from the May, 2004 issue of FoxTalk with permission of the publisher. For more information on FoxTalk, please visit www.pinnaclepublishing.com.


RSS: Publishing News via XML

By Ted Roche
New RSS news feeds are showing up regularly, as more and more software developers discover the usefulness of the technology in "getting the word out," whether for business or personal use. For example, a programmer blog can provide a great platform for documenting the joys or challenges of the developer's latest project or interesting solution. In this continuation of his RSS article in the April issue, Ted Roche shows how Visual FoxPro developers can easily create RSS feeds using just a few familiar and powerful VFP commands.

In last month's article on RSS, I talked about using RSS as a consumer, reading and writing RSS manually, subscribing to RSS feeds to read in a news reader, and publishing RSS using blogging software. This article will get into generating RSS programmatically with Visual FoxPro. I'll present the basic formats and discuss two ways you can produce RSS with VFP.

Format

RSS has a history similar to many ad-hoc standards, with different groups reaching ascendancy and claiming legitimacy, splintering and leaving behind legacy formats. There really are two formats of concern today, RSS 1.0 and 2.0. (A third contender, Atom–arguably not RSS at all as some of its authors insist–is perhaps just a variant. It's in early development, currently version 0.3, and wasn't considered for this article, but keep an eye on it.)

Despite their similarities, RSS 1.0 and RSS 2.0 are managed by two opposing camps. If you can support only a single format, RSS 2.0 seems to be the simplest and most prevalent. However, RSS 1.0 has a richer grammar and a more clearly defined means of extending the basic structure. Most tools support both formats and, as I'll show, it's not that difficult to produce "common denominator" output in both formats.

So, what is the basic structure? An RSS document (also known as a "feed," since many are news feeds) consists of a header describing the source of the information (the news "channel") and a body with one or more articles, news items, quotes, or whatever you're shipping. In RSS 1.0, the items are siblings of the channel element, and the <items> collection inside the channel element lists a reference to each one. In RSS 2.0, the items themselves are contained within the channel element as sub-elements. The previous article showed an example of RSS 2.0, distinguished by the <rss version="2.0"> version tag near the top. The following code shows a typical RSS 1.0 feed. Note that the entire document is enclosed in <rdf:RDF> and </rdf:RDF> tags; this is what characterizes an RSS 1.0 document.

<?xml version="1.0" encoding="iso-8859-1"?>
<rdf:RDF 
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
xmlns:dc="http://purl.org/dc/elements/1.1/" 
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:admin="http://webns.net/mvcb/" 
xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://www.tedroche.com">
    <title>Ted Roche &amp; Associates, LLC Web Site</title>
    <link>http://www.tedroche.com</link>
    <description>Changes to the web site of Ted Roche &amp; Associates, LLC</description>
    <dc:language>en-us</dc:language>
    <dc:creator>
    </dc:creator>
    <dc:date>2004-01-07T16:55:48</dc:date>
    <admin:generatorAgent rdf:resource="http://msdn.microsoft.com/vfoxpro"/>
    <admin:errorReportsTo rdf:resource="mailto:tedroche@tedroche.com"/>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://www.tedroche.com/ConfGrid#2003"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://www.tedroche.com/ConfGrid#2003">
    <title>New White Papers: VFP and RSS</title>
    <description><![CDATA[The first of my white papers from 2003 conferences...]]></description>
    <link>http://www.tedroche.com/ConfGrid#2003</link>
    <dc:creator>Unknown</dc:creator>
    <dc:date>2004-01-07T16:55:48</dc:date>
  </item>
</rdf:RDF>

Generating RSS

RSS can be generated in many ways. In the previous article, I mentioned Notepad as a crude but effective tool. Here, I'll use VFP in two different ways: with textmerge and with the MSXML COM object. Each technique has its advantages and disadvantages.

The native textmerge technique has the advantages of speed and simplicity of configuration. Textmerge excels in speed: Since everything it uses is a native VFP command, generation is lightning fast. Installation of the RSS-generating VFP application requires nothing but the usual runtime install. In contrast, the MSXML object must be installed and configured on the target machine; I had some difficulties getting this to work on a Web server, and others have reported similar problems. The native VFP technique avoids this. (Note that XMLToCursor uses the MSXML COM object; you'll need to avoid that, too, if you're looking for a "pure" VFP solution.)

However, VFP doesn't have any native functionality to validate or manipulate XML, so nothing stops your code from writing a feed that isn't well formed. One of the best examples is in the preceding code block: The news feed title is "Ted Roche & Associates, LLC Web Site," and the ampersand character needs to be "escaped" into the form &amp; to prevent parsing errors. Likewise, greater-than and less-than signs must be translated to &gt; and &lt;, respectively.

Depending on the character set you choose for your feed, other characters should also be escaped to ensure the proper representation on the destination systems. In addition to handling the character translation, the MSXML COM object performs some rudimentary validation: It will always produce well-formed XML, although it can't tell whether that XML is a valid RSS feed. With native Visual FoxPro code, you'll need to check the XML yourself.
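
One way to do that check, if MSXML happens to be available on the machine where you test, is to ask its parser to load the finished string. The function below is a minimal sketch and isn't part of the Download: IsWellFormed() is a hypothetical name, and the ProgID assumes MSXML 4.0, the version used later in this article. It tests only that the XML is well formed, not that it's a valid RSS feed.

```foxpro
* Hypothetical helper: returns .T. if tcXML parses as well-formed XML.
* Assumes MSXML 4.0 is installed on the machine running the check.
FUNCTION IsWellFormed(tcXML)
LOCAL loDoc, llOK
loDoc = CREATEOBJECT("msxml2.domdocument.4.0")
loDoc.async = .F.
llOK = loDoc.loadXML(tcXML)
IF NOT llOK
  * parseError describes the first problem the parser found
  ? "XML error on line " + TRANSFORM(loDoc.parseError.line) + ;
    ": " + loDoc.parseError.reason
ENDIF
RETURN llOK
ENDFUNC
```

Calling it on the generated string just before writing the file keeps the production machine free of the MSXML dependency, as long as the check runs only where you develop and test.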

To simplify the explanation of techniques in this article, you can assume that the data comes from two cursors: curHead contains the title, description, and link information needed for the channel (header) element, and curItem contains one record for each news item with title, link, and description for that article.

Generating RSS using FoxPro textmerge

FoxPro textmerge can generate XML quickly and efficiently with code like this, which produces an RSS 2.0 feed:

LOCAL lcXML as String, lcContents as String, lcFileName
STORE SPACE(0) TO lcXML, lcContents
#DEFINE CRLF CHR(13)+CHR(10)
lcFileName = "trweb.xml"

* Read header information
SELECT cTitle as Title, ;
  cLink as Link, ;
  mDesc as Description ;
FROM trhead ;
INTO CURSOR curHead

* Read news items
Select TOP 10 cTitle as Title, ;
  tUpdated as pubDate, ;
  mContent as Description, ;
  cLink as link ;
FROM trweb ;
ORDER BY tUpdated descending ;
INTO CURSOR curItem

* Generate the items (the body)
SELECT curItem
SET textmerge TO memvar lcXML additive
SET TEXTMERGE ON noshow
SCAN
  \<item>
  \<title><<HTMLFix(curItem.title)>></title>
  \<description><<HTMLFix(curItem.Description)>></description>
  \<pubDate><<RFC822Date(curItem.pubDate)>></pubDate>
  \<link><<HTMLFix(curItem.Link)>></link>
  \</item>
ENDSCAN
SET TEXTMERGE off
SET TEXTMERGE to

* Generate the heading and channel items and embed 
* the body within
Set Textmerge To Memvar lcContents
Set Textmerge On Noshow
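  * \\ writes the line without a leading CR/LF, so the XML
  * declaration is the very first thing in the output
  \\<?xml version="1.0" ?>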
  \<!--  RSS generated by <<VERSION()>> -->
  \<rss version="2.0">
  \<channel>
  \  <title><<HTMLFix(curHead.Title)>></title>
  \  <link><<HTMLFix(curHead.Link)>></link>
  \  <description><<HTMLFix(curHead.Description)>></description>
  \  <language>en-us</language>
  \  <copyright>Copyright <<YEAR(DATE())>></copyright>
  \  <lastBuildDate><<RFC822Date(DATETIME())>></lastBuildDate>
  \  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
  \  <generator><<VERSION()>></generator>
  \  <ttl>60</ttl>
  \
  \  <<lcXML>>
  \
  \</channel>
  \</rss>
Set Textmerge Off
Set Textmerge To

* Write it out
lcSafety = Set("Safety")
Set Safety Off
Strtofile(lcContents, lcFileName, 0)
Set Safety &lcSafety

FUNCTION HTMLFix(tcString)
* This code ASSUMES the incoming string is ANSI,
* CHR(32) to (127) and has not already had the 
* characters converted - it will make a mess of a 
* string that already has strings like &#151; in it.
LOCAL lcString as string
lcString = STRTRAN(tcString,"&","&amp;")
lcString = STRTRAN(lcString,"<","&lt;")
lcString = STRTRAN(lcString,">","&gt;")
RETURN ALLTRIM(lcString)

Here's what's happening in the code. The curItem cursor has fields named "title," "description," and "link" to match the elements needed for the body of the XML. In the source table, these are named cTitle, mContent, and cLink, but they're renamed with AS clauses in the SQL SELECT statement that generates the cursor. In a few more lines of code, the items list is generated with a scan and textmerge. Then, the main document is generated with textmerge, pulling values from the curHead cursor and hard-coding other values (these could be included within the curHead cursor too, if you need to extend the example). Finally, the file is written out with StrToFile().

The RFC822Date() function is a simple UDF included to convert a supplied FoxPro datetime value into the RFC 822 format required by the RSS 2.0 specification, something like "Thu, 27 Feb 2003 14:11:12 GMT". (The RSS 1.0 specification requires a different format, specified by ISO 8601, in the form YYYY-MM-DDTHH:MM:SSZ.) User-defined functions to perform both conversions are included in the accompanying Download. In the HTMLFix() UDF, strings are trimmed from their fixed-width table lengths, and ampersands, greater-than signs, and less-than signs are replaced with their escaped equivalents.
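
If you don't have the Download at hand, the two conversions can be sketched along the following lines. These are not the UDFs from the Download. ISO8601Date() is a hypothetical name, and both functions assume the datetime passed in is already expressed in GMT, since they do no time zone conversion.

```foxpro
? RFC822Date({^2003-02-27 14:11:12})   && Thu, 27 Feb 2003 14:11:12 GMT
? ISO8601Date({^2004-01-07 16:55:48})  && 2004-01-07T16:55:48Z

* Sketch of an RFC 822 formatter; assumes ttWhen is already GMT
FUNCTION RFC822Date(ttWhen)
LOCAL lcDay, lcMonth
* DOW(..., 1) numbers the days from Sunday = 1, whatever SET FDOW says
lcDay = SUBSTR("SunMonTueWedThuFriSat", (DOW(ttWhen, 1) - 1) * 3 + 1, 3)
lcMonth = SUBSTR("JanFebMarAprMayJunJulAugSepOctNovDec", ;
  (MONTH(ttWhen) - 1) * 3 + 1, 3)
RETURN lcDay + ", " + PADL(DAY(ttWhen), 2, "0") + " " + lcMonth + " " + ;
  TRANSFORM(YEAR(ttWhen)) + " " + PADL(HOUR(ttWhen), 2, "0") + ":" + ;
  PADL(MINUTE(ttWhen), 2, "0") + ":" + PADL(SEC(ttWhen), 2, "0") + " GMT"
ENDFUNC

* Sketch of an ISO 8601 formatter (hypothetical name); same assumption
FUNCTION ISO8601Date(ttWhen)
RETURN TRANSFORM(YEAR(ttWhen)) + "-" + PADL(MONTH(ttWhen), 2, "0") + "-" + ;
  PADL(DAY(ttWhen), 2, "0") + "T" + PADL(HOUR(ttWhen), 2, "0") + ":" + ;
  PADL(MINUTE(ttWhen), 2, "0") + ":" + PADL(SEC(ttWhen), 2, "0") + "Z"
ENDFUNC
```

The day and month names are taken from fixed English strings rather than from CDOW() and CMONTH(), so the result stays in the form RFC 822 expects even if the runtime is localized.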

Generating an RSS feed with Visual FoxPro is fast and efficient, and leaves the programmer in control of every byte that's written to disk. In exchange, the programmer has to ensure that the format is exactly correct and that character conversions are made, if needed.
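
The same textmerge approach can produce the RSS 1.0 layout described earlier, where the items sit beside the channel rather than inside it. What follows is only a sketch of that "common denominator" output, not the code from the Download. It assumes the curHead and curItem cursors and the HTMLFix() UDF from the listing above, plus a hypothetical ISO8601Date() UDF that returns the YYYY-MM-DDTHH:MM:SSZ form.

```foxpro
LOCAL lcSeq, lcItems, lcRDF
STORE SPACE(0) TO lcSeq, lcItems, lcRDF

* Pass 1: one <rdf:li> per item for the channel's <items> list
SELECT curItem
SET TEXTMERGE TO MEMVAR lcSeq ADDITIVE
SET TEXTMERGE ON NOSHOW
SCAN
  \      <rdf:li rdf:resource="<<HTMLFix(curItem.Link)>>"/>
ENDSCAN
SET TEXTMERGE OFF
SET TEXTMERGE TO

* Pass 2: the <item> elements, which are siblings of <channel>
SET TEXTMERGE TO MEMVAR lcItems ADDITIVE
SET TEXTMERGE ON NOSHOW
SCAN
  \<item rdf:about="<<HTMLFix(curItem.Link)>>">
  \  <title><<HTMLFix(curItem.Title)>></title>
  \  <link><<HTMLFix(curItem.Link)>></link>
  \  <description><<HTMLFix(curItem.Description)>></description>
  \  <dc:date><<ISO8601Date(curItem.pubDate)>></dc:date>
  \</item>
ENDSCAN
SET TEXTMERGE OFF
SET TEXTMERGE TO

* Wrap the channel and the items in the <rdf:RDF> root.
* \\ writes the first line without a leading CR/LF.
SET TEXTMERGE TO MEMVAR lcRDF
SET TEXTMERGE ON NOSHOW
  \\<?xml version="1.0" ?>
  \<rdf:RDF
  \  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  \  xmlns:dc="http://purl.org/dc/elements/1.1/"
  \  xmlns="http://purl.org/rss/1.0/">
  \<channel rdf:about="<<HTMLFix(curHead.Link)>>">
  \  <title><<HTMLFix(curHead.Title)>></title>
  \  <link><<HTMLFix(curHead.Link)>></link>
  \  <description><<HTMLFix(curHead.Description)>></description>
  \  <items>
  \    <rdf:Seq><<lcSeq>>
  \    </rdf:Seq>
  \  </items>
  \</channel>
  \<<lcItems>>
  \</rdf:RDF>
SET TEXTMERGE OFF
SET TEXTMERGE TO
* lcRDF now holds the feed; write it out with StrToFile() as before
```

Note that HTMLFix() doesn't escape double quotes, so this sketch also assumes the link values contain none, because they're written into attribute values here.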

Generating RSS using the MSXML COM object

Generating RSS using an XML processor such as the MSXML COM object can be a bit more complex and wordier up front, but it has some advantages. Is using the COM object any easier or "more correct" from an object-oriented standpoint? Probably not. Any method can be refactored into a proper set of objects and methods. An XML parser will know that text values need to be encoded properly to generate valid XML and will either perform the encoding automatically or throw an error that your code will have to handle. But it won't generate invalid XML, as the Visual FoxPro textmerge module certainly could do.

The details of creating an XML document are pretty straightforward: You invoke the MSXML COM object and tell it to create a new document. At that point, building the document you want consists of adding elements and attributes and assigning values to them. If you're new to XML, note that an element is a node enclosed within angle-bracket tags. Each begins with an opening tag such as <item>, and ends with a closing </item> tag. An attribute is a name-value pair within the element tag that describes a property of the element. If I add an xmlns attribute to my item tag, it becomes <item xmlns="...">. This code demonstrates how the process starts: It creates an XML document and adds the first few elements and attributes. The remainder of the code is very repetitious and is available as part of the Download.

loXML = newobject('msxml2.domdocument.4.0')
loXml.async = .f.
* Create the XML root
loNewItem = loXML.CreateProcessingInstruction([xml], ;
           [version='1.0' encoding='iso-8859-1'])
loXML.appendChild(loNewItem)
* Create the RSS root element
loDocument = loXml.createElement('rss') 
loXML.appendChild(loDocument)
* Add the RSS Version number
loAttribute = loXML.createAttribute([version])
loAttribute.value="2.0"
loDocument.attributes.setNamedItem(loAttribute)
* Add the channel element
loElement = loXML.createElement("channel")
loChannel = loDocument.appendChild(loElement)
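
To give a feel for the repetitious part, the rest of the channel might be filled in roughly as follows. This is a sketch under stated assumptions rather than the code from the Download: AddTextElement() is a hypothetical helper, loXML and loChannel come from the listing above, and curHead and curItem are the cursors described earlier.

```foxpro
* Channel-level elements
AddTextElement(loXML, loChannel, "title", curHead.Title)
AddTextElement(loXML, loChannel, "link", curHead.Link)
AddTextElement(loXML, loChannel, "description", curHead.Description)

* One <item> per record, appended inside <channel> as RSS 2.0 expects
SELECT curItem
SCAN
  loItem = loChannel.appendChild(loXML.createElement("item"))
  AddTextElement(loXML, loItem, "title", curItem.Title)
  AddTextElement(loXML, loItem, "link", curItem.Link)
  AddTextElement(loXML, loItem, "description", curItem.Description)
ENDSCAN

* Hypothetical helper: creates <tcName>tcValue</tcName> under toParent
* and returns the new node
FUNCTION AddTextElement(toXML, toParent, tcName, tcValue)
LOCAL loNode
loNode = toXML.createElement(tcName)
* Assigning .text lets the parser escape &, <, and > itself
loNode.text = ALLTRIM(tcValue)
RETURN toParent.appendChild(loNode)
ENDFUNC
```

Because the values go in through the text property, no HTMLFix() call is needed here. The escaping happens when the document is serialized.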

When the RSS data has been completely added to the XML parser, save the data to disk by extracting the string from the loXML.XML property or by calling the internal loXML.Save() method. Errors that occur along the way can be caught with your error handler and processed accordingly (this is a great place to use VFP8's new TRY...CATCH structured error handling). As my use of these functions wasn't really a mission-critical one, I just chose to log the error to an error file and quit the generation process. Your error handling will reflect your needs.
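
A log-and-quit handler of that kind could look something like this. It's a sketch only: It assumes loXML is the finished document, that VFP 8 or later is in use, and both file names are hypothetical.

```foxpro
LOCAL loError as Exception
TRY
  * Save() writes the serialized document to the given path
  loXML.save(FULLPATH("trweb.xml"))
CATCH TO loError
  * Append the problem to a log file and abandon this run
  STRTOFILE(TTOC(DATETIME()) + " " + loError.Message + CHR(13) + CHR(10), ;
    "rsserror.log", 1)
ENDTRY
```

In a real application, the calls that build the document would sit inside the TRY block as well, so that an error raised while adding elements ends up in the same log.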

Conclusion

Visual FoxPro is an excellent tool to generate and process RSS documents. Because RSS is simply XML, and XML is simply text, FoxPro's built-in text manipulation functions can make RSS generation fast and simple. Since Visual FoxPro is also a good host to COM objects, an XML parser driven by VFP is another technique that works well. The choice you make depends on your particular environment and application needs.

References

For much more information on RSS, check out the O'Reilly books Content Syndication with RSS by Ben Hammersley, Essential Blogging by Cory Doctorow et al., and Practical RDF by Shelley Powers. The O'Reilly site (www.oreilly.com) has dozens of articles on RSS, although they tend to favor version 1.0. Dave Winer, on www.scripting.com, is an advocate for version 2.0. The RSS 1.0 and 2.0 specifications are at http://web.resource.org/rss/1.0 and http://blogs.law.harvard.edu/tech/rss, respectively. Finally, when you think you've generated a valid feed, check it against the RSS Feed Validator at www.feedvalidator.org.

Download 405ROCHE.ZIP

Sidebar: Additional Resources

RSS Feeds of Interest:

• Rick Strahl–Several are listed at www.west-wind.com

• FoxForum Wiki–www.tedroche.com/FoxWikiRSS/FoxWikiRSS20.xml

• FoxCentral.net–www.foxcentral.net/foxcentralRssFeed.fc

• Fox KB Updates–www.kbalterz.com/rss/fox.xml

Blogs:

• Garrett Fitzgerald–http://blog.donnael.com

• Andrew MacNeill–http://akselsoft.blogspot.com

• Ted Roche–http://radio.weblogs.com/0117767

• MS Data Team–http://blogs.msdn.com/vsdata