Wiktionaries

Sometimes it feels like acronyms are the only words we use.

The government is awash with acronyms. New acronyms are created daily. Acronyms create a barrier to understanding if they cannot be easily resolved, where easy = universal and universal = URLs. There are many online dictionaries with entries that are found in Web searches. However, most of these return results only in highly formatted, not-well-formed HTML that is not always accessible through simple URLs. Furthermore, these dictionaries provide no way for the community to create and share new entries as they are needed. A simple solution to this is:

  • Use the cloud to store terms and definitions
  • Use Web services to return definitions through URLs as XML, JSON, and XHTML
  • Provide a simple form that lets users add and edit terms
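To make the second bullet concrete, here is a minimal sketch of how one stored entry could be serialized as JSON and XML using only the Python standard library. The field names (`term`, `definition`), the `<entry>` root element, and the function names are assumptions made for illustration; they are not the schema or the code that our proof of concept actually uses.

```python
import json
import xml.etree.ElementTree as ET


def entry_to_json(entry: dict) -> str:
    """Serialize an entry as JSON with keys in a stable (sorted) order."""
    return json.dumps(entry, sort_keys=True)


def entry_to_xml(entry: dict) -> str:
    """Serialize an entry as XML: one child element per field.

    Assumes every key is a valid XML element name (hypothetical schema).
    """
    root = ET.Element("entry")  # hypothetical root element name
    for key in sorted(entry):
        child = ET.SubElement(root, key)
        child.text = str(entry[key])
    return ET.tostring(root, encoding="unicode")


# Hypothetical entry; the field names are illustrative only.
entry = {"term": "XML", "definition": "Extensible Markup Language"}

if __name__ == "__main__":
    print(entry_to_json(entry))
    print(entry_to_xml(entry))
```

The point of the sketch is that the same stored record can feed every output format, so adding a format means adding a serializer, not a second data store.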

We have created a proof of concept here using XAMPP and Amazon’s S3 Web service:

http://dev.os.bridgeborn.com/wiktionary/

It’s not perfect or even complete. For example, the XHTML returns 37 errors and 19 warnings from the W3C validator, so it’s really not even XHTML. But I think this is a solid start, and I think it can get to where we want without much more effort or resources. Searching/sorting by domain is a must-have feature that I’m chomping at the bit to have implemented!

Creating new opportunities, not re-inventing wheels

A couple of existing services already support some of the functionality we seek. acronyms.thefreedictionary.com provides some of the things we like to see from a service: references/citations, categories, a direct URL to each entry, notes, and open editing (you don’t even have to authenticate). Wiktionary.org is another useful service that supports citations, direct URLs, discussion, and collaborative updates. Wiktionary also offers an API, which is a good practice when done right.

These services fall short of our expectations, however. thefreedictionary.com provides no API, and its direct links to definition pages are full of ads, forms, and myriad other distractions. Wiktionary uses Wikimedia’s typical format for entries, which cuts out the distractions and focuses on the unit of knowledge. But direct links to terms through the API are muddled with parameters and point to documents written mostly in the wiki’s specialized markup, not XML, XHTML, or JSON.

Contrast this with our wiktionary implementation, where pointing to XHTML, XML, and JSON is mostly a matter of changing the file extension of the result:

http://dev.os.bridgeborn.com/wiktionary/viewentry.php?term=XML
http://dev.os.bridgeborn.com/wiktionary/terms/XML.xml
http://dev.os.bridgeborn.com/wiktionary/terms/XML.json

(That first one needs to be: http://dev.os.bridgeborn.com/wiktionary/terms/XML.xhtml and it will be nice if we can simplify the path by removing “terms” from it.)
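As a rough sketch of what that uniform scheme would mean for a client, the snippet below builds the URL for a term in any of the three formats. It assumes the pattern described above (a `terms/` path plus the term and a format extension), including the `.xhtml` form that is not in place yet. The helper name `term_url` and the percent-encoding of the term are our own assumptions for illustration, not part of the service.

```python
from urllib.parse import quote

# Base path taken from the example URLs above.
BASE_URL = "http://dev.os.bridgeborn.com/wiktionary/terms/"

# Formats named in this post; ".xhtml" is the intended, not yet live, form.
FORMATS = ("xhtml", "xml", "json")


def term_url(term: str, fmt: str = "xml") -> str:
    """Return the URL for a term in the requested format (hypothetical helper)."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    # Percent-encode the term so characters such as '&' or '/' stay in one path segment.
    return f"{BASE_URL}{quote(term, safe='')}.{fmt}"


if __name__ == "__main__":
    for fmt in FORMATS:
        print(term_url("XML", fmt))
```

Because the format lives in the extension, a client can switch representations without learning any query parameters.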

So, each of these services still leaves something to be desired. To be clear, we are not claiming that our service is better than “theirs.” But, bias notwithstanding, if I were to choose a starting point I would definitely choose our service over the others.

Feel free to use our wiktionary (no warranty!). It uses OpenID for login, so if you have an account with a supported OpenID provider you’re good to go.

Eventually we’ll make it possible for you to create your own instance and/or get the source code, but we have some other priorities right now and just wanted to share what we have.

Credits: I created the wireframes and drove the requirements. Our 2009 RIT Co-op student, Ian Wittenberg, wrote the code.

About Kevin Curry

Kevin is the Chief Scientist and a Co-founder of Bridgeborn. He is a graduate of Virginia Tech with a B.A. in History ('92) and M.S. in Computer Science and Applications ('99). He volunteers as a community organizer for Clean the Bay Day and board member for King's Grant Community League. Kevin is a native of Virginia Beach, Virginia, where he lives with his wife, daughter, and two pugs.