Sunday, April 5, 2009

How to cite twitter, how to cite tweets, how to archive tweets

As twitter and microblogging gain momentum as a social phenomenon, a number of researchers have started to wonder how to cite tweets (for example here) and how to cite a whole thread (a series of tweets).
A related issue is how to digitally preserve and archive tweets. For example, twitter search currently only reaches back a few months; older tweets are not retrievable. And while it is difficult to imagine that twitter won't be around for a while, it is not certain that the site will still exist in 5, 10 or 20 years, which would make it impossible for future scholars to access the same information the author accessed.
A third related issue is dynamically changing content on twitter. For example, it doesn't make much sense to cite a search URL like http://search.twitter.com/search?q=twitter, because obviously the content is changing all the time.

For all these reasons, the WebCite tool comes in handy (http://www.webcitation.org) [1]. WebCite, which is endorsed by hundreds of scholarly journals, is a member of the International Internet Preservation Consortium (other members include, for example, the Internet Archive/Wayback Machine) and works with libraries to make digital material that is important to scholarship (including cited webpages, websites, online datasets etc) permanently accessible and "citable".

Here is how I use the WebCite tool to cite and archive tweets:

1. If I want to search all tweets by a given user, or tweets matching a hashtag or keyword, I use the search interface at http://search.twitter.com/ (or advanced: http://search.twitter.com/advanced) to generate a query searching for the username, a given hashtag etc, for example http://search.twitter.com/search?q=eysenbach


If I want to cite a specific tweet, I simply enter the entire tweet into the search interface, for example "http://search.twitter.com/search?q=Wondering+about+copyright+and+twitter.+Who+owns+intellectual+property+%2F+ideas+posted+on+twitter%3F+".
This is a workaround, as archiving the direct URL of the post (http://twitter.com/eysenbach/statuses/1457158115) currently seems to fail (http://www.webcitation.org/5foQZ3stR) (WebCite is working on this).


2. Copy and paste the search URL (http://search.twitter.com/search?q=eysenbach) into the archive form of WebCite under "URL to Archive [url]:" and enter your email address under "Your (citing author) E-mail Address [email]:", so that WebCite can email you a success/failure notice.



If you use WebCite regularly to cite other webpages etc., add the "WebCite this" bookmarklet to your browser. You can then archive any URL by just clicking the bookmarklet on your browser without having to navigate to the archive form of WebCite.


3. You're done! Cite the tweet or tweet thread as follows:


Eysenbach G (03-04-2009). wondering about how to archive my tweets (and friends' tweets) locally - any solutions out there? Retrieved from twitter.com, archived at http://www.webcitation.org/5foXLx2sm

or

[Multiple authors]. How to cite tweets. Search result retrieved on 2009-04-05 12:08pm from http://search.twitter.com/search?q=how%20to%20cite%20tweets, archived at http://www.webcitation.org/5foMuVHgy
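
Steps 1 and 2 above can also be sketched in a few lines of Python. This is only an illustration, not an official WebCite client: the function names are made up for this example, and it assumes that the archive form at http://www.webcitation.org/archive accepts its two fields (url and email, the names shown on the form) as query parameters. Nothing is sent over the network; the code only builds the URLs.

```python
from urllib.parse import urlencode

SEARCH_BASE = "http://search.twitter.com/search"
# Assumption: the archive form takes its fields as GET parameters here.
WEBCITE_ARCHIVE = "http://www.webcitation.org/archive"


def search_url(query):
    """Step 1: turn a username, hashtag, keyword or full tweet text
    into a twitter search URL (spaces become '+', '/' becomes '%2F')."""
    return SEARCH_BASE + "?" + urlencode({"q": query})


def archive_request_url(url_to_archive, email):
    """Step 2: build the request that corresponds to filling in the
    "URL to Archive [url]" and "E-mail Address [email]" form fields."""
    return WEBCITE_ARCHIVE + "?" + urlencode(
        {"url": url_to_archive, "email": email}
    )


if __name__ == "__main__":
    url = search_url("eysenbach")
    print(url)  # http://search.twitter.com/search?q=eysenbach
    # author@example.org is a placeholder address.
    print(archive_request_url(url, "author@example.org"))
```

Encoding the whole search URL as a single parameter value keeps its own "?" and "=" characters from being confused with those of the archive request.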

As a side note, forget the APA or NLM styles [2] for citing blogs and websites. These citation styles leave out the most important aspect of citing a webpage or blog (which can change every minute or, in the case of twitter, every second): archiving it and citing a permanent, archived snapshot. This matters at least if the intention is that the reader sees the same content the author saw when citing the tweet or series of tweets. In addition to the original URL, always cite the WebCite URL, which links to a stable snapshot of the cited page.

For example, a URL like http://search.twitter.com/search?q=twitter shows a different result every second. Only after "freezing" and archiving the result (http://www.webcitation.org/5foXx28BB) can, and should, the URL be cited.
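
Once a WebCite snapshot exists, the two citation formats shown earlier can be assembled mechanically. Here is a small Python sketch; the function and parameter names are hypothetical and chosen for this example only, and the wording simply mirrors my two sample citations rather than any official style.

```python
def cite_tweet(author, date, text, webcite_url):
    """Citation for a single tweet: author, date, tweet text,
    then the archived WebCite snapshot."""
    return (
        f"{author} ({date}). {text} "
        f"Retrieved from twitter.com, archived at {webcite_url}"
    )


def cite_search(title, retrieved, search_url, webcite_url):
    """Citation for a search result (thread or hashtag): the original,
    changing URL is always paired with the stable WebCite URL."""
    return (
        f"[Multiple authors]. {title}. "
        f"Search result retrieved on {retrieved} from {search_url}, "
        f"archived at {webcite_url}"
    )


if __name__ == "__main__":
    print(cite_search(
        "How to cite tweets",
        "2009-04-05 12:08pm",
        "http://search.twitter.com/search?q=how%20to%20cite%20tweets",
        "http://www.webcitation.org/5foMuVHgy",
    ))
```

Making the WebCite URL a required argument is deliberate: a citation without an archived snapshot cannot be produced by accident.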

Limitations
Currently, the WebCite team is working on a few fixes to make archiving and citing of tweets easier.

One current limitation is that the "show conversation" links in search results do not work if the search results are archived by WebCite (see e.g. http://www.webcitation.org/5foMuVHgy), presumably because JavaScript is used to retrieve that additional information.

Secondly, archiving the direct URL of a microblog entry (http://twitter.com/eysenbach/statuses/1457158115) or webciting the twitter homepage of a user works insofar as WebCite creates an internal copy of that page, but the copy doesn't display very well in the WebCite frame: it appears for a few seconds and then disappears (see e.g. http://www.webcitation.org/5foQZ3stR or http://www.webcitation.org/5foR9anu2). Any hints on why this is (my guess is some JavaScript magic on these pages) and how it can be fixed are welcome. Thus, use the workaround of using the twitter search interface to archive tweets from a specific user or a specific hashtag, as described above.

Thirdly, the twitter search interface currently displays a maximum of 100 microblog entries (tweets) per page, so one WebCite snapshot has to be taken per search results page.

Fourthly, there is an urgent need for a tool allowing researchers to prospectively monitor and archive feeds from twitter, which is also something WebCite is working on (there are relations to the Infovigil [3] project, which allows advanced analytics such as trend analysis, geographical coding, and links to polls).
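
Regarding the third limitation (one snapshot per results page of at most 100 tweets), the list of URLs to archive can be generated up front. The Python sketch below rests on an assumption that should be checked against the live site before relying on it: that the search interface accepts rpp (results per page) and page as query parameters. The function name is hypothetical.

```python
from urllib.parse import urlencode

SEARCH_BASE = "http://search.twitter.com/search"


def result_page_urls(query, pages, per_page=100):
    """Return one search URL per results page, each of which would
    need its own WebCite snapshot.

    Assumption: 'rpp' and 'page' are accepted as query parameters.
    """
    return [
        SEARCH_BASE + "?" + urlencode(
            {"q": query, "rpp": per_page, "page": number}
        )
        for number in range(1, pages + 1)
    ]


if __name__ == "__main__":
    for url in result_page_urls("eysenbach", pages=3):
        print(url)
```

Each URL in the list would then be submitted to the WebCite archive form separately and cited with its own snapshot.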

Copyright
OK, this question always comes up. Using and archiving webpages should be covered under fair use clauses if the intent is scholarly communication. Twitter itself raises some interesting copyright issues, including the question of whether tweets meet the standard for copyrightability.

------
References
1. Eysenbach G, Trudel M. Going, Going, Still There: Using the WebCite Service to Permanently Archive Cited Web Pages. J Med Internet Res 2005;7(5):e60 URL: http://www.jmir.org/2005/5/e60
2. Patrias K. Citing medicine: the NLM style guide for authors, editors, and publishers [Internet]. 2nd ed. Wendling DL, technical editor. Bethesda (MD): National Library of Medicine (US); 2007 [updated 2009 Jan 14; cited Year Month Day]. Available from: http://www.nlm.nih.gov/citingmedicine
3. Eysenbach G. Infodemiology and Infoveillance: Framework for an Emerging Set of Public Health Informatics Methods to Analyze Search, Communication and Publication Behavior on the Internet. J Med Internet Res 2009;11(1):e11 URL: http://www.jmir.org/2009/1/e11

2 comments:

Craig said...

This was very helpful, thank you.

Tom said...

I'm sorry, but why would anyone want to archive their tweets?