Get the Wikipedia database!

Wikipedia is possibly the world’s largest encyclopedia, with over 2 million articles. I sometimes shudder at the thought that if Wikipedia were to shut down someday, for whatever reason, the vast amount of data we have now could be lost. Surely this sounds like a doomsday scenario, one we assume will never happen. I am a born collector: I love to collect anything that is dear to my heart and valuable. So today I was wondering, can I keep a backup copy of Wikipedia? And if so, how BIG would the database be?

As it turned out, after some googling and re-googling, there is more than one way to do this, and the size of Wikipedia’s database (all articles, templates, image descriptions, and primary meta-pages, but no images) is 3.2 GB compressed (archive as of Jan 2008). Below are the various methods to get the database.

  1. Data dumps (link). These are raw data dumps of Wikipedia, offered in several kinds of downloads. You can download the wiki in any particular language, or a particular subset of the database (e.g. only the titles of articles or only the abstracts of articles). This is the link to the English Wikipedia’s Jan 2008 backup (3.2 GB download size).
  2. Wikipedia on DVD (download link): only 422 MB uncompressed. This is the most convenient way to download Wikipedia, although this DVD version has only selected articles and is quite outdated. I would sincerely urge Wikipedia to keep it updated. The DVD comes with a nice GUI provided by the Kiwix software.
  3. Miscellaneous downloads (link). For example, the Commons Picture of the Year archive (link) and MediaWiki, the software that Wikipedia.org itself runs on (link).
  4. To download all the pictures on Wikipedia, read here (to download all the image torrents, click here). Note that many of these pictures could be copyrighted, so you assume all liability for the use of any images. The download size could be as big as hundreds of GBs. The best approach is to download only a subset of them. If you want to download only the pictures that are referenced by the XML file you downloaded from here, then use Wikix.
  5. You can also download all the XML files for the English Wikipedia from here.
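
Once you have one of the XML dumps, you probably do not want to unpack it fully just to peek inside. Here is a minimal Python sketch that streams through a bz2-compressed dump and lists the page titles. It assumes the dump is a bz2-compressed MediaWiki XML export with `<page>` and `<title>` elements; the function name `iter_page_titles` and the file name in the usage comment are made up for illustration, so adjust them to whatever you actually downloaded.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_page_titles(dump_path):
    """Yield page titles from a bz2-compressed XML dump, one at a time.

    Assumption: the dump uses <page> elements that contain a <title> child.
    """
    # bz2.open decompresses on the fly, so the full XML never hits the disk.
    with bz2.open(dump_path, "rb") as stream:
        # iterparse reads the file incrementally instead of loading it whole.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if local_name(elem.tag) != "page":
                continue
            for child in elem:
                if local_name(child.tag) == "title":
                    yield child.text
                    break
            # Drop the contents of the page we just handled to save memory.
            elem.clear()


# Hypothetical usage (the file name is a placeholder, not a real dump name):
# for title in iter_page_titles("my-wikipedia-dump.xml.bz2"):
#     print(title)
```

This only reads titles, but the same loop could pull out the article text as well. For a multi-GB dump it will still take a while, since every byte has to be decompressed and parsed.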

So, happy life and stay informed.
