Note: Blocking will stop further edits. The bot will intermittently retry on errors for several minutes, but should then shut itself down automatically until restarted manually. Please use a block of ten minutes or longer to be sure of stopping it.

This bot is designed to add standardized machine-readable geodata records to relevant articles in the English-language Wikipedia, using data from GNS, GNIS, OSGB coordinates in UK articles, plaintext geodata scraped from article text, and interwiki-linked geotag data from other-language Wikipedias. -- The Anome 12:13, 22 September 2007 (UTC)
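As an illustration of what a "standardized machine-readable geodata record" might look like, here is a minimal Python sketch that formats a decimal latitude/longitude pair as a `{{coord}}` template call. The function name `format_coord_tag`, the default precision, and the choice of `display=title` are assumptions made for the example; the bot's actual output format is not documented on this page.

```python
def format_coord_tag(lat, lon, precision=4):
    """Return a {{coord}} wikitext tag for a decimal lat/lon pair.

    Hypothetical helper for illustration only; not the bot's real code.
    """
    # Reject values that cannot be valid geographic coordinates.
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    # Fixed-point formatting keeps the stated precision explicit.
    return f"{{{{coord|{lat:.{precision}f}|{lon:.{precision}f}|display=title}}}}"


if __name__ == "__main__":
    print(format_coord_tag(51.5074, -0.1278))
```

Rejecting out-of-range values before writing anything is the cautious choice here, since a bad tag in an article is harder to find later than a skipped one.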

Status

- The Anome (talk) 14:10, 13 September 2008 (UTC)

Update: As of 2009-06-12:

Recent activities

Currently backfilling a number of corner cases missed by earlier over-cautious heuristics, using:

This is very laborious for the bot, as it requires the re-scanning of large numbers of false positives, and will result in only a few hundred articles being geocoded, but machine time is cheap, the re-scans are necessary in any case, and this will lay the foundations for larger systematic efforts to come later.

-- The Anome (talk) 14:30, 14 September 2008 (UTC)

Done. -- The Anome (talk) 04:29, 15 September 2008 (UTC)

Current activity

Finishing adding a large number of {{coord missing}} tags. Almost complete. -- The Anome (talk) 23:50, 13 October 2008 (UTC)

To do

Geotags:

Interwiki:

Consistency and correctness:

Matching:

New data sources:

Infoboxes:

-- The Anome (talk) 13:47, 12 October 2008 (UTC)

Forthcoming attractions

With >70,000 data points, I now have enough data to do a spatial analysis of the category tree, and to generate lists of possibly misclassified or mislocated outliers. The cleaned-up bounding data could then be used as a Bayesian classifier for future work. -- The Anome 10:14, 24 August 2007 (UTC)
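One simple way to produce such an outlier list for a single category is sketched below in Python. It is not the analysis described above: instead of bounding data or a Bayesian classifier, it uses a plain median-distance rule, flagging any article whose distance from the category's median point is far larger than is typical for that category. The names `haversine_km` and `find_outliers`, and the `factor` and `min_points` parameters, are hypothetical.

```python
import math
from statistics import median


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def find_outliers(points, factor=5.0, min_points=4):
    """Return titles of points lying unusually far from the category centre.

    points: list of (title, lat, lon) tuples for one category.
    Assumption: the category does not straddle the antimeridian, since
    a plain median of longitudes is meaningless across +/-180 degrees.
    """
    if len(points) < min_points:
        return []  # too few points to say what is typical
    c_lat = median(p[1] for p in points)
    c_lon = median(p[2] for p in points)
    dists = [(t, haversine_km(la, lo, c_lat, c_lon)) for t, la, lo in points]
    typical = median(d for _, d in dists)
    # Floor the typical distance at 1 km so tight clusters do not flag everything.
    cutoff = factor * max(typical, 1.0)
    return [t for t, d in dists if d > cutoff]


if __name__ == "__main__":
    sample = [
        ("A", 57.0, -4.0),
        ("B", 57.1, -4.1),
        ("C", 56.9, -3.9),
        ("D", 57.05, -4.05),
        ("E", -33.9, 151.2),  # far from the rest of the category
    ]
    print(find_outliers(sample))
```

Medians are used rather than means so that a single badly mislocated article does not drag the centre towards itself and hide its own error.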

Ambiguity problems

Because of severe name ambiguity problems,

-- The Anome (talk) 13:58, 12 October 2008 (UTC)