The Gist
We have hundreds of thousands of verbatim locality strings associated with mollusk specimen records. The following are some actual strings from our database:
"Coosa River, Riverside, Alabama."
"Mazatlan."
"Iburai, East Cape, Eastern Div., Papua."
For querying purposes, we would like these data to be more standardized, and we would also like the records to be georeferenced. This could be done by looking up each place in an atlas and then entering the appropriate data in the necessary fields, but that would be very time consuming, to say the least. Instead, we have opted to take advantage of the fact that fully parsed, hierarchically arranged, and georeferenced locality databases already exist. Our goal is to associate those parsed and georeferenced locality records with our verbatim strings, and to do so as painlessly as possible.
Databases already exist that contain normalized locality and georeferencing data. For example, NIMA maintains the GeoNet database for areas outside of the United States, and the U.S. Geological Survey provides a catalogue of place names for places within the United States. These databases are freely available to download.
Rather than parsing and georeferencing our data ourselves, we are automating the process of linking our verbatim strings to previously processed locality records. We can query our database with the place names in the GeoNet and USGS databases to find matches. These matches can then be quickly evaluated for precision, and voilà! Our verbatim strings are georeferenced.
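The linking step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not our actual process: the in-memory `GAZETTEER` rows, their field names, and the `tokenize` and `match_locality` helpers are all hypothetical, the coordinates are approximate, and the real GeoNet and USGS files have their own layouts.

```python
import re

# Hypothetical stand-in for a few gazetteer rows. The field names are
# assumptions and the coordinates are approximate, for illustration only.
GAZETTEER = [
    {"name": "Riverside", "admin": "Alabama", "country": "United States",
     "lat": 33.61, "lon": -86.20},
    {"name": "Mazatlan", "admin": "Sinaloa", "country": "Mexico",
     "lat": 23.24, "lon": -106.41},
]


def tokenize(text):
    """Split a verbatim string on commas/semicolons into lowercase parts."""
    cleaned = (
        re.sub(r"[^a-z0-9 ]", "", part.lower()).strip()
        for part in re.split(r"[,;]", text)
    )
    return [part for part in cleaned if part]


def match_locality(verbatim, gazetteer=GAZETTEER):
    """Return gazetteer rows whose place name is one part of the string.

    Rows whose admin region also appears in the string are listed first,
    so a reviewer sees the most specific candidates at the top.
    """
    parts = set(tokenize(verbatim))
    hits = [row for row in gazetteer if row["name"].lower() in parts]
    return sorted(hits, key=lambda row: row["admin"].lower() not in parts)


for verbatim in [
    "Coosa River, Riverside, Alabama.",
    "Mazatlan.",
    "Iburai, East Cape, Eastern Div., Papua.",
]:
    candidates = match_locality(verbatim)
    print(verbatim, "->", [(c["name"], c["lat"], c["lon"]) for c in candidates])
```

In this sketch, a string with no candidates (the Papua example, since the toy gazetteer has no matching row) simply returns an empty list and would be left for a later pass or manual review. Exact matching on comma-separated parts is a deliberate simplification; fuzzy matching or country-level filtering could be layered on top.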
Goals and Progress
Our goal is to make our georeferenced collection data available on-line. The following is a list of our milestones for this project. Those in gray are planned or underway; those in black have been done. Visit the Progress Page to learn more.
- Assign all (i.e., >95%) collection records with locality information to a country.
- Assign all collection records with taxon information to a genus and family.
- Capture in the database the ANSP records that were previously georeferenced manually.
  - Pre-computerization records.
  - Computerized records.
  - Expedition Charts.
- Complete "First Passes" over the data to assign rough georeferencing coordinates.
  - Africa.
  - South America.
  - Australia, New Zealand, New Guinea, and the Solomons.
  - Oceania.
  - Southeast Asia and Indonesia.
  - Central America and the Caribbean.
  - Europe and Asia Minor.
  - Asia.
  - Pennsylvania.
  - The rest of North America.
  - Oceanic Localities.
- Refine the data and complete "Second Passes" to improve georeferencing coordinates and determine degree of precision.
  - Africa.
  - South America.
  - Australia, New Zealand, New Guinea, and the Solomons.
  - Oceania.
  - Southeast Asia and Indonesia.
  - Central America and the Caribbean.
  - Europe and Asia Minor.
  - Asia.
  - North America.
  - Oceanic Localities.
- Compare automated georeferencing with manual georeferencing.
- Serve our complete dataset to the world via the Internet.