Lots to comment on here:
Jools wrote:The recent spate of incorrect occurrences have been due to errors in typography used in published info. As new errors appear we manual fix them and try to program around their happening again.
Thank you, Jools, for mentioning this. I meant to state it in my first post and forgot to do so. It is relevant because what prompted these corrective forum posts was a small number of errors in the database rather than ambiguities in the organization or structure of the database. Therefore...
MatsP wrote:I don't know what to add here. The data is entered largely by me (not counting the "type locality" markers). It is my fault if it's wrong, and usually it's because I have no "better knowledge".... Sorry if this comes across as defensive, but I have spent a lot of hours on actually creating this, and I don't think many of the database entries (5552 occurrences, ~1200 bodies of water, a few handfuls of countries) are by anyone else.
Mats, nothing personal. Jools was correct in another of his posts: IMHO, this thread and the issues raised are not an indictment of past work, but more a discussion about how the locality/occurrence database can be enhanced in the future.
Jools wrote:A marker with a star in it is a specific location usually a type locality.
A marker without a star in it is the middle point of a body of water.
Thanks again. That is about what I expected, although I didn't know the markers without stars were "midpoints" along bodies of water. When I've created new entries in the database (and the number is starting to add up), I've been placing the markers, i.e. selecting representative Lat/Long coordinates, in one of two ways. If the source describes the place (e.g., a paper states that fish were collected in streams of River X located a certain number of km east of city ABC), I look at maps and select the Lat/Long coordinates that are as close as I can discern to the described place. If the locality is simply "River X near city ABC," I select Lat/Long coordinates near the city. When exact Lat/Long data aren't provided by the original source, I've been using this website (http://www.mapcoordinates.net) to extract the coordinates that correspond to the locality I'm looking up on the map. I'll be more careful in the future to pick Lat/Long coordinates near the midpoint of water bodies when I have nothing more to go on. "My bad."
MatsP wrote:My aim has been that "every fish should appear on the map", rather than having two fish out of a genus of thirty being marked. Even if the markers aren't exactly where the fish is being found.
MatsP wrote:If there is a (simple) way to import data from elsewhere to display where the fish comes from, then that's great. Even if it's not complete. But we still need a way to do the "find fish from the same place" or "search for species from X", which I don't believe the GBIF will give [directly at least]
Both of the points Mats makes are very important to me, and both are somewhat key to my ideas of how we (ha-ha; I mean you, since I can't write code. Sorry) can enhance the admin functions. So here are some ideas/thoughts:
Currently, each species can have up to five distribution/occurrences/locality elements on its CLOG page:
- Type locality (if available) will appear textually at the top of the CLOG page
- The type locality will also appear as a starred pin on the interactive map IF the type locality data includes Lat/Long coordinates
- The Distribution field can include a textual description of the species' general localities; this may or may not reiterate the type locality, and more generally it can include additional textual descriptions of where the species is found. This field does not have any representation on the interactive map.
- The interactive map will show pins for all occurrences recorded manually by Mats and whoever else is able to add this info as admins; these pins are in addition to the type locality, which may or may not be pinned (see the second item above).
- For every pinned occurrence on the map (with its own lat/long coordinates), a textual description (a hierarchical tree) will appear in the Distribution field above the map. IMPORTANTLY, it is this textual record which people can click on to "find fish from the same place" or "search for species from X."
Okay, so given all this, here we go:
MatsP wrote:To have a more precise locality we'd have to ... split the current bodies of water into even smaller sections.
Mats, you mentioned the idea of subdividing bodies of water. I think this is a good idea when a body of water can be subdivided into smaller segments, each with its own distinct textual name. But I would not think this is a good idea if it means taking a single river (e.g., Amazon River) and dividing it up every 2-3 km along the waterway if all of the segments are still known by the single name of the waterway. Why is this problematic? If the waterway was originally entered with a specific Lat/Long coordinate corresponding to a specific collection locality for one species, then that pin is more important than just "midway" along the river; it is the actual location of the fish. Now imagine that tomorrow I want to add an occurrence for a different species along the same waterway, but this fish is not from the same Lat/Long coordinates. I am faced with two choices: Either
- use the preexisting entry for waterway and accept a pin in an incorrect location, EVEN THOUGH I have at my disposal a more correct set of Lat/Long coordinates to apply, or
- create a new entry in the occurrence database, with the same waterway name but with new Lat/Long coordinates. That might work for me today, but the next time somebody comes along and wants to add an occurrence for some other (third) species of fish from the same waterway, they have to figure out which of these two entries they should use, or they will need to create a third entry if they have new Lat/Long coordinates.
IMO, having potentially multiple Lat/Long entries with the same "name" will be confusing to admins and users in general. So this is my suggestion or question:
Can we add an additional admin function related to occurrences which allows the following? Admins could add new entries of Lat/Long coordinates and NOT provide a specific name for the entry; in lieu of a name for the site, we select the corresponding (preexisting) waterway as a "parent." When a user views the CLOG page for that species, the textual Distribution field will list only the "parent" waterway, but the map will show the pin at the new Lat/Long location. This way, the map point is accurate for that species, the user can still search for fish nearby in the parent waterway, and multiple species can be located at different positions along the same waterway without proliferating occurrence entries with the same waterway name in the database. Of course, the drawback would be that multiple species from the same waterway, but not necessarily from the exact same area of the waterway, will be listed together during searches; but this weakness has been discussed before (Re: Enhancement idea: Create a page that inventories images/videos of locations and habitats) and I think the general consensus in the past has been that something was better than nothing (much like the issue at hand in this thread!)
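For whoever does the coding, the "parent waterway" idea might look roughly like this in Python. This is only a sketch of the proposal, not the site's actual schema: the `Waterway` and `Occurrence` classes, their fields, the `species_in_waterway` helper, and the coordinates are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Waterway:
    """A named body of water; its own pin is the midpoint (hypothetical model)."""
    name: str
    lat: float
    lon: float


@dataclass
class Occurrence:
    """One species record. `name` stays None for an unnamed site on a parent waterway."""
    species: str
    parent: Waterway
    lat: float   # where the pin goes on the map
    lon: float
    name: Optional[str] = None

    def display_name(self) -> str:
        # The Distribution text shows the site's own name if it has one,
        # otherwise it falls back to the parent waterway's name.
        return self.name or self.parent.name


def species_in_waterway(occurrences, waterway_name):
    """Sorted species whose occurrence hangs off the given parent waterway."""
    return sorted({o.species for o in occurrences if o.parent.name == waterway_name})


# Placeholder coordinates, not real localities.
river_x = Waterway("River X", 0.0, 0.0)
records = [
    Occurrence("Species A", river_x, 0.5, 0.5),
    Occurrence("Species B", river_x, -0.5, 1.5),
]

for r in records:
    print(r.species, "->", r.display_name(), (r.lat, r.lon))
print(species_in_waterway(records, "River X"))
```

Both records display as "River X" and both turn up in a search on "River X", yet each keeps its own pin, and no second "River X" entry is created.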
Now, this suggestion only applies when there is no better name for the particular subsection of the waterway involved in the posts. Obviously, when the new section can be named, then we just make a new entry, as is currently the system here. Mats, you also mentioned we can even add in small creeks. Well, that's what I did for
. It was collected from several small creeks along the Rio Blanco, which is near the end of the Rio Negro basin watershed (oddly, Google Maps calls this waterway "Blanco" but other web maps (e.g., mapquest.com) consider this same segment part of the larger Negro; but that's a whole different issue. And since this site uses Google Maps, I used Blanco instead of Negro as I set up the parent waterway entry in the database).
I named that entry after one of the creeks along the Blanco, but there were others. So is there a better solution here? There is the "Description" field beside every occurrence. For the creeks, I wrote "Rio Negro basin watershed creeks" in the Description field. Admins are able to add this additional information to occurrence entries, but none of it appears on the CLOG page.
What if, for situations where we have unique Lat/Long coordinates along already named waterways, we could display this textual "Description" field on the end of the nesting, after the relevant parent field, even if this descriptive text wasn't a link to "find other fish here"? Or would you be willing to accept new occurrence entries with narrative names, rather than just proper names (e.g., typing "Rio Negro basin watershed creeks" in the "Name" field instead of in the "Description" field)?
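The first option could be sketched like this in Python: the parent names stay clickable and the Description is tacked on the end as plain text. Everything here is an assumption for illustration (the `distribution_line` and `render` functions, the " > " separator, and the square brackets standing in for links); it is not how the CLOG page is actually built.

```python
def distribution_line(parents, description=None):
    """Build the nested Distribution text for one occurrence.

    `parents` runs from broadest to narrowest; each becomes a clickable entry.
    `description`, if given, is appended as plain text with no link.
    """
    parts = [{"text": p, "link": True} for p in parents]
    if description and description.strip():
        parts.append({"text": description.strip(), "link": False})
    return parts


def render(parts, sep=" > "):
    # Brackets mark the linked parts so the difference shows up in plain text.
    return sep.join(f"[{p['text']}]" if p["link"] else p["text"] for p in parts)


line = distribution_line(["Rio Blanco"], "Rio Negro basin watershed creeks")
print(render(line))
```

Only "Rio Blanco" would lead to "find other fish here"; the description just tells the reader more precisely where along the waterway the pin sits.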
I know that's a lot to chew on. Thanks for your indulgence.
Cheers, Eric
P.S. This suggestion is in addition to Racoll's and Jools' talk of overlaying GBIF data points on the maps. I think that would be marvelous.