This work is part of the project "Using web-content for the assessment of macroecological patterns in butterfly-hostplant associations at a global scale", one of the 2013 Rubenstein Research Fellowship Awards.
We are interested in developing the most efficient way to extract existing knowledge about interactions from the biological literature and the biological community and make it available in a useful fashion.
We will focus on herbivory, specifically on butterfly-hostplant associations, to illustrate the application of the protocol proposed by our research team.
Although butterflies might not be representative of the entire diversity of herbivorous insects, they are practically the only large taxonomic group of plant-feeding insects for which relatively exhaustive and reliable data on host plant affiliations exist. We think that the methods developed here could be expanded to include other taxonomic groups and interaction types, as long as enough information is available.
We will consider three superfamilies within the Lepidoptera:
- Papilionoidea (true butterflies).
- Hesperioidea (skippers).
- Hedyloidea (moth-like butterflies).
We will use a tentative global checklist assembled by collating data from authoritative checklists and on-line resources.
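As a rough illustration of the collation step, the sketch below merges several name lists into one checklist and records which sources contributed each name. The function names (`normalise`, `collate`), the source labels, and the normalisation rule are assumptions made for this example, not the project's actual procedure, and the sketch does not resolve synonyms or spelling variants.

```python
from collections import defaultdict

def normalise(name):
    """Collapse whitespace and standardise capitalisation to 'Genus epithet'."""
    parts = name.split()
    return " ".join([parts[0].capitalize()] + [p.lower() for p in parts[1:]])

def collate(checklists):
    """Merge {source: [names]} into {name: sorted list of sources}.

    Hypothetical helper: names that differ only in case or spacing are
    treated as the same entry; nothing else is reconciled.
    """
    merged = defaultdict(set)
    for source, names in checklists.items():
        for name in names:
            if name.strip():  # skip empty entries
                merged[normalise(name)].add(source)
    return {name: sorted(sources) for name, sources in sorted(merged.items())}

# Illustrative input only; "list_a" and "list_b" stand in for real sources.
checklist = collate({
    "list_a": ["Papilio machaon", "  pieris  rapae "],
    "list_b": ["Pieris rapae"],
})
```

Keeping the contributing sources for each name makes it easy to flag names that appear in only one list for manual review.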
For each butterfly species in the checklist we will search for information in different sources. Next, we will use regular expression matching to select the text objects or files containing keywords related to host plant associations.
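The keyword selection step can be sketched as follows. The keyword list in `HOSTPLANT_PATTERN` and the function name `select_texts` are assumptions for illustration; the terms actually used by the project are not listed here.

```python
import re

# Hypothetical keyword list; extend or replace it with the terms of interest.
HOSTPLANT_PATTERN = re.compile(
    r"\b(host[\s-]?plants?|food[\s-]?plants?|larval\s+hosts?"
    r"|larvae\s+feed(?:ing)?\s+on)\b",
    re.IGNORECASE,
)

def select_texts(texts):
    """Return the (identifier, text) pairs whose text contains a keyword."""
    return [
        (key, text)
        for key, text in texts.items()
        if HOSTPLANT_PATTERN.search(text)
    ]

# Toy documents standing in for retrieved text objects.
docs = {
    "a": "The larvae feed on Aristolochia species.",
    "b": "Adults fly from May to August.",
    "c": "Recorded hostplant: Citrus sinensis.",
}
selected = select_texts(docs)  # keeps documents "a" and "c"
```

A single compiled pattern keeps the filter fast over many documents, at the cost of missing phrasings that are not in the keyword list.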
The selected text objects/files will be scanned for scientific names. The complete list of names will be evaluated to extract butterfly species and their synonyms, and to extract plant names (for herbivory associations), ant names (for ant associations), etc.
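A minimal sketch of the name scanning and evaluation steps is given below. It assumes a simple regular expression heuristic for Latin binomials and lookup tables for butterfly names and plant genera. The heuristic, the function names, and the toy lookup data are illustrative assumptions; a dedicated name recognition tool may well be used in practice.

```python
import re

# Candidate binomial: capitalised genus followed by a lowercase epithet of
# at least three letters. This heuristic also matches ordinary phrases such
# as "The larvae", so candidates must be checked against reference lists.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+)\s+([a-z]{3,})\b")

def find_candidate_names(text):
    """Return candidate binomials in order of appearance."""
    return [f"{genus} {epithet}" for genus, epithet in BINOMIAL.findall(text)]

def classify(names, butterfly_names, plant_genera):
    """Split candidates into butterflies and plants.

    butterfly_names maps every known name (accepted or synonym) to its
    accepted name; plant_genera is a set of accepted plant genus names.
    Both are hypothetical lookup tables for this sketch.
    """
    butterflies, plants = {}, []
    for name in names:
        genus = name.split()[0]
        if name in butterfly_names:
            butterflies[name] = butterfly_names[name]
        elif genus in plant_genera:
            plants.append(name)
    return butterflies, plants

text = "Larvae of Papilio demoleus feed on Citrus sinensis and Aegle marmelos."
candidates = find_candidate_names(text)
butterflies, plants = classify(
    candidates,
    butterfly_names={"Papilio demoleus": "Papilio demoleus"},  # toy table
    plant_genera={"Citrus", "Aegle"},                          # toy table
)
```

The same `classify` pattern extends to other interaction types by adding further lookup tables, for example a set of ant genera.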
We will list all plant species, genera and families that have been reliably recorded as larval host plants. For species with valid host plant names we will fetch distribution information.