Validation & Verification Policy

Feb 22, 2005

National Biodiversity Network definitions

Validation: carrying out standardised, often automated checks on the completeness, accuracy of transmission and validity of the content of a record

Verification: ensuring the accuracy of the identification of the things being recorded

Data Validation

This is carried out during the import routine, and entails checking all aspects of the records (who, where and when) bar the species identification (the ‘what’).

  • Paper-based data are entered by our Data Entry Officer and, where relevant, are associated with locations already set up on our database to ensure the geographic content is correct. Where new locations are being created, GiGL has access to a Recorder tool that takes the central grid reference of each location defined in GIS and imports it into the relevant location. This gives each new location a 12-figure grid reference.
    • Where data do not relate to locations or it is not appropriate to create new locations, the spatial reference (grid reference, postcode) is checked for validity in London.
  • Electronic data always require some processing once they are received by GiGL, and manipulation into Recorder-ready format ensures the record content is correct.
  • GiGL undertakes most data analysis in MapInfo GIS and has a tool that extracts relevant data from Recorder’s 120+ tables that can then be mapped as points. The tool creates 3 separate layers:
    • Species – all species data on the Recorder database at full resolution
    • Habitats – all habitat records for locations in London
    • Open space – all PPG17-related information for locations in London, including access and facilities

Once these data are extracted, GiGL creates products and services based on the layers, including polygon data representing the habitat data. In linking the Recorder point data to site boundaries in GIS, further data validation is undertaken to make sure the points plot in the site boundaries that they represent.
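The check that a point plots inside the site boundary it represents can be illustrated with a standard ray-casting point-in-polygon test. This is a minimal sketch only, not GiGL's MapInfo workflow: the function name, the coordinates and the rectangular site boundary are all hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if the point (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices in order. Points lying exactly on
    an edge may be classified either way.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical site boundary (eastings/northings in metres) and two record points
site = [(530000, 180000), (530500, 180000), (530500, 180400), (530000, 180400)]
print(point_in_polygon(530250, 180200, site))  # point within the boundary
print(point_in_polygon(531000, 180200, site))  # point east of the boundary
```

A point that fails the test would be flagged for checking rather than corrected automatically.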

  • GiGL can also undertake manual data validation in particular circumstances. Prior to importing data, they can be displayed as points in GIS and moved around in order to reflect their actual location rather than that estimated from a map. Once finished, new grid references for the points can be associated with the records, and these then provide the geographic content of the records.
  • Data entered into Recorder are validated using a standard validation library. Checks include confirming that vague dates are valid, that spatial references are valid for the spatial reference system, and that taxa, biotopes, locations, individuals and organisations exist in the appropriate dictionary or list. Consistency checks are also made, e.g. that the dates of the observations are within the allowed date range for the survey and that spatial references are within the bounding box of the survey.
  • GiGL check for duplicates in the database once per year via an automated process. Duplicates (where the species name, spatial reference, date and recorder name are EXACTLY the same) are removed. If there is any doubt whether a record is a duplicate, GiGL take the precautionary approach and leave it in the database, on the basis that it is better to report something twice than not at all.
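The exact-match duplicate rule can be sketched as follows. This is an illustration under stated assumptions rather than GiGL's automated process: the field names (`key`, `species`, `spatial_ref`, `date`, `recorder`) and the sample records are hypothetical. No normalisation is applied, so records that differ in any character are kept, in line with the precautionary approach.

```python
from collections import defaultdict


def find_exact_duplicates(records):
    """Group records whose four key fields match exactly.

    Returns a list of (kept_record, duplicates_to_remove) pairs. The first
    record seen in each group is the one kept.
    """
    groups = defaultdict(list)
    for rec in records:
        # Exact comparison only: no trimming, case folding or fuzzy matching
        key = (rec["species"], rec["spatial_ref"], rec["date"], rec["recorder"])
        groups[key].append(rec)
    return [(recs[0], recs[1:]) for recs in groups.values() if len(recs) > 1]


# Hypothetical sample records
records = [
    {"key": "A1", "species": "Erinaceus europaeus", "spatial_ref": "TQ3080",
     "date": "2004-05-01", "recorder": "J. Smith"},
    {"key": "A2", "species": "Erinaceus europaeus", "spatial_ref": "TQ3080",
     "date": "2004-05-01", "recorder": "J. Smith"},
    # Recorder name differs by one character, so this record is retained
    {"key": "A3", "species": "Erinaceus europaeus", "spatial_ref": "TQ3080",
     "date": "2004-05-01", "recorder": "J Smith"},
]

for kept, dupes in find_exact_duplicates(records):
    print(kept["key"], [d["key"] for d in dupes])
```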

Data Verification

GiGL aim to have all data verified but recognise that this is a very labour-intensive job that requires capable and co-operative volunteers who are willing to sign a data use licence and enter into data exchange.

Overview

The process has three basic stages:

  1. Import or enter new data into the GiGL Recorder database and use a batch update to set the determination type to the correct verification category
  2. Extract data from the Recorder database in Excel format for verifying, append additional information if required, and pass to an ‘expert’ for verification
  3. GiGL receive back the updated spreadsheet and use a batch update to set the determination type to the new verification category

Species records stored in the GiGL Recorder database are assigned to one of the following verification categories, which will also appear in reports where relevant:

  • Blank/ Null
    • Where the record hasn’t been verified (the GiGL database is a collated dataset drawn from a wide range of sources, and all records are considered correct until such time as they are assessed by an expert)
  • Correct
    • Where the record comes from a ‘trusted source’ or the identification has been confirmed by an ‘expert’ verifier
  • Considered Incorrect
    • Where a suitably qualified verifier has reason to believe that the identification is not correct, but the record needs to be retained, for example to allow for further investigation in the future
  • Unverifiable
    • Where it is impossible to verify the record, even with a specimen

Further detail

1. Import or enter new data to GiGL Recorder database

  • Data are entered into Recorder in one of three main ways: input via the recording card or species record, import from a file (Excel etc.), or import from another Recorder system. All records are considered correct by GiGL, and may be used in reporting, until such time as they are assessed by an expert.
  • New records are assigned a verification category as follows:
    • GiGL’s Advisory Panel maintains a list of organisations considered ‘verified sources’ (the ‘trusted sources’ referred to in the category definitions above). This list is subject to change but includes London Natural History Society, London Bat Group, Butterfly Conservation, other National Schemes and Societies, expert recorders and world experts. Records received from these sources are assigned a category of ‘correct’ when imported to the GiGL Recorder database
    • The verification category of records received from all other sources is left Blank/ Null

In the Recorder database, determination types are used to control the verification system. Recorder 6 provides six built-in determination types, which have been adapted by GiGL to reflect the categories above. Batch updates are used to assign the relevant category.
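The assignment rule for new records can be sketched as below. This is a hedged illustration, not Recorder 6 code: the `Verification` enum, the `TRUSTED_SOURCES` set and the `initial_category` function are hypothetical names, the set holds only three of the organisations named above, and the actual Recorder 6 determination type names are not reproduced here.

```python
from enum import Enum


class Verification(Enum):
    """The four verification categories described in the overview."""
    UNVERIFIED = None  # Blank/Null
    CORRECT = "Correct"
    CONSIDERED_INCORRECT = "Considered Incorrect"
    UNVERIFIABLE = "Unverifiable"


# Hypothetical extract of the Advisory Panel's list; the real list is
# longer and subject to change
TRUSTED_SOURCES = {
    "London Natural History Society",
    "London Bat Group",
    "Butterfly Conservation",
}


def initial_category(source):
    """Category given to a new record on import, based only on its source."""
    if source in TRUSTED_SOURCES:
        return Verification.CORRECT
    return Verification.UNVERIFIED


print(initial_category("London Bat Group"))
print(initial_category("Member of the public"))
```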

2. Extract data in Excel format for verifying. Pass to ‘expert’ verifier for verification

  • GiGL aim to have all records verified but are reliant on assistance from suitably qualified persons willing to undertake the work. GiGL’s Advisory Panel assesses verifiers’ credentials and maintains a list of verifiers to ensure that each is a locally accepted authority (e.g. county recorder) and a taxonomic expert.
  • LNHS recorders are offered first refusal to help with verification; if this is not possible, GiGL will notify the LNHS that we will be seeking an alternative
  • All verifiers must sign a data exchange and use agreement and agree to exchange their own data with GiGL
  • GiGL provide the verifier with an Excel spreadsheet, which includes the Taxon_Occurrence_Key and all the information that the ‘expert’ carrying out the review would require to establish the reliability of the record, including:
    • all information submitted as part of original record
    • the vice county, SINC or site name, designated status (including IUCN endangered lists)
    • a list ordered by taxon
    • an indication if the record constitutes a new species to the GiGL database
    • an indication if the species is new to the tetrad (if possible)
    • any other information the verifier might reasonably need to be able to verify the records
    • additional columns in which to record the expert’s view
  • The ‘expert’ is required to indicate if they think the record is correct or incorrect, to identify a possible alternative determination or, if necessary, to define some further action.
  • The timeline for verification is flexible, to suit the capacity of each individual verifier

GiGL will specify a list of species that we need verified. This may be subject to change but will include species with designations relevant in local authority planning and those on IUCN endangered lists (those replacing Red lists). The verifier is welcome to add more that they want to verify, e.g. those species the LNHS bird group define as ‘needing to see notes’ or where the verifier believes there are special problems with species identification.
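Building the verifier's spreadsheet can be sketched as follows, with CSV text standing in for the Excel file so that the example needs only the standard library. Apart from `Taxon_Occurrence_Key`, every column name, function name and sample record here is a hypothetical assumption.

```python
import csv
import io

# Hypothetical names for the blank columns the verifier fills in
VERIFIER_COLUMNS = ["Verifier_Opinion", "Alternative_Determination", "Further_Action"]


def build_extract(records, species_to_verify, known_species):
    """Select records for verification, flag new species and sort by taxon."""
    rows = []
    for rec in records:
        if rec["Taxon"] not in species_to_verify:
            continue
        row = dict(rec)
        # Flag taxa not previously held in the database
        row["New_To_Database"] = "No" if rec["Taxon"] in known_species else "Yes"
        for col in VERIFIER_COLUMNS:
            row[col] = ""
        rows.append(row)
    rows.sort(key=lambda r: r["Taxon"])
    return rows


def to_csv(rows):
    """Serialise the extract rows as CSV text."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample data
records = [
    {"Taxon_Occurrence_Key": "K3", "Taxon": "Vipera berus", "Site": "Site A", "Date": "2004-06-12"},
    {"Taxon_Occurrence_Key": "K1", "Taxon": "Lucanus cervus", "Site": "Site B", "Date": "2004-07-03"},
    {"Taxon_Occurrence_Key": "K2", "Taxon": "Erinaceus europaeus", "Site": "Site A", "Date": "2004-05-01"},
]
rows = build_extract(
    records,
    species_to_verify={"Vipera berus", "Lucanus cervus"},
    known_species={"Lucanus cervus", "Erinaceus europaeus"},
)
print(to_csv(rows))
```

Keeping `Taxon_Occurrence_Key` in every row is what allows the returned spreadsheet to be matched back to the original records.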

3. GiGL receive back updated spreadsheet

  • On receiving back the Excel file, GiGL carry out any follow-up work that might be required, for example contacting the original recorder to ask for further information that the verifier needs, such as a specimen.
  • Once the follow-up queries have been resolved, the Excel spreadsheet containing the ‘expert’s’ views on the records is converted into a .csv file and a Recorder 6 batch update is used to change the verification category, via the determination type field.
  • Records are only ever re-determined, never deleted, and an edit trail is maintained because it’s always possible that records could be wrongly re-designated
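The update step and its edit trail can be sketched as below. This is an illustration only, not the Recorder 6 batch update itself: the in-memory `records` dictionary, the `Verifier_Opinion` column, the function name and the sample values are hypothetical. Records are re-categorised in place and never deleted, and each change is logged with its previous value so that it can be reversed.

```python
import csv
import io

ALLOWED = {"Correct", "Considered Incorrect", "Unverifiable"}


def apply_verifier_decisions(records, csv_text, verifier, changed_on):
    """Update verification categories from the returned CSV.

    records maps Taxon_Occurrence_Key to a record dict with a "category"
    entry. Returns the edit trail as a list of dicts. Rows with an unknown
    key or an unrecognised opinion are skipped and left for manual follow-up.
    """
    trail = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row["Taxon_Occurrence_Key"]
        new_category = row["Verifier_Opinion"].strip()
        if key not in records or new_category not in ALLOWED:
            continue
        old_category = records[key]["category"]
        if new_category != old_category:
            # Log the previous value before overwriting it
            trail.append({
                "key": key,
                "old": old_category,
                "new": new_category,
                "verifier": verifier,
                "changed_on": changed_on,
            })
            records[key]["category"] = new_category
    return trail


# Hypothetical sample data; None represents the Blank/Null category
records = {
    "K1": {"taxon": "Lucanus cervus", "category": None},
    "K3": {"taxon": "Vipera berus", "category": None},
}
csv_text = (
    "Taxon_Occurrence_Key,Verifier_Opinion\n"
    "K1,Correct\n"
    "K3,Considered Incorrect\n"
    "K9,Correct\n"  # unknown key: skipped
)
trail = apply_verifier_decisions(records, csv_text, "A. Verifier", "2005-02-22")
for entry in trail:
    print(entry)
```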