Automated Taxonomic Identification of Insects with Expert-Level Accuracy Using Effective Feature Transfer from Convolutional Networks

Abstract Rapid and reliable identification of insects is important in many contexts, from the detection of disease vectors and invasive species to the sorting of material from biodiversity inventories. Because of the shortage of adequate expertise, there has long been an interest in developing autom...

Journal Title: Systematic biology 2019, Vol.68 (6), p.876-895
Main Author: Valan, Miroslav
Other Authors: Makonyi, Karoly , Maki, Atsuto , Vondráček, Dominik , Ronquist, Fredrik
Format: Electronic Article
Language: English
Subjects:
Publisher: England: Oxford University Press
ID: ISSN: 1063-5157
Staff View
recordid: cdi_swepub_primary_oai_DiVA_org_uu_400675
title: Automated Taxonomic Identification of Insects with Expert-Level Accuracy Using Effective Feature Transfer from Convolutional Networks
format: Article
creator:
  • Valan, Miroslav
  • Makonyi, Karoly
  • Maki, Atsuto
  • Vondráček, Dominik
  • Ronquist, Fredrik
subjects:
  • Animals
  • Annan biologi
  • Biological Sciences
  • Biological Systematics
  • Biologisk systematik
  • Biologiska vetenskaper
  • Classification - methods
  • Diversity of life
  • Insecta - classification
  • Livets mångfald
  • Natural Sciences
  • Naturvetenskap
  • Neural Networks, Computer
  • Other Biological Topics
  • Phylogeny
  • Regular
  • Regular Articles
  • Reproducibility of Results
ispartof: Systematic biology, 2019, Vol.68 (6), p.876-895
description: Abstract Rapid and reliable identification of insects is important in many contexts, from the detection of disease vectors and invasive species to the sorting of material from biodiversity inventories. Because of the shortage of adequate expertise, there has long been an interest in developing automated systems for this task. Previous attempts have been based on laborious and complex handcrafted extraction of image features, but in recent years it has been shown that sophisticated convolutional neural networks (CNNs) can learn to extract relevant features automatically, without human intervention. Unfortunately, reaching expert-level accuracy in CNN identifications requires substantial computational power and huge training data sets, which are often not available for taxonomic tasks. This can be addressed using feature transfer: a CNN that has been pretrained on a generic image classification task is exposed to the taxonomic images of interest, and information about its perception of those images is used in training a simpler, dedicated identification system. Here, we develop an effective method of CNN feature transfer, which achieves expert-level accuracy in taxonomic identification of insects with training sets of 100 images or less per category, depending on the nature of data set. Specifically, we extract rich representations of intermediate to high-level image features from the CNN architecture VGG16 pretrained on the ImageNet data set. This information is submitted to a linear support vector machine classifier, which is trained on the target problem. We tested the performance of our approach on two types of challenging taxonomic tasks: 1) identifying insects to higher groups when they are likely to belong to subgroups that have not been seen previously and 2) identifying visually similar species that are difficult to separate even for experts. 
For the first task, our approach reached >92% accuracy on one data set (884 face images of 11 families of Diptera, all specimens representing unique species), and >96% accuracy on another (2936 dorsal habitus images of 14 families of Coleoptera, over 90% of specimens belonging to unique species). For the second task, our approach outperformed a leading taxonomic expert on one data set (339 images of three species of the Coleoptera genus Oxythyrea; 97% accuracy), and both humans and traditional automated identification systems on another data set (3845 images of nine species of
language: eng
source:
identifier: ISSN: 1063-5157
fulltext: no_fulltext
issn:
  • 1063-5157
  • 1076-836X


display
type: article
title: Automated Taxonomic Identification of Insects with Expert-Level Accuracy Using Effective Feature Transfer from Convolutional Networks
creator: Valan, Miroslav ; Makonyi, Karoly ; Maki, Atsuto ; Vondráček, Dominik ; Ronquist, Fredrik
contributor: Buckley, Thomas
creatorcontrib: Valan, Miroslav ; Makonyi, Karoly ; Maki, Atsuto ; Vondráček, Dominik ; Ronquist, Fredrik ; Buckley, Thomas
description: Abstract Rapid and reliable identification of insects is important in many contexts, from the detection of disease vectors and invasive species to the sorting of material from biodiversity inventories. Because of the shortage of adequate expertise, there has long been an interest in developing automated systems for this task. Previous attempts have been based on laborious and complex handcrafted extraction of image features, but in recent years it has been shown that sophisticated convolutional neural networks (CNNs) can learn to extract relevant features automatically, without human intervention. Unfortunately, reaching expert-level accuracy in CNN identifications requires substantial computational power and huge training data sets, which are often not available for taxonomic tasks. This can be addressed using feature transfer: a CNN that has been pretrained on a generic image classification task is exposed to the taxonomic images of interest, and information about its perception of those images is used in training a simpler, dedicated identification system. Here, we develop an effective method of CNN feature transfer, which achieves expert-level accuracy in taxonomic identification of insects with training sets of 100 images or less per category, depending on the nature of data set. Specifically, we extract rich representations of intermediate to high-level image features from the CNN architecture VGG16 pretrained on the ImageNet data set. This information is submitted to a linear support vector machine classifier, which is trained on the target problem. We tested the performance of our approach on two types of challenging taxonomic tasks: 1) identifying insects to higher groups when they are likely to belong to subgroups that have not been seen previously and 2) identifying visually similar species that are difficult to separate even for experts.
For the first task, our approach reached >92% accuracy on one data set (884 face images of 11 families of Diptera, all specimens representing unique species), and >96% accuracy on another (2936 dorsal habitus images of 14 families of Coleoptera, over 90% of specimens belonging to unique species). For the second task, our approach outperformed a leading taxonomic expert on one data set (339 images of three species of the Coleoptera genus Oxythyrea; 97% accuracy), and both humans and traditional automated identification systems on another data set (3845 images of nine species of Plecoptera larvae; 98.6% accuracy). Reanalyzing several biological image identification tasks studied in the recent literature, we show that our approach is broadly applicable and provides significant improvements over previous methods, whether based on dedicated CNNs, CNN feature transfer, or more traditional techniques. Thus, our method, which is easy to apply, can be highly successful in developing automated taxonomic identification systems even when training data sets are small and computational budgets limited. We conclude by briefly discussing some promising CNN-based research directions in morphological systematics opened up by the success of these techniques in providing accurate diagnostic tools.
identifier:
  • ISSN: 1063-5157
  • ISSN: 1076-836X
  • EISSN: 1076-836X
  • DOI: 10.1093/sysbio/syz014
  • PMID: 30825372
language: eng
publisher: England: Oxford University Press
subject: Animals ; Annan biologi ; Biological Sciences ; Biological Systematics ; Biologisk systematik ; Biologiska vetenskaper ; Classification - methods ; Diversity of life ; Insecta - classification ; Livets mångfald ; Natural Sciences ; Naturvetenskap ; Neural Networks, Computer ; Other Biological Topics ; Phylogeny ; Regular ; Regular Articles ; Reproducibility of Results
ispartof: Systematic biology, 2019, Vol.68 (6), p.876-895
rights: The Author(s) 2019. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
lds50: peer_reviewed
oa: free_for_read
backlink:
  • https://www.ncbi.nlm.nih.gov/pubmed/30825372 (View this record in MEDLINE/PubMed)
  • http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-250556 (View record from Swedish Publication Index)
  • http://urn.kb.se/resolve?urn=urn:nbn:se:nrm:diva-3670 (View record from Swedish Publication Index)
  • http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-177512 (View record from Swedish Publication Index)
  • http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-400675 (View record from Swedish Publication Index)
search
startdate: 20191101
enddate: 20191101
collection:
  • Oxford Journals Open Access Collection
  • Medline
  • MEDLINE
  • MEDLINE (Ovid)
  • PubMed
  • CrossRef
  • MEDLINE - Academic
  • OpenAIRE (Open Access)
  • OpenAIRE
  • PubMed Central (Full Participant titles)
  • SwePub
  • SwePub Articles
  • SWEPUB Freely available online
jtitle: Systematic biology
addata
au:
  • Valan, Miroslav
  • Makonyi, Karoly
  • Maki, Atsuto
  • Vondráček, Dominik
  • Ronquist, Fredrik
format: journal
genre: article
ristype: JOUR
atitle: Automated Taxonomic Identification of Insects with Expert-Level Accuracy Using Effective Feature Transfer from Convolutional Networks
jtitle: Systematic biology
addtitle: Syst Biol
date: 2019-11-01
risdate: 2019
volume: 68
issue: 6
spage: 876
epage: 895
pages: 876-895
issn:
  • 1063-5157
  • 1076-836X
eissn: 1076-836X
cop: England
pub: Oxford University Press
pmid: 30825372
doi: 10.1093/sysbio/syz014
oa: free_for_read