{"corpus_id":249612542,"paper_sha":"f5c5d39318a97301a8d33b41da7ff2c4ead7a4df","doi":"10.3390/rs11171996","arxiv_id":null,"pmid":null,"pmcid":null,"mag_id":2970105043,"dblp_id":"journals/remotesensing/ZhuYML19","acl_id":null,"title":"Attention-Based Deep Feature Fusion for the Scene Classification of High-Resolution Remote Sensing Images","year":2019,"publication_date":"2019-08-23","venue":"Remote Sensing","journal":{"name":"Remote. Sens.","pages":"1996","volume":"11"},"journal_issn":null,"journal_title":null,"publication_types":["JournalArticle"],"pubmed_pub_types":null,"s2_fields_of_study":["Computer Science","Environmental Science"],"reference_count":71,"citation_count":27,"influential_citation_count":2,"is_open_access":true,"arxiv_categories":null,"arxiv_license":null,"arxiv_journal_ref":null,"mesh_headings":null,"chemicals":null,"comments_corrections":null,"source_flags":1,"s2_open_access_pdf_url":"https://www.mdpi.com/2072-4292/11/17/1996/pdf?version=1582620349","s2_open_access_landing_url":"https://www.semanticscholar.org/paper/f5c5d39318a97301a8d33b41da7ff2c4ead7a4df","s2_open_access_license":"CCBY","s2_open_access_status":"GOLD","pmc_open_access_pdf_url":null,"pmc_open_access_landing_url":null,"pmc_open_access_license":null,"pmc_open_access_status":null,"unpaywall_open_access_pdf_url":null,"unpaywall_open_access_landing_url":null,"unpaywall_open_access_license":null,"unpaywall_open_access_status":null,"abstract":"Scene classification of high-resolution remote sensing images (HRRSI) is one of the most important means of land-cover classification. Deep learning techniques, especially the convolutional neural network (CNN) have been widely applied to the scene classification of HRRSI due to the advancement of graphic processing units (GPU). However, they tend to extract features from the whole images rather than discriminative regions. 
The visual attention mechanism can force the CNN to focus on discriminative regions, but it may suffer from the influence of intraclass diversity and repeated texture. Motivated by these problems, we propose an attention-based deep feature fusion (ADFF) framework that constitutes three parts, namely attention maps generated by Gradient-weighted Class Activation Mapping (GradCAM), a multiplicative fusion of deep features and the center-based cross-entropy loss function. First of all, we propose to make attention maps generated by GradCAM as an explicit input in order to force the network to concentrate on discriminative regions. Then, deep features derived from original images and attention maps are proposed to be fused by multiplicative fusion in order to consider both improved abilities to distinguish scenes of repeated texture and the salient regions. Finally, the center-based cross-entropy loss function that utilizes both the cross-entropy loss and center loss function is proposed to backpropagate fused features so as to reduce the effect of intraclass diversity on feature representations. The proposed ADFF architecture is tested on three benchmark datasets to show its performance in scene classification. 
The experiments confirm that the proposed method outperforms most competitive scene classification methods with an average overall accuracy of 94% under different training ratios.","claims":[{"public_id":"cl_4165dcdbcc8ff6b8f67e6ea43182a9fb","status":"active","text":"The ADFF framework employs a center-based cross-entropy loss that combines cross-entropy loss and center loss to reduce intraclass diversity effects.","confidence":0.9,"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/claims/cl_4165dcdbcc8ff6b8f67e6ea43182a9fb"},{"public_id":"cl_531e19d57579a8abfa4874ec4e09ecd0","status":"active","text":"The ADFF framework fuses deep features from original images and attention maps via multiplicative fusion to improve discrimination of repeated texture and salient regions.","confidence":0.9,"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/claims/cl_531e19d57579a8abfa4874ec4e09ecd0"},{"public_id":"cl_e1909f3616816dff3fa1a2a6922758ad","status":"active","text":"The ADFF framework uses GradCAM-generated attention maps as explicit input to force the network to focus on discriminative 
regions.","confidence":0.9,"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/claims/cl_e1909f3616816dff3fa1a2a6922758ad"},{"public_id":"cl_31d8aa508e67bdcd4bc1c6a32a836bf3","status":"active","text":"The ADFF method achieves an average overall accuracy of 94% on three benchmark datasets under different training ratios, outperforming most competitive scene classification methods.","confidence":0.95,"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/claims/cl_31d8aa508e67bdcd4bc1c6a32a836bf3"}],"concepts":[{"public_id":"co_3b29519462a029db3ff6fa1f31e43651","status":"active","name":"training ratios","description":"Different proportions of training data used in the experiments to assess performance under varying conditions.","types":["evaluation setting"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale 
(322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_3b29519462a029db3ff6fa1f31e43651"},{"public_id":"co_454de42dd6a03e1d9775c59336af0354","status":"active","name":"three benchmark datasets","description":"The evaluation datasets used to test the ADFF method.","types":["dataset"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_454de42dd6a03e1d9775c59336af0354"},{"public_id":"co_649d8d6bc401a119de0ca68791dc0b98","status":"active","name":"center loss","description":"A loss term that minimizes intraclass variance by pulling feature representations toward class centers.","types":["loss function"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_649d8d6bc401a119de0ca68791dc0b98"},{"public_id":"co_6667fcdbe8d5986dc02ad7d63ebf73d8","status":"active","name":"deep features","description":"Feature representations extracted by a convolutional neural network from the input images.","types":["feature"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ 
(ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_6667fcdbe8d5986dc02ad7d63ebf73d8"},{"public_id":"co_93338be78ec6758410bc6afc79cf3b28","status":"active","name":"intraclass diversity","description":"Variability among samples of the same class, which the proposed loss function aims to reduce.","types":["phenomenon"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_93338be78ec6758410bc6afc79cf3b28"},{"public_id":"co_9edf96212d6af3e1ee204c45c5855492","status":"active","name":"attention-based deep feature fusion","description":"The proposed framework for scene classification that integrates attention maps, multiplicative feature fusion, and a center-based cross-entropy loss.","types":["framework"],"aliases":["ADFF","ADFF framework"],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale 
(322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_9edf96212d6af3e1ee204c45c5855492"},{"public_id":"co_a5d97ff86531dad876143937ac8fc56e","status":"active","name":"center-based cross-entropy loss function","description":"A loss function that combines cross-entropy loss and center loss to reduce intraclass diversity.","types":["loss function"],"aliases":["center-based cross-entropy loss"],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_a5d97ff86531dad876143937ac8fc56e"},{"public_id":"co_af484cb51d3510ec3595c8cc7124e748","status":"active","name":"multiplicative fusion","description":"The operation that fuses deep features from original images and attention maps by element-wise multiplication.","types":["method"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_af484cb51d3510ec3595c8cc7124e748"},{"public_id":"co_cad05dc80f7deff586b420cb8cf15475","status":"active","name":"Gradient-weighted Class Activation Mapping (GradCAM)","description":"A technique used to generate attention maps that highlight discriminative regions in 
images.","types":["method"],"aliases":["GradCAM"],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_cad05dc80f7deff586b420cb8cf15475"},{"public_id":"co_f07635b1676fe069fb30589ef69bfaa0","status":"active","name":"cross-entropy loss","description":"A standard classification loss used as one component of the combined loss.","types":["loss function"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_f07635b1676fe069fb30589ef69bfaa0"},{"public_id":"co_f4970ea614c296e6401e18c45080acfc","status":"active","name":"attention maps","description":"Visual maps generated by GradCAM that indicate discriminative regions, used as explicit input to the network.","types":["data","input"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale 
(322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_f4970ea614c296e6401e18c45080acfc"},{"public_id":"co_f9c7681830ca905f8b488515f408654e","status":"active","name":"overall accuracy","description":"The average classification accuracy reported as 94% across three benchmark datasets.","types":["metric"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_f9c7681830ca905f8b488515f408654e"}],"external_ids":{"DOI":"10.3390/rs11171996","ArXiv":null,"PubMed":null,"PubMedCentral":null,"MAG":2970105043,"DBLP":"journals/remotesensing/ZhuYML19","ACL":null},"open_access":{"is_open_access":true,"pdf_url":"https://www.mdpi.com/2072-4292/11/17/1996/pdf?version=1582620349","landing_url":"https://www.semanticscholar.org/paper/f5c5d39318a97301a8d33b41da7ff2c4ead7a4df","source":"semantic_scholar","pdf_url_source":"semantic_scholar_open_access_pdf","license":"CCBY","status":"GOLD","reason":null},"reference_availability":{"status":"available","references_indexed":true,"full_text_available":false,"full_text_source":null,"count_basis":"semantic_scholar_metadata","extraction_status":"not_applicable","reason":null},"source":{"provider":"episteme2","base_corpus":"semantic_scholar_dump","freshness_mode":"unknown","basis":["semantic_scholar_metadata","postgres_metadata"],"limits":["paper metadata is based on indexed upstream scholarly datasets","claims and concepts are available only for extracted papers","absence of claims or concepts means no extracted graph data is available in this 
response"],"status":"available","degraded":false,"degraded_reasons":[],"diagnostics":{"status":"available","degraded":false,"degraded_reasons":[],"metadata_status":"available","graph_status":"available","abstract_status":"available"},"source_flags":1},"paper_id":630934,"paper_uid":"0166a44f-5a2d-41ca-bb56-6b909341cdec","canonical_identity":{"paper_id":630934,"paper_uid":"0166a44f-5a2d-41ca-bb56-6b909341cdec","identity_status":"available","lookup_basis":"semantic_scholar_external_id","compatibility_path":"corpus_id"},"url":"https://sah.borca.ai/papers/249612542"}