{"corpus_id":270341565,"paper_sha":"23199e267e6755df601b79407380e6e625f5e155","doi":"10.1111/coin.12656","arxiv_id":null,"pmid":null,"pmcid":null,"mag_id":null,"dblp_id":"journals/ci/PanZLLPHH24","acl_id":null,"title":"Utilizing passage‐level relevance and kernel pooling for enhancing BERT‐based document reranking","year":2024,"publication_date":"2024-06-01","venue":"International Conference on Climate Informatics","journal":{"name":"Computational Intelligence","pages":null,"volume":"40"},"journal_issn":null,"journal_title":null,"publication_types":["JournalArticle"],"pubmed_pub_types":null,"s2_fields_of_study":["Computer Science"],"reference_count":71,"citation_count":3,"influential_citation_count":0,"is_open_access":true,"arxiv_categories":null,"arxiv_license":null,"arxiv_journal_ref":null,"mesh_headings":null,"chemicals":null,"comments_corrections":null,"source_flags":1,"s2_open_access_pdf_url":"https://onlinelibrary.wiley.com/doi/pdfdirect/10.1111/coin.12656","s2_open_access_landing_url":"https://www.semanticscholar.org/paper/23199e267e6755df601b79407380e6e625f5e155","s2_open_access_license":"CCBY","s2_open_access_status":"HYBRID","pmc_open_access_pdf_url":null,"pmc_open_access_landing_url":null,"pmc_open_access_license":null,"pmc_open_access_status":null,"unpaywall_open_access_pdf_url":null,"unpaywall_open_access_landing_url":null,"unpaywall_open_access_license":null,"unpaywall_open_access_status":null,"abstract":"The pre‐trained language model (PLM) based on the Transformer encoder, namely BERT, has achieved state‐of‐the‐art results in the field of Information Retrieval. Existing BERT‐based ranking models divide documents into passages and aggregate passage‐level relevance to rank the document list. However, these common score aggregation strategies cannot capture important semantic information such as document structure and have not been extensively studied. In this article, we propose a novel kernel‐based score pooling system to capture document‐level relevance by aggregating passage‐level relevance. In particular, we propose and study several representative kernel pooling functions and several different document ranking strategies based on passage‐level relevance. Our proposed framework KnBERT naturally incorporates kernel functions from the passage level into the BERT‐based re‐ranking method, which provides a promising avenue for building universal retrieval‐then‐rerank information retrieval systems. Experiments conducted on two widely used TREC Robust04 and GOV2 test datasets show that the KnBERT has made significant improvements over other BERT‐based ranking approaches in terms of MAP, P@20, and NDCG@20 indicators with no extra or even less computations.","claims":[{"public_id":"cl_3f42ebbbf41a12c33ce3461222a33cdc","status":"active","text":"KnBERT incorporates kernel functions from the passage level into the BERT-based re-ranking method.","confidence":0.95,"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/claims/cl_3f42ebbbf41a12c33ce3461222a33cdc"},{"public_id":"cl_331dfd286d4775aa3f5e6e02e47e9354","status":"active","text":"KnBERT is a novel kernel-based score pooling system that captures document-level relevance by aggregating passage-level relevance, improving BERT-based document reranking.","confidence":0.95,"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/claims/cl_331dfd286d4775aa3f5e6e02e47e9354"},{"public_id":"cl_c4ddb74207fb03e970a8c13cca7bebe2","status":"active","text":"On TREC Robust04 and GOV2, KnBERT achieves significant improvements over other BERT-based ranking approaches in MAP, P@20, and NDCG@20 with no extra or even less computations.","confidence":0.95,"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/claims/cl_c4ddb74207fb03e970a8c13cca7bebe2"}],"concepts":[{"public_id":"co_00644e7f0b3f0e350dc6efb6e6ae322d","status":"active","name":"TREC Robust04","description":"A widely used test dataset for information retrieval evaluation, employed in the experiments.","types":["dataset"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_00644e7f0b3f0e350dc6efb6e6ae322d"},{"public_id":"co_1395fbe7cc05b16f69182483ff956445","status":"active","name":"GOV2","description":"A widely used test dataset for information retrieval evaluation, employed in the experiments.","types":["dataset"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_1395fbe7cc05b16f69182483ff956445"},{"public_id":"co_428c858fdc706ca7ede8024b7c524823","status":"active","name":"kernel pooling functions","description":"Several representative kernel functions used within KnBERT to aggregate passage-level relevance scores into document-level relevance.","types":["method","function"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_428c858fdc706ca7ede8024b7c524823"},{"public_id":"co_85271ba61bb3496e8127f05e13c7ef47","status":"active","name":"passage-level relevance","description":"Relevance scores computed for each passage of a document in BERT-based ranking models.","types":["measurement","concept"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_85271ba61bb3496e8127f05e13c7ef47"},{"public_id":"co_8f21847fc882ff6834f3b13f64193281","status":"active","name":"NDCG@20","description":"Normalized Discounted Cumulative Gain at rank 20, an evaluation metric reported for the experiments.","types":["metric"],"aliases":["normalized discounted cumulative gain at 20"],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_8f21847fc882ff6834f3b13f64193281"},{"public_id":"co_8ff3feafdd524a364b4cdec5f92b1233","status":"active","name":"document-level relevance","description":"The overall relevance of a document obtained by aggregating passage-level relevance scores, captured by KnBERT.","types":["measurement","concept"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_8ff3feafdd524a364b4cdec5f92b1233"},{"public_id":"co_a3b748a359975b3f44eba6d329bac67f","status":"active","name":"BERT-based ranking models","description":"Existing ranking models that divide documents into passages and aggregate passage-level relevance scores.","types":["method","model"],"aliases":[],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_a3b748a359975b3f44eba6d329bac67f"},{"public_id":"co_bcd5f8509ff9cb2c50b78423470fd47d","status":"active","name":"KnBERT","description":"A novel kernel-based score pooling framework that integrates kernel functions at the passage level into BERT-based reranking to capture document-level relevance.","types":["method","framework"],"aliases":["kernel-based score pooling system"],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_bcd5f8509ff9cb2c50b78423470fd47d"},{"public_id":"co_d50f0a9f9102ef5464fa8bf9daa82703","status":"active","name":"MAP","description":"Mean Average Precision, an evaluation metric reported for the experiments.","types":["metric"],"aliases":["mean average precision"],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_d50f0a9f9102ef5464fa8bf9daa82703"},{"public_id":"co_fb0432717d6bea29401d9455cf9eb315","status":"active","name":"P@20","description":"Precision at rank 20, an evaluation metric reported for the experiments.","types":["metric"],"aliases":["precision at 20"],"contributors":[{"id":1165,"public_id":"ezd9qvkvax","public_label":"The Reverser‮ (ezd9qvkvax)","roles":["extraction"],"url":"https://sah.borca.ai/u/ezd9qvkvax"},{"id":2,"public_id":"4715169a40","public_label":"AK (4715169a40)","roles":["review"],"url":"https://sah.borca.ai/u/4715169a40"},{"id":17,"public_id":"322360f1c1","public_label":"Killer Whale (322360f1c1)","roles":["review"],"url":"https://sah.borca.ai/u/322360f1c1"}],"url":"https://sah.borca.ai/concepts/co_fb0432717d6bea29401d9455cf9eb315"}],"external_ids":{"DOI":"10.1111/coin.12656","ArXiv":null,"PubMed":null,"PubMedCentral":null,"MAG":null,"DBLP":"journals/ci/PanZLLPHH24","ACL":null},"open_access":{"is_open_access":true,"pdf_url":"https://onlinelibrary.wiley.com/doi/pdfdirect/10.1111/coin.12656","landing_url":"https://www.semanticscholar.org/paper/23199e267e6755df601b79407380e6e625f5e155","source":"semantic_scholar","pdf_url_source":"semantic_scholar_open_access_pdf","license":"CCBY","status":"HYBRID","reason":null},"reference_availability":{"status":"available","references_indexed":true,"full_text_available":false,"full_text_source":null,"count_basis":"semantic_scholar_metadata","extraction_status":"not_applicable","reason":null},"source":{"provider":"episteme2","base_corpus":"semantic_scholar_dump","freshness_mode":"unknown","basis":["semantic_scholar_metadata","postgres_metadata"],"limits":["paper metadata is based on indexed upstream scholarly datasets","claims and concepts are available only for extracted papers","absence of claims or concepts means no extracted graph data is available in this response"],"status":"available","degraded":false,"degraded_reasons":[],"diagnostics":{"status":"available","degraded":false,"degraded_reasons":[],"metadata_status":"available","graph_status":"available","abstract_status":"available"},"source_flags":1},"paper_id":630627,"paper_uid":"f2dc770b-20f4-4ee0-8623-fed6fc13920f","canonical_identity":{"paper_id":630627,"paper_uid":"f2dc770b-20f4-4ee0-8623-fed6fc13920f","identity_status":"available","lookup_basis":"semantic_scholar_external_id","compatibility_path":"corpus_id"},"url":"https://sah.borca.ai/papers/270341565"}