TY - GEN
T1 - Knowledge Graph Fusion for Language Model Fine-Tuning
AU - Bhana, Nimesh
AU - Van Zyl, Terence L.
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Language Models such as BERT (Bidirectional Encoder Representations from Transformers) have grown in popularity due to their ability to be pre-trained and to perform robustly on a wide range of Natural Language Processing tasks. Often seen as an evolution over traditional word embedding techniques, they can produce semantic representations of text, useful for tasks such as semantic similarity. However, state-of-the-art models often have high computational requirements and lack the global context or domain knowledge required for complete language understanding. To address these limitations, we investigate the benefits of incorporating knowledge into the fine-tuning stages of BERT. An existing K-BERT model, which enriches sentences with triplets from a Knowledge Graph, is adapted for the English language and extended to inject contextually relevant information into sentences. As a side effect, the changes made to K-BERT to accommodate the English language also extend to other word-based languages. Our experiments indicate that injected knowledge introduces noise. When this noise is minimised, we see statistically significant improvements on knowledge-driven tasks. We present evidence that, given an appropriate task, modest injection of relevant, high-quality knowledge is most performant.
AB - Language Models such as BERT (Bidirectional Encoder Representations from Transformers) have grown in popularity due to their ability to be pre-trained and to perform robustly on a wide range of Natural Language Processing tasks. Often seen as an evolution over traditional word embedding techniques, they can produce semantic representations of text, useful for tasks such as semantic similarity. However, state-of-the-art models often have high computational requirements and lack the global context or domain knowledge required for complete language understanding. To address these limitations, we investigate the benefits of incorporating knowledge into the fine-tuning stages of BERT. An existing K-BERT model, which enriches sentences with triplets from a Knowledge Graph, is adapted for the English language and extended to inject contextually relevant information into sentences. As a side effect, the changes made to K-BERT to accommodate the English language also extend to other word-based languages. Our experiments indicate that injected knowledge introduces noise. When this noise is minimised, we see statistically significant improvements on knowledge-driven tasks. We present evidence that, given an appropriate task, modest injection of relevant, high-quality knowledge is most performant.
KW - BERT
KW - Knowledge Graph
KW - Language Model
UR - https://www.scopus.com/pages/publications/85151672939
U2 - 10.1109/ISCMI56532.2022.10068451
DO - 10.1109/ISCMI56532.2022.10068451
M3 - Conference contribution
AN - SCOPUS:85151672939
T3 - 2022 9th International Conference on Soft Computing and Machine Intelligence, ISCMI 2022
SP - 167
EP - 172
BT - 2022 9th International Conference on Soft Computing and Machine Intelligence, ISCMI 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 9th International Conference on Soft Computing and Machine Intelligence, ISCMI 2022
Y2 - 26 November 2022 through 27 November 2022
ER -