dc.contributor.author |
Pai, Deepesh |
|
dc.date.accessioned |
2022-03-22T10:31:08Z |
|
dc.date.available |
2022-03-22T10:31:08Z |
|
dc.date.issued |
2021-07 |
|
dc.identifier.citation |
29p. |
en_US |
dc.identifier.uri |
http://hdl.handle.net/10263/7295 |
|
dc.description |
Dissertation under the supervision of Debapriyo Majumdar |
en_US |
dc.description.abstract |
Research in Natural Language Processing is expanding across multiple domains and
applications. With every advancement, the variety of text being processed grows.
One such domain is lyrics processing.
Songs are vital to the music and film industries and are analyzed to extract important
information such as the genre, theme, mood, and author of a song. Bollywood, the
Indian film industry, earns substantial revenue from songs. The number of
songs produced by this industry is massive, making it a rich source of textual data
for Natural Language Processing tasks. One of the important topics in Natural Language
Processing (NLP) is authorship identification.
Authorship identification is the task of identifying the author of a given text from
a set of authors. Authorship identification is applied to tasks such as identifying
anonymous authors, detecting plagiarism, or finding ghostwriters. This work also gives us
an opportunity to work with text in Devanagari, which remains relatively unexplored.
The main concern of this task is to define an appropriate characterization of texts
that captures the writing style of authors. Although deep learning models such as LSTM
and GRU have been used in various author identification tasks, BERT has not (to
the best of our knowledge). This study aims to build a system that can
identify the lyricist of a song based on its lyrics. We have built a BERT-based model
that takes the lyrics of a song as input and
predicts its lyricist from their content. The results show that the proposed
system outperforms its counterparts. |
en_US |
dc.language.iso |
en |
en_US |
dc.publisher |
Indian Statistical Institute, Kolkata |
en_US |
dc.relation.ispartofseries |
Dissertation;;CS1920 |
|
dc.subject |
Natural Language Processing |
en_US |
dc.subject |
Authorship identification |
en_US |
dc.subject |
GRU |
en_US |
dc.subject |
BERT |
en_US |
dc.title |
Author Identification and Analysis of Bollywood Song Lyrics |
en_US |
dc.type |
Other |
en_US |