DSpace Repository

Author Identification and Analysis of Bollywood Song Lyrics


dc.contributor.author Pai, Deepesh
dc.date.accessioned 2022-03-22T10:31:08Z
dc.date.available 2022-03-22T10:31:08Z
dc.date.issued 2021-07
dc.identifier.citation 29p. en_US
dc.identifier.uri http://hdl.handle.net/10263/7295
dc.description Dissertation under the supervision of Debapriyo Majumdar en_US
dc.description.abstract Research in Natural Language Processing (NLP) is expanding across multiple domains and applications, and with every advancement the variety of text being processed grows. One such domain is lyrics processing. Songs are vital to the music and film industries and are analyzed to extract information such as genre, theme, mood, and author. Bollywood, the Indian film industry, earns substantial revenue from songs. The number of songs produced by this industry is massive, making them a rich source of textual data for NLP tasks. One important topic in NLP is authorship identification: the task of identifying the author of a given text from a set of candidate authors. It is applied to tasks such as identifying anonymous authors, detecting plagiarism, and finding ghostwriters. Working with song lyrics also gives us an opportunity to work on data in Devanagari, which is relatively less explored. The main concern of this task is to define an appropriate characterization of texts that captures the writing style of authors. Although deep learning models such as LSTM and GRU have been used in various author identification tasks, BERT has not been used for this purpose (to the best of our knowledge). This study aims to build a system that identifies the lyricist of a song from its lyrics. We have built a BERT-based model that takes the lyrics of a song as input and predicts its lyricist. The results show that the proposed system outperforms its counterparts. en_US
dc.language.iso en en_US
dc.publisher Indian Statistical Institute, Kolkata en_US
dc.relation.ispartofseries Dissertation;;CS1920
dc.subject Natural Language Processing en_US
dc.subject Authorship identification en_US
dc.subject GRU en_US
dc.subject BERT en_US
dc.title Author Identification and Analysis of Bollywood Song Lyrics en_US
dc.type Other en_US

