Please use this identifier to cite or link to this item:
http://hdl.handle.net/10263/7246
Title: | On coreset construction for K-means clustering of flats and hyperplanes |
Authors: | Mukherjee, Abhisek |
Keywords: | Minimum enclosing ball (MEB); D² sampling |
Issue Date: | Jun-2020 |
Publisher: | Indian Statistical Institute, Kolkata |
Citation: | 37p. |
Series/Report no.: | Dissertation;;2020;32 |
Abstract: | Coresets are an important tool for extracting information from large amounts of data by sampling only a few elements, without substantial loss of the actual information. An ε-coreset is defined as a weighted set C obtained from a universe X such that, for any solution set Q for a problem (referred to as a query in the coreset literature), |Cost(X, Q) − Cost(C, Q)| ≤ ε · Cost(X, Q). Our work is an attempt to generalize the solution provided in the paper “k-Means Clustering of Lines for Big Data” (Marom and Feldman, NIPS, 2019) and to explore whether it can be extended to k-flats in R^d as well. Following the approach used in that paper, we attempt to build a deterministic algorithm that computes an ε-coreset whose size is near-logarithmic in the input size for j-dimensional affine subspaces in R^d. |
Description: | Dissertation under the supervision of Dr. Arijit Ghosh & Dr. Arijit Bishnu |
URI: | http://hdl.handle.net/10263/7246 |
Appears in Collections: | Dissertations - M Tech (CS) |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Abhishek-Mukherjee-2018-20-Dissertation.pdf | | 534.13 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.