Online Public Access Catalogue (OPAC)
Library, Documentation and Information Science Division


A data scientist's guide to acquiring, cleaning, and managing data in R (Record no. 434926)

MARC details
000 -LEADER
fixed length control field 02014nam a22002297a 4500
003 - CONTROL NUMBER IDENTIFIER
control field ISI Library, Kolkata
005 - DATE AND TIME OF LATEST TRANSACTION
control field 20240715160643.0
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 240715b |||||||| |||| 00| 0 eng d
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 9781119080022
040 ## - CATALOGING SOURCE
Original cataloging agency ISI Library
Language of cataloging English
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number SA.055
Item number B988
100 1# - MAIN ENTRY--PERSONAL NAME
Personal name Buttrey, Samuel E.
Relator term author
245 10 - TITLE STATEMENT
Title A data scientist's guide to acquiring, cleaning, and managing data in R/
Statement of responsibility, etc Samuel E. Buttrey and Lyn R. Whitaker
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT)
Place of publication, distribution, etc New Jersey:
Name of publisher, distributor, etc Wiley,
Date of publication, distribution, etc 2018
300 ## - PHYSICAL DESCRIPTION
Extent xxi, 288 pages;
Dimensions 23 cm.
504 ## - BIBLIOGRAPHY, ETC. NOTE
Bibliography, etc Includes bibliography and index
505 0# - FORMATTED CONTENTS NOTE
Formatted contents note R -- R data, part 1: vectors -- R data, part 2: more complicated structures -- R data, part 3: text and factors -- Writing functions and scripts -- Getting data into and out of R -- Data handling in practice -- Extended exercise
520 ## - SUMMARY, ETC.
Summary, etc Every experienced practitioner knows that preparing data for modeling is a painstaking, time-consuming process. Adding to the difficulty is that most modelers learn the steps involved in cleaning and managing data piecemeal, often on the fly, or they develop their own ad hoc methods. This book helps simplify their task by providing a unified, systematic approach to acquiring, modeling, manipulating, cleaning, and maintaining data in R. Starting with the very basics, data scientists Samuel E. Buttrey and Lyn R. Whitaker walk readers through the entire process. From what data looks like and what it should look like, they progress through all the steps involved in getting data ready for modeling. They describe best practices for acquiring data from numerous sources; explore key issues in data handling, including text/regular expressions, big data, parallel processing, merging, matching, and checking for duplicates; and outline highly efficient and reliable techniques for documenting data and recordkeeping, including audit trails, getting data back out of R, and more.
650 0# - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Statistical Computing
700 1# - ADDED ENTRY--PERSONAL NAME
Personal name Whitaker, Lyn R.
Relator term author
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme Dewey Decimal Classification
Koha item type Books
Holdings
Lost status (blank)
Not for loan (blank)
Home library ISI Library, Kolkata
Current library ISI Library, Kolkata
Date acquired 15/07/2024
Full call number SA.055 B988
Accession Number C27492
Koha item type Books
Public note Gifted by Prof. Amita Pal
Library, Documentation and Information Science Division, Indian Statistical Institute, 203 B T Road, Kolkata 700108, INDIA
Phone no. 91-33-2575 2100, Fax no. 91-33-2578 1412, ksatpathy@isical.ac.in