
From Conventional Data Analysis Methods to Big Data Analytics

Gilbert Saporta 1
1 CEDRIC - MSDMA - CEDRIC. Méthodes statistiques de data-mining et apprentissage
CEDRIC - Centre d'études et de recherche en informatique et communications
Abstract: In this chapter, data analysis mainly refers to descriptive and exploratory methods, also known as unsupervised methods. The objective is to describe and structure a set of data that can be represented as a rectangular table crossing n statistical units and p variables. Data analysis methods are essentially dimension reduction methods and fall into two categories: factor methods, and unsupervised classification methods, or clustering. Data mining is a step in the knowledge discovery process that involves applying data analysis algorithms. It seeks predictive models of a response, denoted Y, but from a very different perspective than that of conventional modeling. The chapter distinguishes regression methods, where Y is quantitative, from supervised classification methods (also called discrimination methods), where Y is categorical, most often with two categories. The chapter also discusses new tools for big data processing, based on validation using data set aside for that purpose.
Document type: Book sections


https://hal-cnam.archives-ouvertes.fr/hal-02470097
Contributor: Philippe Rigaux
Submitted on: Thursday, April 9, 2020 - 5:37:52 PM
Last modification on: Wednesday, April 15, 2020 - 11:24:27 AM

File

04_Chapter 2_ENG_revGSavril202...
Files produced by the author(s)

Citation

Gilbert Saporta. From Conventional Data Analysis Methods to Big Data Analytics. Big Data for Insurance Companies, John Wiley & Sons, Inc., pp.27-41, 2018, ⟨10.1002/9781119489368.ch2⟩. ⟨hal-02470097⟩
