Exponentially growing data volumes need to be harmonized, because the data are diverse and difficult to process (format, quality, origin, etc.).
The aim of processing these data is to convert them from a raw state of variable quality to a standardized, usable state.
Matrix and multi-dimensional representation
Historization & description
Box Plots, Stats
Extraction, separation, aggregation
Dictionaries, vocabularies, joins, links, geocoding
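To make the move from a raw state to a standardized one more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the vocabulary `COUNTRY_VOCAB`, the field names `country` and `value`, and the helper functions `harmonize` and `aggregate` are hypothetical and not taken from any particular tool. It maps inconsistent spellings to one label through a dictionary, normalizes number formats, drops unusable records, and then aggregates the result.

```python
from collections import defaultdict

# Hypothetical controlled vocabulary: maps raw spellings to one standard label.
COUNTRY_VOCAB = {
    "fr": "France", "fra": "France", "france": "France",
    "de": "Germany", "deu": "Germany", "germany": "Germany",
}


def harmonize(record):
    """Return a standardized copy of a raw record, or None if it is unusable."""
    raw_country = str(record.get("country", "")).strip().lower()
    country = COUNTRY_VOCAB.get(raw_country)
    if country is None:
        return None  # unknown label: not covered by the vocabulary
    try:
        # Accept both "10,5" and "10.5" as decimal notations.
        value = float(str(record.get("value", "")).replace(",", "."))
    except ValueError:
        return None  # value cannot be parsed as a number
    return {"country": country, "value": value}


def aggregate(records):
    """Sum the values of harmonized records per country."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["country"]] += rec["value"]
    return dict(totals)


# Assumed sample input of variable quality (formats and labels differ).
raw = [
    {"country": " FR ", "value": "10,5"},
    {"country": "France", "value": 4},
    {"country": "deu", "value": "3.0"},
    {"country": "??", "value": "1"},      # unknown country, dropped
    {"country": "de", "value": "n/a"},    # unparsable value, dropped
]

clean = [r for r in map(harmonize, raw) if r is not None]
totals = aggregate(clean)
print(totals)
```

In a real pipeline the rejected records would more likely be logged or routed to a review step than silently dropped; the sketch discards them only to stay short.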
#ETL #ELT #Hadoop #Spark #LinkedData #MapReduce #DataScience