af carlos bernal 1 år siden
89
Mere som dette
Refers to the quality of the data that is being analyzed.
Data from a medical experiment or trial.
Noise.
Contains a high percentage of meaningless data.
Has many records that are valuable to analyze and that contribute in a meaningful way to the overall results.
Is as big the variety that generally is one out of three types: structured, semi structured and unstructured data wich frequently requires distinct processing capabilities and specialist algorithms
EXAMPLE:
CCTV audio and video files that are generated at various locations in a city.
Speed with which data is generated with such a pace that requires distinct (distributed) processing techniques.
Twitter messages or Facebook posts.
Size of the data sets that need to be analyzed and processed
All credit card transactions on a day within Europe.