Modern data systems seek to extract value from data and enable data-driven decisions across all aspects of society, from the natural sciences to government to business. Big data involves analyzing massive volumes of data drawn from a wide variety of sources. Doing so effectively requires high-quality data, so that the analyses and the resulting decisions are meaningful and do not fall prey to the garbage in, garbage out (GIGO) syndrome. This course covers big data systems, that is, the infrastructures used to handle every step of a typical big data processing pipeline, including data management and analysis. We introduce data systems for data profiling, for repairing inconsistencies in the data, and for analyzing data in the presence of these inconsistencies. We explore system design for turning large-scale semi-structured and even unstructured data into actionable insights. Students gain experience with big data analysis tools, data stream processing, distributed data platforms, and NoSQL and NewSQL technologies.