Fundamental Concepts of Data Processing

Discussion 2: Fundamental Concepts of Data Processing (10%)
This assignment relates to the following Course Learning Requirements:
CLR2 – Explain the concepts of the Business Intelligence infrastructure
CLR3 – Explain the relationship between Big Data and business intelligence and identify key technologies
CLR4 – Explain the fundamental concepts of data mining and data analytics
Objective of this Assignment
– Identify a data processing system that addresses business demands.
For this assignment, read the following scenario:
And you quote . . .
“When you are dealing with a torrent of data, being able to construct different levels of aggregation is really important, because you want to make sure you’ve used your software intelligently with respect to that large amount of data, and bring to bear the contextual data you need to answer the question at the level of aggregation that you need to answer it.”
—Michael O’Connell, Senior Director of Analytics, TIBCO—
“What the . . .?” your boss growls. “I’m not sure what all that means, but here’s my reality. I make decisions in real time. If I’m ever late, the results are catastrophic for our clients and for us – too horrible even to consider. The same applies if I’m wrong. So, tell me, ‘Oh great and powerful BIDA wizard,’ how is your – what did you call it, ‘distributed data processing system’ – going to make sure I’m always right and always on time?”
For discussion:
– How will you explain to your boss how the SCV (speed, consistency, volume) principle applies to their demands?
– Describe a data processing model that will not address your boss’s requirements for both speed and accuracy.
– If there is such a thing, then describe a model that will address your boss’s requirements (and explain how you will spend the bonus you will receive upon its successful implementation).
Discussion 3: Identify Patterns in Data (10%)
This assignment relates to the following Course Learning Requirements:
CLR4 – Explain the fundamental concepts of data mining and data analytics
CLR7 – Manipulate large data sets and solve simple statistical and mathematical problems in order to support data analysis
CLR8 – Conduct elementary data mining to identify patterns in data
Objective of this Assignment
– Explain the methods and analysis to be used to test a hypothesis.
For this assignment, read the following scenario:
You tell your boss, “I’ve been working in the business intelligence and data analytics field for years. I know what I’m talking about. All of our competitors train the elves on their analytics teams. We give ours pixie dust. If we train our elves, then we will increase our data processing capacity by thirteen percent.”
Note: Pixie dust is to elves as caffeine is to humans, a stimulant. Training is to elves as it is to humans, a way to enhance skills.
Your boss thinks they are playing devil’s advocate (. . . you know for a fact they really are the devil’s advocate . . .) and replies, “Well, I know a thing or two myself, ‘wizardly one.’ Elves are smart. They don’t need training. If we do what you propose, then our data processing capacity will not improve by one iota, let alone thirteen percent. Prove me wrong.”
“Challenge accepted,” you declare, march out of the ivory tower and straight to your grotto, which is crammed with techno-computing junk, plop yourself down on your throne made of melted hard drives, and get to thinking about how you will show your boss, yet again, you are not, not right.
For discussion:
– Your search of open and commercial data sources comes up empty. There are no data extant that connect pixie dust and/or training to elven data processing efficiency. You have no choice but to conduct your own research on the one hundred elves who work on the company’s analytics team, which means you will need to answer some important questions.
– What sampling approach will you employ and how will you do it?
– What type of dataset will host your data and what will it look like?
– What type of analysis will you conduct and how will you present the results?
– What do you need to learn from the research?
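One possible way to approach the questions above is sketched below in Python. It is an illustration only, not a model answer. It assumes a simple random split of the one hundred elves into a training group and a pixie dust group, a tidy dataset with one row per elf, and a one-sided permutation test of the boss's claim (the null hypothesis: training does not improve capacity). The column names, the helper functions assign_groups and permutation_p_value, and every number in the simulated measurements are hypothetical placeholders; real measurements from the study would replace them.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def assign_groups(elf_ids, seed=0):
    """Simple random assignment: shuffle the elves, then split them in half."""
    rng = random.Random(seed)
    shuffled = list(elf_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "training": sorted(shuffled[:half]),
        "pixie_dust": sorted(shuffled[half:]),
    }


def permutation_p_value(treated, control, n_permutations=5000, seed=0):
    """One-sided p-value for H1: mean(treated) > mean(control).

    Group labels are reshuffled many times; the p-value is the share of
    reshuffles whose mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Sampling: all 100 elves take part, 50 per group (an assumed design).
groups = assign_groups(range(1, 101), seed=42)

# Dataset: one row per elf. The pct_change values are SIMULATED PLACEHOLDERS,
# not real results; the assumed effect sizes exist only to exercise the code.
simulator = random.Random(7)
rows = []
for group, members in groups.items():
    assumed_effect = 13.0 if group == "training" else 0.0  # hypothetical
    for elf_id in members:
        rows.append({
            "elf_id": elf_id,
            "group": group,
            "pct_change": round(simulator.gauss(assumed_effect, 5.0), 2),
        })

# Analysis: compare the mean percentage change in capacity between groups.
trained = [r["pct_change"] for r in rows if r["group"] == "training"]
dusted = [r["pct_change"] for r in rows if r["group"] == "pixie_dust"]
p_value = permutation_p_value(trained, dusted)

print(f"mean change (training):   {mean(trained):.2f}%")
print(f"mean change (pixie dust): {mean(dusted):.2f}%")
print(f"one-sided permutation p-value: {p_value:.4f}")
```

A permutation test is used here only because it needs no distributional assumptions and no third-party libraries; a two-sample t-test or a paired before-and-after design would be equally reasonable choices to argue for in the discussion.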