  1. Your CEO wants your Big Data database platform to log and store every click made on every page by every consumer who interacts with the company’s website, plus some additional metadata about each click (data about the data). Based on this scenario, which “V’s” of Big Data are relevant here?
  2. Your CMO wants to test the response to a new kimchi recipe specifically designed to appeal to the Korean community. He asks you to run a test and measure the response based on a sample of 500 people. He suggests that the first 500 people with a last name longer than 5 letters be included in the test, and that anyone with a last name shorter than 5 letters be excluded. Is this a good way to assemble a sample to test the response to the product? Why or why not?
  3. In a meeting, a Senior Manager says that a “negative correlation” doesn’t sound like a desirable result, while a “perfect correlation” sounds like a good thing, so he is confused about the meaning of a “perfectly negative correlation”. How would you respond?
  4. You work for an airline, and the customer database has information on Relationship Status as well as the amount paid for recent travel tickets. Relationship Status is coded as: 0: Single, 1: Married, 2: Divorced, 3: Widowed, 4: Committed, 5: It’s Complicated. Amount spent is stored in the database as actual dollars paid. Your Chief Marketing Officer wants to determine whether there is a correlation between the amount spent and the customer’s relationship status. She asks you to calculate the correlation between relationship status and amount spent. Is this a good idea, or not? Explain.
  5. You are in a meeting with the CMO of a major retailer discussing a simple regression model to predict Amount_Spent (Y) from Age (X). The model has a high R-squared value, and everything about the final model is statistically significant. The coefficient for AGE is 0.00000001234. The CMO points out that the coefficient for AGE is almost zero, and anything multiplied by zero is zero, so AGE really doesn’t matter in the model. Agree? Disagree? How would you respond?
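
Questions 4 and 5 both turn on how a number depends on the way a variable is encoded or scaled. The following is a minimal Python sketch of that idea, not an answer key. All data values, the alternative coding, and the helper names `pearson` and `slope` are hypothetical and do not come from the questions above.

```python
# Toy illustration for questions 4 and 5. All numbers are made up.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def slope(xs, ys):
    """Least-squares slope of the simple regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov / var_x


# --- Question 4: numeric codes for a nominal variable are arbitrary ---
statuses = ["Single", "Married", "Divorced", "Widowed",
            "Committed", "It's Complicated"]
amounts = [100, 200, 300, 400, 500, 600]  # hypothetical dollars paid

# Coding from the question: 0..5 in the listed order.
coding_original = {name: code for code, name in enumerate(statuses)}
# An equally valid relabelling (assumption): the same labels, reversed codes.
coding_relabelled = {name: 5 - code for code, name in enumerate(statuses)}

r_original = pearson([coding_original[s] for s in statuses], amounts)
r_relabelled = pearson([coding_relabelled[s] for s in statuses], amounts)
print("r with original codes:  ", r_original)
print("r with relabelled codes:", r_relabelled)

# --- Question 5: a coefficient's size depends on the units of X ---
ages_years = [20, 30, 40, 50, 60]          # hypothetical ages
spent = [120, 150, 200, 230, 300]          # hypothetical Amount_Spent
SECONDS_PER_YEAR = 31_557_600              # 365.25 days
ages_seconds = [a * SECONDS_PER_YEAR for a in ages_years]

slope_years = slope(ages_years, spent)
slope_seconds = slope(ages_seconds, spent)
print("slope per year:  ", slope_years)
print("slope per second:", slope_seconds)
print("r (years):  ", pearson(ages_years, spent))
print("r (seconds):", pearson(ages_seconds, spent))
```

In this toy data the customers and amounts never change, yet relabelling the status codes flips the sign of the correlation. Likewise, expressing age in seconds shrinks the slope by a factor of about 31.6 million while the correlation, and therefore the fit, stays the same.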