Four More Topics

Looks like I am late to the party on the four additional topics. Some of the ones that I have chosen have already been discussed, but here they are:

Creep Factor – I think something on people’s creep factor and how it has changed over time would be helpful for this. Some companies (and people) seem to have some trouble with this, and it might be helpful to see where it is now and where it was a long time ago. Also in this section, we could reassure our clients that all of the information that we are gathering is information that the individual has decided to share on their own. It’s not like we’re going through phone records, credit card receipts and the like.

Examples – I am a HUGE proponent of examples, especially when we are talking about things that clearly not a lot of people understand. I was thinking we should have some examples of what is currently being done with open graph: what companies have tried it, how it’s worked, etc. My thinking is that once people see a real-life example, they may be more willing to listen (and even excited to listen) about the ways we have come up with for them to use the information.

Definition of Permeable Data Sources, Open Graph – I’m not talking about a glossary or anything, but I think explaining what it is we will be referring to for the rest of the package would be helpful.

Moneyball Approach – I was thinking about something like this and I think Jackie summed it up the best so I will defer to her on this one. I know, for me, it was nice reading something that had to do with changing the way things are done.

Other Topics to Include

I like everyone’s ideas on the additional four chapters. Sometimes procrastination has its advantages. Believe it or not, these are the four topics I was going to think of before you all did.

1. Types of data – an explanation of different types of data, why they are important and how each type is used/can be used

2. Competition – how other companies are using this data already, good examples and bad examples of companies

3. Personalization – definitely important to include. Like Jackie said, we may want to look at the difference between having the data and using the data effectively, and include examples.

4. Future Technology – I agree with Ben that it’s necessary to include where big data is going and what could be expected for the future.


Another four topics…

So far, it looks like there are some good ideas out there.  Some of the following ideas may overlap with others that have already been posted, but I think we’ll find a consensus amongst the class once we’ve all posted our ideas.  So here goes:

1. The evolution of sharing (i.e. pushing the creep factor):  While I think many company representatives might be excited about the possibilities of this new technology, some of our data methods may be off-putting.  For me, some context could help alleviate any stress about this.  We’ve talked a lot in class about the creep factor and how social media is constantly pushing the envelope as to what people are willing and not so willing to share in the public domain.  That being said, a quick piece on the historical evolution of more and more sharing might help put those who have a “creep factor” problem at ease.

2. Online Consumer Protection Laws in the U.S.: I think we need to address what laws are being proposed that could affect this project going forward.  While this is a potential downside, say if Europe’s laws prevail as the norm, the recent proposal of guidelines for companies is worthy of discussion here.

3. Online Consumer Data Protections Elsewhere: You can’t talk about U.S. online privacy protections without talking about what other countries are doing in terms of data protection.  Europe, China and India may all be worthy of discussion here.

4. The Importance of Transparency, User Control: Recently, the social networking website Path had a bit of a PR crisis on its hands when users were alerted to the fact that personal data from their cell phone address books was being stored by the company.  This example, along with several others out there, provides good context about the importance of creating products for our companies that are transparent in their efforts and offer users control of their data.


The previous posts really get at the technological side of our company reports. I’ll try to refrain from redundancy. It is essential to convince our target companies that following our recommendations (effectively utilizing Big Data) will positively alter existing and potential constituencies’ behavior and interaction with company products.

First and foremost we must show our target companies that WE, the “really gets me” team, really get our respective companies. We should include a segment explaining which constituencies each company currently targets as well as potential consumers.

We should include articles that have taught us the value of eliminating inefficiencies such as the ideas presented in Moneyball as well as methods companies can utilize to inject themselves into the habits of their consumers. Such as:

We should explain and recommend proper usage of permeable data. We should assure companies that we are not trying to pimp out their target constituencies’ data, but to use it efficiently with all the proper permissions. Make sure the companies know that users are opting into sharing their data.

Include what companies are already using permeable data for and how they can use it better. I know this is kinda scary, but we should refrain from telling companies whether they suck at using free data and instead suggest methods by which they can improve.


4 Topics

Looking at some of the other posts, I think we have a lot of good ideas for how to include more information that will enhance our report.

4 Topics that have come to my mind for us to include in the report are Opengraph, Types of Data, Data Efficiency and Future Technologies.


1-Opengraph

-Explanation of what it is and how it can be used for efficiency. An in-depth look at Opengraph and why it’s important in data efficiency.

2-Types of Data

-Open and Closed

-Structured and Unstructured


3- Data Efficiency

-The importance of data efficiency. Moneyball and some of the examples we have read about in the articles can be used in this section to reinforce the idea of why data efficiency is critical.

4-Future Technologies

-What type of technological advancements or capabilities should be expected in data or new media technology in the future.

Topics to Explore

Here are a few topics we should consider exploring for the white-paper. Each one of these subjects should warrant a good bit of individual attention, minimizing overlap.

Social Capital

How do we measure the value of social relationships?


“In The Forms of Capital, Pierre Bourdieu distinguishes between three forms of capital: economic capital, cultural capital and social capital. He defines social capital as “the aggregate of the actual or potential resources which are linked to possession of a durable network of more or less institutionalized relationships of mutual acquaintance and recognition.” His treatment of the concept is instrumental, focusing on the advantages to possessors of social capital and the ‘deliberate construction of sociability for the purpose of creating this resource.'”


“There is no widely held consensus on how to measure social capital, which has become a debate in itself: why refer to this phenomenon as ‘capital’ if there is no true way to measure it?”

Sentiment Analysis

How do we extract subjective information from source materials?

Importance, Accuracy, Applications

Opinion mining from noisy text data
Twitter mood maps reveal emotional states of America
Automatic Identification of Pro and Con Reasons in Online Reviews
Opinion Mining and Sentiment Analysis
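
To make the question above a little more concrete, here is a minimal sketch of the simplest possible approach: counting positive and negative words. The word lists and the function names `score_sentiment` and `label_sentiment` are made up for this example and do not come from any of the articles listed; real opinion-mining systems deal with sarcasm, negation and noisy text in ways this does not.

```python
import re

# Tiny hand-picked word lists; hypothetical and for illustration only.
POSITIVE = {"love", "great", "good", "happy", "excellent"}
NEGATIVE = {"hate", "bad", "terrible", "sad", "awful"}


def score_sentiment(text):
    """Return the number of positive words minus the number of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(1 for word in words if word in POSITIVE)
    negative_hits = sum(1 for word in words if word in NEGATIVE)
    return positive_hits - negative_hits


def label_sentiment(text):
    """Turn the raw score into a coarse label."""
    score = score_sentiment(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for post in ["I love this great phone", "This is a terrible, awful idea"]:
        print(post, "->", label_sentiment(post))
```

Even a toy like this shows why accuracy is an open question: a post such as "not bad at all" would be scored as negative.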


What types of data-sets are out there?

Data, data everywhere: Information has gone from scarce to superabundant.
That brings huge new benefits, says Kenneth Cukier, but also big headaches


How do we efficiently process large quantities of data within tolerable elapsed times?

Big data sizes are a constantly moving target, currently ranging from a few dozen terabytes to many petabytes of data in a single data set.
Facebook handles 40 billion photos from its user base.
Decoding the human genome originally took 10 years to process; now it can be achieved in one week.

Sandia sees data management challenges spiral
Community cleverness required


Is this crowd-sourcing thing still worth the buzz?

Crowd Sourcing Turns Business On Its Head
Looking Forward – Emerging and Declining Networks for 2009
Crowdsourcing without a Crowd: Levia’s Failed Attempt

Permeable Data Sources 

Who has the keys to the lock box?

Dispatch Box: On the road to Open Data
Open Government Data Catalogues 
Protocol for Implementing Open Access Data

New concepts:

time banking

social gaming


9 types of business models: Which one are you?

And Still More Chapter Topics

So a lot has already been covered, and I think all of the previously mentioned topics are great candidates for articles. Here are a few more to add into the mix:

An article about how other companies are using Opengraph technology: For those companies that err on the side of caution when it comes to Big Data, it might be beneficial to show them what other companies are doing to take Big Data to the next level.

An article about the possible online privacy bill of rights currently brewing in the White House: This article would show how Big Data retains the anonymity of participants while collecting information that benefits the companies that use it. It would also show the difference between aggregate data, which is perfectly acceptable to use, versus identifiable data, which most likely would violate the online privacy bill of rights.
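
To illustrate the aggregate-versus-identifiable distinction, here is a small sketch. The records, field names and the `aggregate_likes` function are invented for this example, and whether any particular summary would satisfy a given privacy rule is not something the sketch can settle.

```python
from collections import Counter

# Hypothetical user records; every name and field here is made up.
records = [
    {"name": "User A", "city": "Athens", "likes": "baseball"},
    {"name": "User B", "city": "Atlanta", "likes": "baseball"},
    {"name": "User C", "city": "Athens", "likes": "cooking"},
]


def aggregate_likes(rows):
    """Count how many people share each interest, keeping no names or cities."""
    return Counter(row["likes"] for row in rows)


if __name__ == "__main__":
    # Identifiable data: each row points at one person.
    print(records[0])
    # Aggregate data: only totals survive.
    print(dict(aggregate_likes(records)))
```

The first print shows what a single identifiable row looks like; the second shows only counts per interest, with nothing tying a count back to a person.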

I absolutely agree with Jackie’s opinion to include Moneyball as a chapter. Getting companies to understand our definition of efficiency is crucial to the client recruitment package as a whole.

Kelsi also had a great point about including the regression equation in order to figure out a company’s y. I’m not sure if it should be in our individual chapters or in the mini-articles at the beginning, but regardless, this information should definitely be included.

My personal opinion is that we need to include some of the assumed pitfalls of Big Data. If it has a reputation, we need to address it head-on in order to ease future clients’ concerns. Show them their competitors using it. Show them the controversy, and then defuse it with the facts surrounding Big Data. Our opening articles should be informative but should also advocate the technology we’re trying to sell. If the clients are comfortable and they like what we tell them, why wouldn’t they go with us?

More chapter topics

I like Jackie’s ideas for the other four chapter topics…and here are a few more.

Expanding on the personalization topic she talked about, we could also include how personalization versus aggregate data is important when it comes to the consumers.


  • how it works
  • what the information pulled from it is being used for
  • Facebook Connect

Facebook privacy (Dr. Shamp touched on this in his regulation section)

  • where your data is going
  • US vs. Europe
  • the creep factor and how RGM can avoid being creepy

Company regression equations

  • Finding out what the “Y” of your company is
  • How to get the most out of data to achieve this goal
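
As a rough sketch of what a company regression equation could look like in practice, the snippet below fits a straight line y = a + b*x by ordinary least squares. The numbers are invented, and treating x as something like weekly Facebook likes and y as weekly sales is purely an assumption for illustration; each company would pick its own "Y".

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y move together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


if __name__ == "__main__":
    likes = [1, 2, 3, 4]   # hypothetical x values
    sales = [3, 5, 7, 9]   # hypothetical y values (the company's "Y")
    intercept, slope = fit_line(likes, sales)
    print("Y =", intercept, "+", slope, "* x")
```

The slope is the part a client would care about: it estimates how much the "Y" moves for each extra unit of the data being tracked.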


An extra four…

Explanation of Data

  • Structured/Unstructured
  • Open/Closed
  • Permeable
  • and how and where Facebook & the Open graph fit in the mix


  • the difference between just having data and using that data
  • giving examples, such as the use of Triggers, previous searches, and information voluntarily shared on Facebook such as location, occupations, likes, favorites, etc.


  • What the Facebook IPO means for the REALLY GETS ME movement


  • Using Moneyball as the prime example
  • Spelling out the Really Gets Me
  • Making the Data work

Taking Datamining a Step Further

Hey guys,

I was on CNN’s tech site and found this article to be particularly interesting:

Manage and (sell?) your data online

This could work, assuming we as consumers know what we want…