Ethical Implications of Data Aggregation
Michael McFarland, SJ
One powerful new capability the computer gives us is the ability to compile large amounts of data from disparate sources to create a detailed composite picture of a person or to identify people who meet some criterion or stand out in some way. This has numerous uses and abuses.
One application of this is with what David Burnham calls "transaction data." Now that many ordinary daily activities, such as making a telephone call, purchasing an item with a credit card, and renting a video, are computerized, the details of all of these transactions are recorded and saved. There are legitimate reasons for collecting this information: billing, inventory, predicting future needs and so on. But out of this mass of seemingly innocent details, an enterprising sleuth can assemble a revealing portrait of a person and his or her activities. Phone records can disclose a person's movements, friends, associates, business dealings, preferences and perversions. They could tell us, for example, that Jackie calls his mother every morning, his bookie at noon, and a phone-sex number at least three evenings a week. They can tell us that he was in Atlanta last week, where he spoke with a suspected drug dealer. 1 Credit card records give details on one's travel, taste, habits, schedule and lifestyle. They can tell us that Ophelia likes Polo Sport perfume, Gap jeans, and underwear from Victoria's Secret, that she reads Vanity Fair and Cosmopolitan, that she just bought a new HDTV and that she has been to Paris twice in the past year. And when Robert Bork was nominated for the Supreme Court, we found out what can be learned from records of video rentals.
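To make the mechanism concrete, here is a minimal sketch of how records kept for separate, legitimate purposes can be merged into one composite picture. Everything in it is hypothetical: the customer IDs, the field names, the sample records and the `build_profiles` helper are illustrative assumptions, not a description of any real company's system.

```python
from collections import defaultdict

# Hypothetical records from three unrelated systems, each keyed by a customer ID.
phone_calls = [
    {"customer": "C-17", "callee": "mother", "hour": 8},
    {"customer": "C-17", "callee": "travel agent", "hour": 12},
    {"customer": "C-42", "callee": "dentist", "hour": 15},
]
card_purchases = [
    {"customer": "C-17", "merchant": "bookstore", "city": "Atlanta"},
    {"customer": "C-42", "merchant": "pharmacy", "city": "Boston"},
]
video_rentals = [
    {"customer": "C-17", "title": "documentary"},
]


def build_profiles(**sources):
    """Merge per-source records into one dossier per customer (illustrative helper)."""
    profiles = defaultdict(lambda: defaultdict(list))
    for source_name, records in sources.items():
        for record in records:
            # Keep every field except the shared key that links the sources.
            details = {k: v for k, v in record.items() if k != "customer"}
            profiles[record["customer"]][source_name].append(details)
    return profiles


profiles = build_profiles(phone=phone_calls, card=card_purchases, video=video_rentals)

# A composite picture of one person, assembled from all three sources.
dossier = dict(profiles["C-17"])

# The same merged data can single out everyone who meets some criterion.
in_atlanta = sorted(
    customer
    for customer, profile in profiles.items()
    if any(item["city"] == "Atlanta" for item in profile.get("card", []))
)
```

No single list says much on its own. The shared identifier is what lets the separate records be combined into a dossier or searched for people who fit a pattern.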
Online retailers such as Amazon.com collect similar data on their customers. They collect and store information not only on customers' purchases but also on what they looked at on the Web site. This is used to build profiles of their millions of customers, which are then used to make personalized recommendations and place individually targeted ads whenever a customer visits their site. Large data brokers draw on many sources to build massive databases with detailed records on hundreds of millions of consumers, then sell the information, quite legally, to a wide variety of marketers. One reporter described the largest of these, a little known company in Arkansas named Acxiom, this way: "It peers deeper into American life than the F.B.I. or the I.R.S. or those prying digital eyes at Facebook and Google. If you are an American adult, the odds are that it knows things like your age, race, sex, weight, height, marital status, education level, politics, buying habits, household health worries, vacation dreams – and on and on." 2
Search services like Google, AOL and Yahoo! compile vast amounts of data on the searches of all their visitors. These seemingly innocent little bits of data, when taken together, can be very revealing. From a person's search queries, one could infer, rightly or wrongly, medical and psychological issues, legal problems, employment status, personal interests, sexual activities and preferences, relationships, fantasies, economic circumstances, geographical location and a host of other characteristics. 3 Taken together they can suggest a fairly comprehensive portrait of a person, including that person's most intimate problems and vulnerabilities. It may seem that this information is anonymous for unregistered users, but it is connected to the network address from which the queries come, so it can be traced to a particular computer. Moreover, little bits of data called cookies, left on the user's computer by other interactive sessions, can include logins and other identifying information and are often enough to uncover the identity of the user. 4 Even without that information, the pattern of inquiries itself is sometimes enough to narrow down the user's origins and identity, sometimes to a single individual. 5
The data accumulated and stored by Facebook and other social networks can also be very revealing, especially when compiled and analyzed to find patterns and correlations. A great deal can be inferred about a person from his or her associates or "friends" and their circles of "friends," as well as all the thoughts, reflections, activities, commitments, "likes" and other information shared among them. As just one example, a couple of MIT researchers, by analyzing the Facebook profiles of some 4000 students, were able to tell with a fair degree of accuracy (78 percent) which ones belonged to gay males. 9 As with sites such as Google and Yahoo!, the myriad bits of information collected by Facebook can be used to compile a detailed and penetrating profile of an individual, one that is if anything even more personal. In the hands of a company that professes a philosophy of "radical transparency," this can leave the subject very vulnerable, not just to having personal information exposed but to being discriminated against because of it. 10
Even information that is accessible to the public, when assembled from different sources into a comprehensive dossier, can create a revealing picture of a person. A simple Google search can turn up an enormous amount of information about an individual, though the accuracy of much of it is questionable. As one researcher put it, "while the quantity of publicly available information about individuals to be found online is vast, it is riddled with inaccuracies." 11 If one is willing to pay, even more is available. A "credit header," which is available for a small fee to anyone who has, or claims to have, a business interest in a person, gives the person's Social Security number, date of birth and list of addresses. With the Social Security number, it is possible then to obtain the person's driving history. For a little more money an investigative service will find the person's criminal history, education and employment record. 12 A careful search of court records, many of them now online, can also reveal the person's history of marriages and divorces, civil suits, property holdings, liens, bankruptcies and so on.
The computer's ability to assemble these seemingly innocent and insignificant facts into a comprehensive personal profile, and to make that profile widely available, gives the information a different significance. Even though limited groups of people may have legitimate reasons to have access to some of those facts for specific purposes, when the facts are all put together into a dossier they become much more personal and invasive. They thus present many of the dangers of other invasions of privacy. The information can be used for purposes other than those for which it was intended. For example, information provided for billing purposes can reveal a person's movements, whereabouts and habits. The subject loses control of who knows what about him or her and what they do with it. And strangers can get a much more intimate look at the person's life than the person would allow if consulted. 13
In addition, when a determined inquirer can get such a comprehensive picture of a person, whether accurate or not, there is an increased danger of misuse, prejudice and discrimination. Often the purpose of such investigations is to make judgments about people. Someone could be denied a mortgage, a job or health insurance because his or her profile fits the pattern of a "high risk" prospect. The person may live in the wrong neighborhood, associate with the wrong people, hang out in the wrong places or have a suspicious history. Certain young children may be judged "at risk" because of the personal profiles the school or the state has developed on them, and placed in school accordingly. That designation could then follow those children through school, denying them the chance to develop normally with their peer group. Whole neighborhoods, families or ethnic groups could also be subject to discrimination because they have "dangerous" profiles, a practice that has come to be known as "Weblining," by analogy with the practice of "redlining," where certain neighborhoods were deemed not eligible for mortgages because of their demographics. 14
Apart from the obvious potential for error and prejudice, this use of profiling is objectionable because it dehumanizes those being judged, as well as those making the judgments. It substitutes calculation for human judgment on what should be very sensitive human issues, and thus treats those profiled as objects, as collections of facts, rather than as persons. 15
Michael McFarland, S.J., a computer scientist with extensive liberal arts teaching experience and a special interest in the intersection of technology and ethics, served as the 31st president of the College of the Holy Cross.
1. Burnham, pp. 55-56. Here he describes the government's use of phone records to investigate the rather unusual activities of Billy Carter during his brother's presidency.
2. Natasha Singer, "You for Sale: Mapping, and Sharing, the Consumer Genome," The New York Times, (June 17, 2012), p. BU1.
3. Andrews, op. cit., pp. 26-29.
4. Omer Tene, "What Google Knows: Privacy and Internet Search Engines," Utah Law Review, (2008), pp. 1433-54.
5. Michael Barbaro and Tom Zeller, Jr., "A Face Is Exposed for AOL Searcher No. 4417749," The New York Times, (August 9, 2006), http://www.nytimes.com/2006/08/09/
6. Kevin Bankston of the Electronic Frontier Foundation, quoted in Robert L. Mitchell, "What Google knows about you," Computerworld, (May 11, 2009), http://www.computerworld.com/s/article/
7. Benny Evangelista, "Privacy concerns growing, poll finds," The San Francisco Chronicle, (March 10, 2012), p. D1.
9. Steve Lohr, "How Privacy Vanishes Online," The New York Times, (March 16, 2010), http://www.nytimes.com/2010/03/17
10. Lori Andrews, "Facebook is Using You," The New York Times, (February 4, 2012), http://www.nytimes.com/2012/02/05/opinion/sunday/facebook-is-using-you.html
11. Robert L. Mitchell, "What the Web knows about you: How much private information is available about you in cyberspace? Social Security numbers are just the beginning," Computerworld, (January 27, 2009), http://www.computerworld.com/s/article/
12. Jeffrey Rothfeder, "A Personal Investigation," PC World, 13(11), (November, 1995): 152.
13. Jaikumar Vijayan, "Online Data Broker Spokeo Settles FTC Charges for $800,000," Computerworld, (June 12, 2012), http://www.computerworld.com/s/article/9228024/
14. Lori Andrews, I Know Who You Are and I Saw What You Did, op. cit., pp. 20-21.
15. Daniel J. Solove, "Information-Age Privacy Concerns Are More Kafkaesque Than Orwellian," The Chronicle of Higher Education, (December 10, 2004), pp. B6-B8.
Jun 1, 2012