Case Studies: Recent FTC Enforcement Actions - High-Profile Cases of Privacy Violation: Uber, Emp Media, Lenovo, Vizio, VTech, LabMD

Smith Gambrell & Russell

Uber Technologies

The scenario: In August 2018, the FTC announced an expanded settlement with Uber Technologies for its alleged failure to reasonably secure sensitive data in the cloud, resulting in a data breach of 600,000 names and driver's license numbers, 22 million names and phone numbers, and more than 25 million names and email addresses.

The settlement: The expanded settlement is a result of Uber's failure to disclose a significant data breach that occurred in 2016 while the FTC was conducting the investigation that led to the original settlement. The revised proposed order includes provisions requiring Uber to disclose any future consumer data breaches, submit all reports from third-party audits of Uber's privacy policy, and retain reports on unauthorized access to consumer data. [2]

Emp Media Inc. (Myex.com)

The scenario: The FTC joined forces with the State of Nevada to address privacy issues arising from the "revenge" pornography website Myex.com, run by Emp Media Inc. The website allowed individuals to submit intimate photos of victims along with personal information such as name, address, phone number and social media accounts. If a victim wanted their photos and information removed from the website, the defendants reportedly charged fees of $499 to $2,800 to do so.

The settlement: On June 15, 2018, the enforcement action brought by the FTC led to a shutdown of the website and permanently prohibited the defendants from posting intimate photos and personal information of other individuals without their consent. The defendants were also ordered to pay more than $2 million. [3]

Lenovo and Vizio

The scenario: In 2018, FTC enforcement actions led to large settlements with technology manufacturers Lenovo and Vizio. The Lenovo settlement related to allegations the company sold computers in the U.S. with pre-installed software that sent consumer information to third parties without the knowledge of the users. With the New Jersey Office of Attorney General, the FTC also brought an enforcement action against Vizio, a manufacturer of "smart" televisions. Vizio entered into a settlement to resolve allegations it installed software on its televisions to collect consumer data without the knowledge or consent of consumers and sold the data to third parties.

The settlement: Lenovo entered into a consent agreement to resolve the allegations through a decision and order issued by the FTC. The company was ordered to obtain affirmative consent from consumers before running the software on their computers and to implement a software security program on preloaded software for the next 20 years. [4] Vizio agreed to pay $2.2 million, delete the collected data, disclose all data collection and sharing practices, obtain express consent from consumers to collect or share their data, and implement a data security program. [5]

VTech

The scenario: The FTC's action against toy manufacturer VTech was the first time the FTC became involved in a children's privacy and security matter.

The settlement: In January 2018, the company entered into a settlement to pay $650,000 to resolve allegations it collected personal information from children without obtaining parental consent, in violation of COPPA. VTech was also required to implement a data security program that is subject to audits for the next 20 years. [6]

LabMD

The scenario: LabMD, a cancer-screening company, was accused by the FTC of failing to reasonably protect consumers' medical information and other personal data. Identity thieves allegedly obtained sensitive data on LabMD consumers due to the company's failure to properly safeguard it. The billing information of 9,000 consumers was also compromised.

The settlement: After years of litigation, the case was heard before the U.S. Court of Appeals for the Eleventh Circuit. LabMD argued, in part, that data security falls outside of the FTC's mandate over unfair practices. The Eleventh Circuit issued a decision in June 2018 that, while not stripping the FTC of authority to police data security, did challenge the remedy imposed by the FTC. [7] The court ruled that the cease-and-desist order issued by the FTC against LabMD was unenforceable because it required the company to implement a data security program adhering to a standard of "reasonableness" that was too vague. [8]

The ruling points to the need for the FTC to provide greater specificity in its cease-and-desist orders about what is required of companies that allegedly fail to safeguard consumer data.

[1] 15 U.S.C. § 45(a)(1)

[2] www.ftc.gov/news-events/press-releases/2018/04/uber-agrees-expanded-settlement-ftc-related-privacy-security

[3] www.ftc.gov/system/files/documents/cases/emp_order_granting_default_judgment_6-22-18.pdf

[4] www.ftc.gov/news-events/press-releases/2018/01/ftc-gives-final-approval-lenovo-settlement

[5] www.ftc.gov/news-events/press-releases/2017/02/vizio-pay-22-million-ftc-state-new-jersey-settle-charges-it

[6] www.ftc.gov/news-events/press-releases/2018/01/electronic-toy-maker-vtech-settles-ftc-allegations-it-violated

[7] The United States Court of Appeals for the Third Circuit has rejected this argument. See FTC v. Wyndham Worldwide Corp., 799 F.3d 236, 247-49 (3d Cir. 2015).

[8] www.media.ca11.uscourts.gov/opinions/pub/files/201616270.pdf

Marcia M. Ernst

Social Media & Privacy: A Facebook Case Study

October 2015

Marise Haumann

Figure: The number of active monthly Facebook users as of mid-2015 (in millions) (Statista, 2015).


The International Forum for Responsible Media Blog

Top 10 Privacy and Data Protection Cases of 2018: a selection

In this post we round up some of the most legally and factually interesting privacy and data protection cases from England and Europe from the past year.

  • Cliff Richard v. The British Broadcasting Corporation [2018] EWHC 1837 (Ch)

This was Sir Cliff Richard's privacy claim against the BBC and was the highest profile privacy case of the year. The claimant was awarded damages of £210,000. We had a case preview and case reports on each day of the trial, and posts from a number of commentators including Paul Wragg, Thomas Bennett (first and second) and Jelena Gligorijević. The BBC subsequently announced that it would not seek permission to appeal.

  • ABC v Telegraph Media Group Ltd [2018] EWCA Civ 2329

This was perhaps the second most discussed privacy case of the year. The Court of Appeal allowed the claimants' appeal and granted an interim injunction to prevent the publication of confidential information about alleged "discreditable conduct" by a high profile executive. Lord Hain subsequently named the executive as Sir Philip Green. We had a case comment from Persephone Bridgman Baker. We also had comments criticising Lord Hain's conduct from Paul Wragg, Robert Craig and Tom Double.

  • Ali v Channel 5 Broadcast [2018] EWHC 298 (Ch)

The claimants had featured in a "reality TV" programme about bailiffs, "Can't Pay? We'll Take It Away". Their claim for misuse of private information was successful and damages of £20,000 were awarded. We had a case comment from Zoe McCallum. An appeal and cross appeal were heard on 4 December 2018 and judgment is awaited.

  • NT1 and NT2 v Google Inc [2018] 3 WLR 1165

This was the first "right to be forgotten" claim in the English Courts, with claims in both data protection and privacy. Both claimants had spent convictions; one was successful and the other was not. We had a case preview from Aidan Wills and a comment on the case from Iain Wilson.

  • Lloyd v Google LLC [2018] EWHC 2599 (QB)

This was an attempt to bring a "representative action" in data protection on behalf of all iPhone users in respect of the "Safari Workaround". The representative claimant was refused permission to serve Google out of the jurisdiction. We had a case comment from Rosalind English. There was a Panopticon Blog post on the case. The claimant has been given permission to appeal and it is likely that the appeal will be heard in late 2019.

  • TLU v Secretary of State for the Home Department [2018] EWCA Civ 2217

The Court of Appeal dismissed an appeal in a "data leak" case on the issue of liability to individuals affected by a data leak but not specifically named in the leaked document. We had a case comment from Lorna Skinner and further comment from Iain Wilson. There was also a Panopticon Blog post.

  • Stunt v Associated Newspapers [2018] EWCA Civ 170

The Court of Appeal referred to the CJEU the question of whether the "journalistic exemption" in section 32(4) of the Data Protection Act 1998 is compatible with the Data Protection Directive and the EU Charter of Fundamental Rights. There was a Panopticon Blog post on the case.

  • Various Claimants v W M Morrison Supermarkets plc [2018] EWCA Civ 2339

The Court of Appeal upheld the decision of Langstaff J that Morrisons were vicariously liable for a mass data breach caused by the criminal act of a rogue employee. We had a case comment from Alex Cochrane. There was a Panopticon Blog post on the case.

  • Big Brother Watch v Secretary of State [2018] ECHR 722

An important case in which the European Court of Human Rights held that secret surveillance regimes, including the bulk interception of external communications, violated Articles 8 and 10 of the Convention. We had a post by Graham Smith on the implications of this decision for the present regime.

  • ML and WW v Germany [2018] ECHR 554

This was the first case in the European Court of Human Rights on the "right to be forgotten": an application under Article 8 in respect of the historic publication by the media of information concerning a murder conviction. The application was dismissed. We had a case comment from Hugh Tomlinson and Aidan Wills. There was also a Panopticon blog post on the case.


Markkula Center for Applied Ethics

Privacy, Technology, and School Shootings: An Ethics Case Study

The ethics of social media monitoring by school districts.


In the wake of recent school shootings that terrified both campus communities and the broader public, some schools and universities are implementing technical measures in the hope of reducing such incidents. Companies are pitching various services for use in educational settings; those services include facial recognition technology and social media monitoring tools that use sentiment analysis to try to identify (and forward to school administrators) student posts on social media that might portend violent actions.

A New York Times article notes that “[m]ore than 100 public school districts and universities … have hired social media monitoring companies over the past five years.” According to the article, the costs for such services range from a few thousand dollars to tens of thousands per year, and the programs are sometimes implemented by school districts without prior notification to students, parents, or school boards.

The social media posts that are monitored are public; the monitoring tools use algorithms to analyze them and decide which ones get forwarded for review.
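To make that mechanism concrete, below is a minimal sketch of how such a flagging tool might work, using the open-source VADER sentiment analyzer. The watchlist, threshold, and flagging rule are illustrative assumptions, not any vendor's actual product.

```python
# Minimal sketch of a sentiment-based flagging tool (illustrative only).
# Requires the open-source vaderSentiment package: pip install vaderSentiment
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

# Hypothetical keyword watchlist and threshold a district might configure.
WATCHLIST = {"gun", "shoot", "hurt", "revenge"}
NEGATIVITY_THRESHOLD = -0.3  # VADER's compound score runs from -1 to +1

def flag_post(text: str) -> bool:
    """Return True if a public post should be forwarded for human review."""
    compound = analyzer.polarity_scores(text)["compound"]
    mentions_watchword = any(word in text.lower() for word in WATCHLIST)
    return mentions_watchword and compound <= NEGATIVITY_THRESHOLD

# Both posts contain a watchword; only the sentiment score separates them,
# which is exactly where critics say context gets lost.
print(flag_post("Going to shoot hoops after school!"))
print(flag_post("Everyone will regret this. I'll get my revenge."))
```

Even this toy version makes the critics' point visible: a keyword list and a numeric threshold, not an understanding of context, decide which students get reported.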

A Wired magazine article titled "Schools Are Mining Students' Social Media Posts for Signs of Trouble" cites Amanda Lenhart, a scholar who notes that research has shown "that it's difficult for adults peering into those online communities from the outside to easily interpret the meaning of content there." She adds that in the case of the new tools being offered to schools and universities, the problem "could be exacerbated by an algorithm that can't possibly understand the context of what it was seeing."

Others have also expressed concerns about the effectiveness of the monitoring programs and about the impact they might have on the relationship between students and administrators. Educational organizations, however, are under pressure to show their communities that they are doing all they can to keep their students safe.

Discussion Questions

Are there some rights that come into conflict in this context? If so, what are they? What is the appropriate balance to strike between them? Why?

Do efforts like the social media monitoring serve the common good? Why, or why not? For a brief explanation of this concept, read “The Common Good.”

Does the fact that the social media posts being analyzed are public impact your analysis of the use of the monitoring technology? If so, in what way(s)?

Should universities not just notify students but also ask them for their input before implementing monitoring of student social media accounts? Why or why not?

Should high schools ask students for their input? Should they ask the students’ parents for consent? Why or why not?

According to The New York Times , a California law requires schools in the state “to notify students and parents if they are even considering a monitoring program. The law also lets students see any information collected about them and tells schools to destroy all data on students once they turn 18 or leave the district.” If all states were to pass similar laws, would that allay concerns you might have had about the monitoring practices otherwise? Why or why not?

Irina Raicu is the director of the Internet Ethics program at the Markkula Center for Applied Ethics.

Photo by AP Images/Seth Wenig



Social Media, Ethics and the Privacy Paradox

Submitted: 11 September 2019 Reviewed: 19 December 2019 Published: 05 February 2020

DOI: 10.5772/intechopen.90906


From the Edited Volume

Security and Privacy From a Legal, Ethical, and Technical Perspective

Edited by Christos Kalloniatis and Carlos Travieso-Gonzalez


Abstract

Today's information/digital age offers widespread use of social media. The use of social media is ubiquitous and cuts across all age groups, social classes and cultures. However, the increased use of these media is accompanied by privacy issues and ethical concerns. These privacy issues can have far-reaching professional, personal and security implications. Ultimate privacy in the social media domain is very difficult because these media are designed for sharing information. Participating in social media requires persons to relax some personal privacy constraints, resulting in some vulnerability. The weak individual privacy safeguards in this space have resulted in unethical and undesirable behaviors and in privacy and security breaches, especially for the most vulnerable group of users. An exploratory study was conducted to examine social media usage and the implications for personal privacy. We investigated how some of the requirements for participating in social media, as well as unethical use of social media, can impact users' privacy. Results indicate that if users of these networks pay attention to privacy settings and the type of information shared, and adhere to universal, fundamental moral values such as mutual respect and kindness, many privacy and ethical issues can be avoided.

Keywords: social media

Author Information

Nadine Barrett-Maitland* and Jenice Lynch

University of Technology, Jamaica, West Indies

*Address all correspondence to: [email protected]

1. Introduction

The use of social media is growing at a rapid pace, and the twenty-first century could be described as the "boom" period for social networking. According to reports provided by Smart Insights, as of February 2019 there were over 3.484 billion social media users. The Smart Insights report indicates that the number of social media users is growing by 9% annually and this trend is estimated to continue. Presently the number of social media users represents 45% of the global population [1]. The heaviest users of social media are "digital natives," the group of persons who were born or have grown up in the digital era and are intimate with the various technologies and systems, and the "Millennial Generation," those who became adults at the turn of the twenty-first century. These groups of users utilize social media platforms for just about anything, ranging from marketing, news acquisition, teaching, health care, civic engagement and politicking to social engagement.

The unethical use of social media has resulted in breaches of individual privacy and impacts both physical and information security. Reports in 2019 [1] reveal that persons between the ages of 8 and 11 years spend an average of 13.5 hours weekly online, and that 18% of this age group are actively engaged on social media. Those between ages 12 and 15 spend on average 20.5 hours online, and 69% of this group are active social media users. While children and teenagers represent the largest Internet user groups, for the most part they do not know how to protect their personal information on the Web and are the most vulnerable to cyber-crimes related to breaches of information privacy [2, 3].

In today's IT-configured society, data is one of the most valuable assets, if not the most valuable, for most businesses/organizations. Organizations and governments collect information via several means, including invisible data gathering, marketing platforms and search engines such as Google [4]. Information can be obtained from several sources and fused using technology to develop complete profiles of individuals. The information on social media is very accessible and can be of great value to individuals and organizations for reasons such as marketing; hence, data is retained by most companies for future use.
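As a concrete illustration of that fusion, the minimal sketch below links records from three invented sources on a shared identifier; every source name and field here is hypothetical.

```python
# Minimal sketch: fusing records from separate sources into one profile.
# Source names, fields and values are invented for illustration.
from collections import defaultdict

sources = {
    "retailer":      {"email": "jdoe@example.com", "purchases": ["fitness tracker"]},
    "social_media":  {"email": "jdoe@example.com", "hometown": "Kingston", "dob": "1990-05-01"},
    "search_engine": {"email": "jdoe@example.com", "recent_queries": ["knee pain remedies"]},
}

# Link records on a shared identifier (here, an email address).
profiles: dict[str, dict] = defaultdict(dict)
for source, record in sources.items():
    profiles[record["email"]].update(record)

print(profiles["jdoe@example.com"])
# One identifier now ties shopping habits, location, age and health interests together.
```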

2. Privacy

Privacy, or the right to enjoy freedom from unauthorized intrusion, is a negative right of all human beings. Privacy is defined as the right to be left alone, to be free from secret surveillance, or unwanted disclosure of personal data or information by government, corporation, or individual (dictionary.com). In this chapter we will define privacy loosely as the right to control access to personal information. Supporters of privacy posit that it is a necessity for human dignity and individuality and a key element in the quest for happiness. According to Baase [5] in the book titled "A Gift of Fire: Social, Legal and Ethical Issues for Computing and the Internet," privacy is the ability to control information about oneself as well as the freedom from surveillance: from being followed, tracked, watched, and eavesdropped on. In this regard, ignoring privacy rights often leads to encroachment on natural rights.

Privacy violations are commonly grouped into four categories:

Intrusion: encroachment (physical or otherwise) on one's liberties/solitude in a highly offensive way.

Private facts: making public, private information about someone that is of no "legitimate concern" to anyone.

False light: making public false and "highly offensive" information about others.

Appropriation: stealing someone's identity (name, likeness) to gain advantage without the permission of the individual.

Technology, the digital age, the Internet and social media have redefined privacy, however, as surveillance is no longer limited to a certain pre-defined space and location. An understanding of the problems and dangers of privacy in the digital space is therefore the first step to privacy control. While there can be clear distinctions between informational privacy and physical privacy, as pointed out earlier, intrusion can be both physical and otherwise.

This chapter will focus on informational privacy, which is the ability to control access to personal information. We examine privacy issues in the social media context, focusing primarily on personal information and the ability to control external influences. We suggest that breach of informational privacy can impact solitude (the right to be left alone), intimacy (the right not to be monitored) and anonymity (the right to have no public personal identity), and, by extension, physical privacy. The right to control access to facts or personal information is, in our view, a natural, inalienable right, and everyone should have control over who sees their personal information and how it is disseminated.

Data protection regimes such as the EU General Data Protection Regulation (GDPR) set out conditions that must be met for consent to the processing of personal data to be valid:

"Freely given—an individual must be given a genuine choice when providing consent and it should generally be unbundled from other terms and conditions (e.g., access to a service should not be conditional upon consent being given)."

“Specific and informed—this means that data subjects should be provided with information as to the identity of the controller(s), the specific purposes, types of processing, as well as being informed of their right to withdraw consent at any time.”

“Explicit and unambiguous—the data subject must clearly express their consent (e.g., by actively ticking a box which confirms they are giving consent—pre-ticked boxes are insufficient).”

“Under 13s—children under the age of 13 cannot provide consent and it is therefore necessary to obtain consent from their parents.”
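To make these conditions concrete, here is a minimal sketch of how a service might record and check them. The field names, the age parameter and the validation rule are illustrative assumptions, not a compliance implementation.

```python
# Minimal sketch: encoding the consent conditions above as a data check.
# Field names and rules are illustrative, not a GDPR implementation.
from dataclasses import dataclass

@dataclass
class ConsentRecord:
    freely_given: bool        # not bundled with other terms or a condition of service
    purposes_disclosed: bool  # controller identity and specific purposes explained
    affirmative_action: bool  # e.g., user actively ticked an unticked box
    user_age: int

def consent_is_valid(c: ConsentRecord, min_age: int = 13) -> bool:
    """Return True only if every condition listed above is satisfied."""
    if c.user_age < min_age:
        return False  # parental consent would be required instead
    return c.freely_given and c.purposes_disclosed and c.affirmative_action

# A pre-ticked box fails the "explicit and unambiguous" condition:
print(consent_is_valid(ConsentRecord(True, True, False, 25)))  # False
```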

Arguments can be made that privacy is a cultural, universal necessity for harmonious relationships among human beings and creates the boundaries for engagement and disengagement. Privacy can also be viewed as an instrumental good because it is a requirement for the development of certain kinds of human relationships, intimacy and trust [7]. However, achieving privacy is much more difficult in light of constant surveillance and the inability to determine the levels of interaction with various publics [7]. Some critics argue that privacy provides protection against anti-social behaviors such as trickery, disinformation and fraud, and is thought to be a universal right [5]. However, privacy can also be viewed as relative, as privacy rules may differ based on several factors such as "climate, religion, technological advancement and political arrangements" [8, 9]. The need for privacy is an objective reality, though it can be viewed as "culturally rational," where the need for personal privacy is seen as relative based on culture.

One example is the push by the government, businesses and Singaporeans to make Singapore a smart nation. According to GovTech 2018 reports, there is a push by the government in Singapore to harness data, the "new gold," to develop systems that can make life easier for its people. The [10] report points out that Singapore is using sensors and robots, such as the Smart Water Assessment Network (SWAN), to monitor water quality in its reservoirs, and is seeking to build a smart health system and a smart transportation system, to name a few. In this example privacy can be described as "culturally rational," and the rules in general could differ based on technological advancement and political arrangements.

In today's networked society it is naïve and ill-conceived to think that privacy is over-rated and that there is no need to be concerned about privacy if you have done nothing wrong [5]. The effects of information flow can be complex and may not be simply about protection for people who have something to hide. Inaccurate information flow can have adverse long-term implications for individuals and companies. Consider a scenario where someone's computer or tablet is stolen. The perpetrator uses identification information stored on the device to access the victim's social media page, which leads to access to their contacts, friends and friends of their "friends." The perpetrator then participates in illegal activities and engages in anti-social activities such as hacking, spreading viruses, fraud and identity theft. The victim is now in danger of being accused of criminal intentions, or worse. These kinds of situations are possible because of technology and networked systems. Users of social media need to be aware of the risks that are associated with participation.

3. Social media

The concept of social networking pre-dates the Internet and mass communication, as people are said to be social creatures who, when working in groups, can achieve results greater than the sum of the parts [11]. The explosive growth in the use of social media over the past decade has made it one of the most popular Internet services in the world, providing new avenues to "see and be seen" [12, 13]. The use of social media has changed the communication landscape, resulting in changes in ethical norms and behavior. The unprecedented level of growth in usage has resulted in the reduction in the use of other media and changes in areas including civic and political engagement, privacy and safety [14]. Alexa, a company that keeps track of traffic on the Web, indicates that as of August 2019, YouTube, Facebook and Twitter are among the top four most visited sites, with only Google, the most popular search engine, surpassing these social media sites.

Social media sites can be described as online services that allow users to create profiles which are “public, semi-public” or both. Users may create individual profiles and/or become a part of a group of people with whom they may be acquainted offline [ 15 ]. They also provide avenues to create virtual friendships. Through these virtual friendships, people may access details about their contacts ranging from personal background information and interests to location. Social networking sites provide various tools to facilitate communication. These include chat rooms, blogs, private messages, public comments, ways of uploading content external to the site and sharing videos and photographs. Social media is therefore drastically changing the way people communicate and form relationships.

Today social media has proven to be one of the most effective media, if not the most effective, for the dissemination of information to various audiences. The power of this medium is phenomenal and ranges from its ability to overturn governments (e.g., Moldova), to mobilize protests, assist with getting support for humanitarian aid, organize political campaigns and organize groups to delay the passing of legislation (as in the case with the copyright bill in Canada), to making social media billionaires and millionaires [16, 17]. The enabling nature and structure of social networking provide a wide range of opportunities that did not exist before this technology; Facebook and YouTube marketers and trainers provide two examples. Today people can interact with and learn from people millions of miles away. The global reach of this medium has removed all formerly pre-defined boundaries: geographical, social and any other that existed previously. Technological advancements such as Web 2.0 and Web 4.0, which provide the framework for collaboration, have given new meaning to life from various perspectives: political, institutional and social.

4. Privacy and social media

Social media and the information/digital era have "redefined" privacy. In today's Information Technology-configured societies, where there is continuous monitoring, privacy has taken on a new meaning. Technologies such as closed-circuit television (CCTV) cameras are prevalent in public spaces and in some private spaces, including our workplaces and homes [7, 18]. Personal computers and devices such as smart phones enabled with Global Positioning System (GPS), geolocation and geo-mapping make privacy as we know it a thing of the past. Recent reports indicate that some of the largest companies, such as Amazon, Microsoft and Facebook, as well as various government agencies, are collecting information without consent and storing it in databases for future use. It is almost impossible to say privacy exists in this digital world (@nowthisnews).

The open nature of social networking sites and the avenues they provide for sharing information in a "public or semi-public" space create privacy concerns by their very construct. Information that is inappropriate for some audiences is many times inadvertently made visible to groups other than those intended and can sometimes result in future negative outcomes. One such example is a well-known case recorded in an article entitled "The Web Means the End of Forgetting," which involved a young woman who was denied her teaching certificate because of backlash from photographs she had posted on social media in a private capacity.

Technology has reduced the gap between professional and personal spaces and often results in information exposure to the wrong audience [ 19 ]. The reduction in the separation of professional and personal spaces can affect image management especially in a professional setting resulting in the erosion of traditional professional image and impression management. Determining the secondary use of personal information and those who have access to this information should be the prerogative of the individual or group to whom the information belongs. However, engaging in social media activities has removed this control.

Privacy on social networking sites (SNSs) is heavily dependent on the users of these networks because sharing information is the primary way of participating in social communities. Privacy in SNSs is “multifaceted.” Users of these platforms are responsible for protecting their information from third-party data collection and managing their personal profiles. However, participants are usually more willing to give personal and more private information in SNSs than anywhere else on the Internet. This can be attributed to the feeling of community, comfort and family that these media provide for the most part. Privacy controls are not the priority of social networking site designers and only a small number of the young adolescent users change the default privacy settings of their accounts [ 20 , 21 ]. This opens the door for breaches especially among the most vulnerable user groups, namely young children, teenagers and the elderly. The nature of social networking sites such as Facebook and Twitter and other social media platforms cause users to re-evaluate and often change their personal privacy standards in order to participate in these social networked communities [ 13 ].

While there are tremendous benefits that can be derived from the effective use of social media, there are some unavoidable risks involved in its use. Much attention should therefore be given to what is shared in these forums. Social platforms such as Facebook, Twitter and YouTube are said to be the most effective media for communicating with Generation Y (Gen Y), as teens and young adults are the largest user groups on these platforms [22]. However, according to Bolton et al. [22], Gen Y's use of social media, if left unabated and unmonitored, will have long-term implications for privacy and engagement in civic activities, as this continuous use is resulting in changes in behavior and social norms as well as increased levels of cyber-crime.

Today social networks are becoming the platform of choice for hackers and other perpetrators of antisocial behavior. These media offer large volumes of data/information, ranging from an individual's date of birth, place of residence and place of work/business to information about family and other personal activities. In many cases users unintentionally disclose information that can be both dangerous and inappropriate. Information regarding activities on social media can have far-reaching negative implications for one's future. A few examples of situations which can, and have been, affected are employment, visa acquisition and college acceptance. Indiscriminate participation has also resulted in situations such as identity theft and bank fraud, to list a few. Protecting privacy in today's networked society can be a great challenge. The digital revolution has indeed distorted our views of privacy; however, there should be clear distinctions between what should be seen by the general public and what should be limited to a selected group. One school of thought is that the only way to have privacy today is not to share information in these networked communities. However, achieving privacy and control over information flows and disclosure in networked communities is an ongoing process in an environment where contexts change quickly and are sometimes blurred. This requires intentional construction of systems that are designed to mitigate privacy issues [13].

5. Ethics and social media

Before posting, social media users should ask themselves questions such as:

Can this post be regarded as oversharing?

Has the information in this post been distorted in anyway?

What impact will this post have on others?

As previously mentioned, users within the ages 8–15 represent one of the largest social media user groups. These young persons within the 8–15 age range are still learning how to interact with the people around them and are deciding on the moral values that they will embrace. These moral values will help to dictate how they will interact with the world around them. The ethical values that guide our interactions are usually formulated from some moral principle taught to us by someone or a group of individuals including parents, guardians, religious groups, and teachers just to name a few. Many of the Gen Y’s/“Digital Babies” are “newbies” yet are required to determine for themselves the level of responsibility they will display when using the varying social media platforms. This includes considering the impact a post will have on their lives and/or the lives of other persons. They must also understand that when they join a social media network, they are joining a community in which certain behavior must be exhibited. Such responsibility requires a much greater level of maturity than can be expected from them at that age.

It is not uncommon for individuals to post even the smallest details of their lives from the moment they wake up to when they go to bed. They will openly share their location, what they eat at every meal or details about activities typically considered private and personal. They will also share likes and dislikes, thoughts and emotional states and for the most part this has become an accepted norm. Often times however, these shares do not only contain information about the person sharing but information about others as well. Many times, these details are shared on several social media platforms as individuals attempt to ensure that all persons within their social circle are kept updated on their activities. With this openness of sharing risks and challenges arise that are often not considered but can have serious impacts. The speed and scale with which social media creates information and makes it available—almost instantaneously—on a global scale, added to the fact that once something is posted there is really no way of truly removing it, should prompt individuals to think of the possible impact a post can have. Unfortunately, more often than not, posts are made without any thought of the far-reaching impact they can have on the lives of the person posting or others that may be implicated by the post.

6. Why do people share?

People share content for several reasons:

  • cause related
  • personal connection to content
  • to feel more involved in the world
  • to define who they are
  • to inform and entertain

People generally share because they believe that what they are sharing is important. It is hoped that the shared content will be deemed important to others which will ultimately result in more shares, likes and followers.

Figure 1 below sums up the findings of Berger and Milkman [25], which show that the main reason people feel the need to share content on the varying social media platforms is that the content relates to what is deemed a worthy cause: 84% of respondents highlighted this as the primary motivation for sharing. Seventy-eight percent said that they share because they feel a personal connection to the content, while 69 and 68%, respectively, said the content either made them feel more involved with the world or helped them to define who they were. Forty-nine percent share because of the entertainment or information value of the content. A more in-depth look at each reason for sharing follows.

Figure 1. Why people share. Source: Global Social Media Research, thesocialmediahat.com [26].

7. Content related to a cause

Social media has provided a platform for people to share their thoughts and express concerns with others about what they regard as a worthy cause. Cause-related posts depend on the interest of the individual; some persons might share posts related to causes and issues happening in society. In one example, the parents of a baby with an aggressive form of leukemia, who had been told that their child had only 3 months to live unless a suitable donor for a blood stem cell transplant could be found, made an appeal on social media. The appeal was quickly shared and a suitable donor was soon found. While that was for a good cause, many view social media merely as a platform for freedom of speech because anyone can post any content one creates. People think the expression of their thoughts on social media regarding any topic is permissible. The problem with this is that the content may not be lawful, or it could violate the rights of someone, thus giving rise to ethical questions.

8. Content with a personal connection

When social media users feel a personal connection to their content, they are more inclined to share it within their social circles. This is true of information regarding family and personal activities. Content created by users also invokes a deep feeling of connection, as it allows users to tell their stories, and it is natural to want the world, or at least friends, to know of an achievement. This need to share content is not new, as humans have been doing it in some form or other, from oral history to the medium of the day: social media. Sharing self-created content gives the user the opportunity to satisfy some fundamental human needs: to be heard, to matter, to be understood and to be emancipated. The problem, however, is that in an effort to gratify these fundamental needs, borders are crossed, because the content may not be sharable (can this content be shared within the share network?), it may not be share-worthy (who is the audience that would appreciate this content?) or it may be out of context (does the content fit the situation?).

9. Content that makes them feel more involved in the world

One of the driving factors that pushes users to share content is the need to feel more in tune with the world around them. This desire is many times fueled by jealousy. Many social media users are jealous when their friends' content gets more attention than their own, and so there is a lot of pressure to maintain one's persona in social circles, even when the information is unrealistic, as long as it gets as much attention as possible. Everything has to be perfect. In the case of a photo, for example, there is lighting, camera angle and background to consider. This need for perfection puts a tremendous amount of pressure on individuals to ensure that posted content is "liked" by friends. They often give very little thought to the amount of work that may have gone on behind the scenes to achieve a friend's perfect social post.

Social media platforms have provided everyone with a forum to express views, but, as a whole, conversations are more polarized, tribal and hostile. With Facebook for instance, there has been a huge uptick in fake news, altered images, dangerous health claims and cures, and the proliferation of anti-science information. This is very distressing and disturbing because people are too willing to share and to believe without doing their due diligence and fact-checking first.

10. Content that defines who they are

Establishing one's individuality in society can be challenging for some persons, because not everyone wants to fit in. Some individuals will do all they can to stand out and be noticed. Social media provides the avenue for exposure, and many individuals will seek to leverage the media to stand out from the crowd and not just be a fish in the school. Many young people today are being brought up in a culture that defines people by their presence on social media, whereas in previous generations persons were taught to define themselves by their career choices. These lessons would start from childhood by asking children what they wanted to be when they grew up and then rewarding them based on the answers they gave [27]. In today's digital era, however, social media postings and the number of "likes" or "dislikes" they attract signal what is appealing to others. Therefore, posts that are similar to those that receive a large number of likes, but which are largely unrealistic, are usually made for self-gratification.

11. Content that informs and entertains

The acquisition of knowledge and skills is a vital part of human survival, and social media has made the process much easier. It is not uncommon to hear someone who lacks a particular skill say, "I need to learn to do this. I'll just YouTube it." Learning and adapting to change as quickly as possible is vital in today's society, and social media, coupled with the Internet, puts it all at one's fingertips. Entertainment has the ability to bring people together and is a good way for people to bond. It provides a diversion from the demands of life and fills leisure time with amusement. Social media is an outlet for the fun, pleasurable and enjoyable activities that are so vital to human survival [28]. It is now commonplace to see people watching videos, viewing images and reading text that amuses them on any of the available platforms. Quite often this material is both informative and entertaining, but it can also cross ethical lines and lead to conflict.

12. Ethical challenges with social media use

The use of modern technology has brought several benefits, and social media is no different: chief among its benefits is the ability to stay connected easily and quickly and to build relationships with people who share one's interests. As with all technology, however, there are challenges that can make the use of social media off-putting and unpleasant. Some of these challenges appear minor, but they can have far-reaching effects on users' lives, so care should be taken to minimize them [29].

A major challenge of social media use is oversharing: when people share, they tend to share as much as possible, which is often too much [24]. When out doing exciting things, it is natural to want to tell the world; many users will post several times a day as they head to lunch, visit a museum, go out to dinner or stop at other places of interest [30]. While this seems relatively harmless, location-based services pinpoint users with surprising accuracy and in real time, so users risk laying out a pattern of movement that can be easily traced. Though this looks like a security or privacy issue, it stems from an ethical dilemma: "Am I sharing too much?" Oversharing can also damage a user's reputation, especially if the intent is to leverage the platform for business [24]. Photos of drunken behavior, drug use, partying or other inappropriate content can change how one is viewed by others.

Another ethical challenge is that users have no way of authenticating content before sharing it, which becomes problematic when that content paints people or organizations negatively. Content is often shared with them by friends, family and colleagues, and this unauthenticated content is reshared without a second thought; sometimes, however, it has been maliciously altered, so the user unknowingly participates in maligning others. Even when content has not been altered, the fact that it paints someone or something in a bad light should prompt the question of whether it is right to share it at all, which is the underlying principle of ethical behavior.

13. Conflicting views

Some of the challenges arising from social media posts result from a lack of understanding, and sometimes a lack of respect, for the varying ethical and moral standpoints of the people involved. We have established that people typically post to social media sites without considering how their posts might affect others, and many such posts cause conflict because of differences of opinion and the effects the posts may have. Each individual has his or her own ethical values, and when these differ, conflict can result [31]. When an executive of a British company made an Instagram post with racial connotations before boarding a plane to South Africa, it started a frenzy that resulted in the executive's immediate dismissal. Although the executive said it was a joke and no prejudice was intended, this difference in views on the implications of the post left an executive out of work and a company scrambling to maintain its public image.

14. Impact on personal development

In this age of sharing, many young people spend vast amounts of time on social media checking on the activities of their "friends" and posting their own so those "friends" know what they are up to. Apart from interfering with academic progress, time spent on these posts can have long-term repercussions. One example is a student at a prominent university who posted pictures of herself having a good time at parties while in school and was later denied employment because of those posts. The ethical challenge here is the question of the applicant's right to privacy and whether an individual's social media profile should affect their perceived ability to fulfill their responsibilities as an employee, but the impact on the individual's long-term personal growth is clear.

15. Conclusion

In today's information age, one's digital footprint can make or break a person; it can be the deciding factor in whether one achieves one's lifelong ambitions. Unethical behavior and interactions on social media can have far-reaching implications, both professionally and socially. Posting on the Internet means the "end of forgetting," so responsible use of this medium is critical. Unethical use of social media has implications for privacy and can result in breaches of security, both physical and virtual. Social media use can also erode privacy, since many platforms require users to provide information they would not otherwise divulge, and it can reveal information that leads to privacy breaches if not managed properly. Educating users about the risks and dangers of exposing sensitive information in this space, and encouraging vigilance in protecting individual privacy on these platforms, is therefore paramount. This could reduce unethical and irresponsible use of these media and foster a more secure social environment. The use of social media should be governed by moral and ethical principles that can be applied universally and that foster harmonious relationships regardless of race, culture, religious persuasion or social status.

Analysis of the literature and the findings of this research suggest that achieving acceptable levels of privacy in a networked system is very difficult and will require considerable effort on the part of individuals. The largest user groups of social media are unaware of the processes required to reduce the vulnerability of their personal data, so educating users about the risks of participating in social media is the social responsibility of these platforms. Adopting universally ethical behaviors can mitigate the rise in privacy breaches in the social networking space. This recommendation coincides with the philosopher Immanuel Kant's assertion that the Biblical principle "Do unto others as you would have them do unto you" can be applied universally and should guide human interactions [5]. If adhered to by users of social media and the owners of these platforms, this principle could raise the awareness of unsuspecting users and reduce the unethical interactions and undesirable incidents that negatively affect privacy, and by extension security, in this domain.

  • 1. Chaffey D. Global Social Media Research. Smart Insights. 2019. Retrieved from: https://www.smartinsights.com/social-media-marketing/social-media-strategy/new-global-social-media-research/
  • 2. SmartSocial. Teen Social Media Statistics (What Parents Need to Know). 2019. Retrieved from: https://smartsocial.com/social-media-statistics/
  • 3. Wisniewski P, Jia H, Xu H, Rosson MB, Carroll JM. Preventative vs. reactive: How parental mediation influences teens' social media privacy behaviors. In: Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work and Social Computing; ACM; 2015. pp. 302-316
  • 4. Chai S, Bagchi-Sen S, Morrell C, Rao HR, Upadhyaya SJ. Internet and online information privacy: An exploratory study of preteens and early teens. IEEE Transactions on Professional Communication. 2009;52(2):167-182
  • 5. Baase S. A Gift of Fire. Upper Saddle River, New Jersey: Pearson Education Limited (Prentice Hall); 2012
  • 6. Richards NM, Solove DJ. Prosser's privacy law: A mixed legacy. California Law Review. 2010;98:1887
  • 7. Johnson DG. Computer ethics. In: The Blackwell Guide to the Philosophy of Computing and Information. Upper Saddle River, New Jersey: Pearson Education (Prentice Hall); 2004. pp. 65-75
  • 8. Cohen JE. What privacy is for. Harvard Law Review. 2012;126:1904
  • 9. Moore AD. Toward informational privacy rights. San Diego Law Review. 2007;44:809
  • 10. GOVTECH. Singapore. 2019. Retrieved from: https://www.tech.gov.sg/products-and-services/smart-nation-sensor-platform/
  • 11. Weaver AC, Morrison BB. Social networking. Computer. 2008;41(2):97-100
  • 12. Boulianne S. Social media use and participation: A meta-analysis of current research. Information, Communication and Society. 2015;18(5):524-538
  • 13. Marwick AE, Boyd D. Networked privacy: How teenagers negotiate context in social media. New Media & Society. 2014;16(7):1051-1067
  • 14. McCay-Peet L, Quan-Haase A. What is social media and what questions can social media research help us answer. In: The SAGE Handbook of Social Media Research Methods. Thousand Oaks, CA: SAGE Publishers; 2017. pp. 13-26
  • 15. Gil de Zúñiga H, Jung N, Valenzuela S. Social media use for news and individuals' social capital, civic engagement and political participation. Journal of Computer-Mediated Communication. 2012;17(3):319-336
  • 16. Ems L. Twitter's place in the tussle: How old power struggles play out on a new stage. Media, Culture and Society. 2014;36(5):720-731
  • 17. Haggart B. Fair copyright for Canada: Lessons for online social movements from the first Canadian Facebook uprising. Canadian Journal of Political Science (Revue canadienne de science politique). 2013;46(4):841-861
  • 18. Andrews LB. I Know Who You Are and I Saw What You Did: Social Networks and the Death of Privacy. Simon and Schuster, Free Press; 2012
  • 19. Echaiz J, Ardenghi JR. Security and online social networks. In: XV Congreso Argentino de Ciencias de la Computación. 2009
  • 20. Barrett-Maitland N, Barclay C, Osei-Bryson KM. Security in social networking services: A value-focused thinking exploration in understanding users' privacy and security concerns. Information Technology for Development. 2016;22(3):464-486
  • 21. Van Der Velden M, El Emam K. "Not all my friends need to know": A qualitative study of teenage patients, privacy, and social media. Journal of the American Medical Informatics Association. 2013;20(1):16-24
  • 22. Bolton RN, Parasuraman A, Hoefnagels A, Migchels N, Kabadayi S, Gruber T, et al. Understanding Generation Y and their use of social media: A review and research agenda. Journal of Service Management. 2013;24(3):245-267
  • 23. Cohn C. Social Media Ethics and Etiquette. CompuKol Communication LLC. 20 March 2010. Retrieved from: https://www.compukol.com/social-media-ethics-and-etiquette/
  • 24. Nates C. The Dangers of Oversharing on Social Media. Pure Moderation. 2018. Retrieved from: https://www.puremoderation.com/single-post/The-Dangers-of-Oversharing-on-Social-Media
  • 25. Berger J, Milkman K. What makes online content go viral. Journal of Marketing Research. 2011;49(2):192-205
  • 26. The Social Media Hat. How to Find Amazing Content for Your Social Media Calendar (And Save Yourself Tons of Work). 29 August 2016. Retrieved from: https://www.thesocialmediahat.com/blog/how-to-find-amazing-content-for-your-social-media-calendar-and-save-yourself-tons-of-work/
  • 27. People First. Does what you do define who you are. 15 September 2012. Retrieved from: https://blog.peoplefirstps.com/connect2lead/what-you-do-define-you
  • 28. Dreyfus E. Does what you do define who you are. Psychologically Speaking. 2010. Retrieved from: https://www.edwarddreyfusbooks.com/psychologically-speaking/does-what-you-do-define-who-you-are/
  • 29. Business Ethics Briefing. The Ethical Challenges and Opportunities of Social Media Use. (Issue 66). 2019. Retrieved from: https://www.ibe.org.uk/userassets/briefings/ibe_social_media_briefing.pdf
  • 30. Staff Writer. The consequences of oversharing on social networks. Reputation Defender. 2018. Retrieved from: https://www.reputationdefender.com/blog/social-media/consequences-oversharing-social-networks
  • 31. Business Ethics Briefing. The Ethical Challenges of Social Media. (Issue 22). 2011. Retrieved from: https://www.ibe.org.uk/userassets/briefings/ibe_briefing_22_the_ethical_challenges_of_social_media.pdf

© 2020 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Social Media Case Law and the Expectation of Privacy: US v. Meregildo

Over the last decade, social media platforms have grown in popularity. Virtually everyone uses some form of social media to connect with friends and family members. From Facebook to Instagram, it is not uncommon for an individual or organization to have a half dozen social media accounts. These social media accounts can offer invaluable insight during litigation. But is that information for public use or is it private? Is this information allowed as evidence in court? 

Social media and the expectation of privacy is a hot topic in courtrooms across the United States. To understand how social media privacy works during litigation, consider the case of United States v. Meregildo.

Understanding the Case

In United States v. Meregildo, the defendant Melvin Colon sought to suppress evidence that the government obtained from his Facebook account through one of his Facebook "friends," who cooperated with law enforcement and gave the government access to view Colon's profile.

The Judge in the case evaluated the evidence in the context of Colon’s privacy settings and his circle of friends. The Judge denied Colon’s motion to suppress and said that his Facebook information was lawfully obtained and useful in the case.  

The Judge emphasized the privacy settings Colon used on his Facebook account. Those settings allowed the cooperating witness, Colon's Facebook "friend," to see the messages he posted. As such, the Judge ruled that accessing this information was not a violation of the Fourth Amendment: because Colon allowed his wide circle of "friends" to view his posts, his expectation of privacy ended when he posted on Facebook.

You can find more about United States v. Meregildo on our Social Media Case Law page.

Social Media and a Reasonable Expectation of Privacy

The Supreme Court has long held that a person has a protected right to a reasonable expectation of privacy, and the Constitution protects this right. When it comes to social media, however, this expectation is not absolute and is frankly almost non-existent, because by design many people can view the information.

Facebook users can control privacy settings and determine who sees their posts and other information. In United States v. Meregildo, Colon posted messages detailing acts of violence and threatening new violence to rival gangs. His Facebook apparently had strict privacy settings allowing only “friends” to view his posts.

This case shows, however, that even a user with strict privacy settings who allows only "friends" to view posts cannot claim that this information is private: "friends" may share the content they can see with whomever they want. In Meregildo, that is what gave the government its access and supported probable cause in a search warrant application.

Have a Social Media Investigation Team On Your Side

Keep in mind that merely having access to online posts is often insufficient as social media evidence. You still need to prove authorship and authenticity, and a screenshot or printout is generally not enough. Law firms and insurance companies should rely on the expertise of a social media investigation team for this type of investigation.

Since 1988, Bosco Legal Services, Inc. has provided valuable services to law firms, insurance companies and businesses. As social media emerged, we tailored our investigations to include valuable online data. That is one reason we are a recognized leader in the field of social media investigations. 

Contact our social media investigators at Bosco Legal Services, Inc. to discuss your situation in greater detail. Based in California, we provide social media investigations nationwide. Call our office today at (877) 353-8281 to discuss your situation. You can also fill out our online contact form, and someone from our office will be in touch with you soon.

The Battle for Digital Privacy Is Reshaping the Internet

As Apple and Google enact privacy changes, businesses are grappling with the fallout, Madison Avenue is fighting back and Facebook has cried foul.

By Brian X. Chen

SAN FRANCISCO — Apple introduced a pop-up window for iPhones in April that asks people for their permission to be tracked by different apps.

Google recently outlined plans to disable a tracking technology in its Chrome web browser.

And Facebook said last month that hundreds of its engineers were working on a new method of showing ads without relying on people’s personal data.

The developments may seem like technical tinkering, but they were connected to something bigger: an intensifying battle over the future of the internet. The struggle has entangled tech titans, upended Madison Avenue and disrupted small businesses. And it heralds a profound shift in how people’s personal information may be used online, with sweeping implications for the ways that businesses make money digitally.

At the center of the tussle is what has been the internet's lifeblood: advertising.

More than 20 years ago, the internet drove an upheaval in the advertising industry. It eviscerated newspapers and magazines that had relied on selling classified and print ads, and threatened to dethrone television advertising as the prime way for marketers to reach large audiences.

Instead, brands splashed their ads across websites, with their promotions often tailored to people's specific interests. Those digital ads powered the growth of Facebook, Google and Twitter, which offered their search and social networking services to people without charge. But in exchange, people were tracked from site to site by technologies such as "cookies," and their personal data was used to target them with relevant marketing.

Now that system, which ballooned into a $350 billion digital ad industry, is being dismantled. Driven by online privacy fears, Apple and Google have started revamping the rules around online data collection. Apple, citing the mantra of privacy, has rolled out tools that block marketers from tracking people. Google, which depends on digital ads, is trying to have it both ways by reinventing the system so it can continue aiming ads at people without exploiting access to their personal data.

Privacy perception and protection on Chinese social media: a case study of WeChat

  • Original Paper
  • Published: 05 September 2018
  • Volume 20, pages 279–289 (2018)

By Zhen Troy Chen (Xi′an Jiaotong-Liverpool University and University of Nottingham-Ningbo) and Ming Cheung (Griffith University; ORCID: orcid.org/0000-0001-6711-9290)

In this study, the under-examined area of privacy perception and protection on Chinese social media is investigated. The prevalence of digital technology shapes the social, political and cultural lives of urban young adults. The influential Chinese social media platform WeChat is taken as a case study, and the ease of connection, communication and transaction it offers, combined with issues of commercialisation and surveillance, is discussed in the framework of the privacy paradox. Protective behaviour and tactics are examined through different perceptions of privacy in the digital age. The findings suggest that users possess a certain amount of freedom on WeChat; however, their privacy attitudes and behaviour in practice reveal a diminished sense of their own freedom and right to privacy. A privacy paradox exists: users, while holding a high level of concern, in reality do little to protect their personal information on WeChat. We argue that once users have embedded part of their social engagement within the WeChat system, the incentive to remain in the system outweighs the imperative to secure their privacy online, as their decision-making is largely based on a simple cost-benefit analysis. The power and social capital accrued via WeChat are too valuable to give up, as WeChat is widely used not only for private conversations but also for study- or work-related purposes. It further blurs the boundaries between the public, the professional and the private, which makes it a rather unique case compared with other social media around the world.


A Survey on Privacy in Social Media: Identification, Mitigation, and Applications

ACM Trans. Data Sci., Vol. 1, No. 1, Article 7, Publication date: January 2020. DOI: https://doi.org/10.1145/3343038

The increasing popularity of social media has attracted huge numbers of people who participate in numerous activities on a daily basis, generating tremendous amounts of rich user-generated data. These data provide opportunities for researchers and service providers to study and better understand users' behaviors and to further improve the quality of personalized services. Publishing user-generated data, however, risks exposing individuals' privacy. User privacy in social media is an emerging research area that has attracted increasing attention recently. Existing works study privacy issues in social media from two points of view: identification of vulnerabilities and mitigation of privacy risks. Recent research has shown the vulnerability of user-generated data to two general types of attacks, identity disclosure and attribute disclosure. These privacy issues mandate that social media data publishers protect users' privacy by sanitizing user-generated data before publishing it. Consequently, various protection techniques have been proposed to anonymize user-generated social media data. There is vast literature on the privacy of social media users from many perspectives. In this survey, we review the key achievements in user privacy in social media. In particular, we review and compare state-of-the-art algorithms in terms of privacy leakage attacks and anonymization algorithms. We overview privacy risks from different aspects of social media and categorize the relevant works into five groups: (1) social graphs and privacy, (2) authors in social media and privacy, (3) profile attributes and privacy, (4) location and privacy, and (5) recommendation systems and privacy. We also discuss open problems and future research directions regarding user privacy issues in social media.

ACM Reference format: Ghazaleh Beigi and Huan Liu. 2019. A Survey on Privacy in Social Media: Identification, Mitigation, and Applications. ACM Trans. Data Sci. 1, 1, Article 7 (January 2020), 38 pages. https://doi.org/10.1145/3343038

1 INTRODUCTION

The explosive Web growth of the past decade has drastically changed the way billions of people all around the globe conduct numerous activities, such as surfing the web, creating online profiles on social media platforms, interacting with other people, and sharing posts and various personal information in a rich environment. This results in tremendous amounts of user-generated data. The massive amount of user information and the availability of up-to-date data make social media platforms an attractive target for organizations seeking to collect and aggregate this information, whether for legitimate purposes or nefarious goals [35]. For example, user-generated data provide opportunities for researchers and business partners to study and understand individuals at unprecedented scales [19, 28]. This information is also crucial for online vendors to provide personalized services, and a lack of it results in deteriorating quality of online personalization services [23].

On the other hand, tremendous amounts of user-generated data risk exposing individuals' privacy because of their richness of content, including a user's relationships and other private information [22, 26, 85, 140]. These data also make online users traceable, and accordingly users become severely vulnerable to potential risks ranging from persecution by governments to targeted fraud. For example, users may share their vacation plans publicly on Twitter without knowing that this information could be used by adversaries for break-ins and thefts [124, 191]. Moreover, sensitive information that users do not explicitly disclose can easily be inferred from their activities in social media, such as location [109, 123], age [178], and trust/distrust relationships [27, 29, 30].

Privacy issues become prominent when the data are published by a data publisher or service provider. In general, two types of information disclosure have been identified in the literature: identity disclosure and attribute disclosure attacks [51, 103, 107]. Identity disclosure occurs when an individual is mapped to an instance in a released dataset. Attribute disclosure happens when the adversary can infer new information about an individual from the released data; it becomes more probable when identities are accurately disclosed. Similarly, privacy leakage attacks in social media can be categorized as either identity disclosure or attribute disclosure. These privacy issues mandate that social media data publishers protect users' privacy by sanitizing user-generated data before publishing it publicly.

Data anonymization is a complex problem whose goal is to remove or perturb data to prevent adversaries from inferring sensitive information while ensuring the utility of the published data. One straightforward anonymization technique is to remove "Personally Identifiable Information" (a.k.a. PII) such as names, user IDs, age, and location information. This solution has been shown to be far from sufficient for preserving privacy [19, 139]. An example of this insufficient approach is the anonymized dataset published for the Netflix prize challenge. As part of the contest, Netflix publicly released a dataset containing the movie ratings of 500,000 subscribers. The data were supposed to be anonymized, and all PII was removed. Narayanan et al. [139] propose a de-anonymization attack that maps users' records in the anonymized dataset to corresponding profiles on IMDB. In particular, the results of this work show that the structure of the data carries enough information to re-identify anonymized users, a potential breach of privacy.
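
To make the Netflix example concrete, here is a minimal Python sketch of the kind of similarity-scoring linkage such an attack relies on. The record layout, scoring weights, and the separation check are illustrative assumptions, not the actual algorithm of Narayanan et al. [139]:

```python
# Illustrative sketch of a linkage attack in the spirit of the Netflix
# de-anonymization: score each anonymized record against an auxiliary
# (IMDB-like) profile and accept the best match only if it is clearly
# separated from the runner-up. Field layout and weights are hypothetical.

def match_score(aux_ratings, anon_ratings, date_tolerance_days=14):
    """Both arguments map movie -> (rating, day). Every shared movie
    scores a point, with bonuses for matching ratings and nearby dates."""
    score = 0.0
    for movie, (rating, day) in aux_ratings.items():
        if movie in anon_ratings:
            anon_rating, anon_day = anon_ratings[movie]
            score += 1.0
            if rating == anon_rating:
                score += 0.5
            if abs(day - anon_day) <= date_tolerance_days:
                score += 0.5
    return score

def deanonymize(aux_ratings, anonymized_db, margin=1.5):
    """Return the best-matching record id, or None when the best score is
    not well separated from the second best (an 'eccentricity'-style check)."""
    scored = sorted(((match_score(aux_ratings, rec), rid)
                     for rid, rec in anonymized_db.items()), reverse=True)
    if not scored:
        return None
    if len(scored) >= 2 and scored[0][0] < margin * scored[1][0]:
        return None  # ambiguous: attacker abstains rather than guess
    return scored[0][1]
```

In practice, rare movies would carry more weight than popular ones, since they identify a subscriber far more strongly; the uniform weights above are a simplification.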

Consequently, various protection techniques have been proposed to anonymize user-generated social media data. In general, the ultimate goal of an anonymization approach is to preserve social media users' privacy while ensuring the utility of the published data. As a counterpart to this research direction, another group of works investigates potential privacy breaches from social media user data by introducing new attacks. These works find the gaps in anonymizing user-generated data and drive further improvement of anonymization techniques.

There is vast literature on the privacy of social media users from many perspectives. Existing works cover three applications of social media, i.e., making connections with people, sharing contextual information, and receiving personalized services. Users also generate various types of data, including graph data, textual data, spatiotemporal data, and profile attribute data. This yields 12 application and data type combinations. We categorize existing works into five distinct categories to cover the different combinations: (1) social graphs and privacy, (2) authors in social media and privacy, (3) profile attributes and privacy, (4) location and privacy, and (5) recommendation systems and privacy. Table 1 shows how each category covers different combinations of applications and data types. The goal of this article is to provide a comprehensive review of existing works on user privacy issues and solutions in social media and to give guidance on future research directions. The contributions of this survey are summarized as follows:

  • We give an overview of the traditional privacy models for structured data and discuss how these models are adapted for privacy issues in social media. We formally define two types of privacy leakage disclosure that cover most of the existing definitions in the literature.
  • We categorize privacy issues and solutions on social media into different groups: (1) social graphs and privacy, (2) authors in social media and privacy, (3) profile attributes and privacy, (4) location and privacy, and (5) recommendation systems and privacy. We overview existing works in each group, organizing representative methods into categories in a principled way.
  • We discuss several open issues and provide future directions for privacy in social media.

The remainder of this survey is organized as follows. In Section 2, we present an overview of traditional methods and formally define two types of privacy disclosure. In Section 3, we review state-of-the-art methods for the privacy of social media graphs: Section 3.1 covers de-anonymization attacks on social graphs, and Section 3.2 covers anonymization techniques proposed for preserving the privacy of graph data against such attacks. We review author identification works in Section 4. In Sections 5 and 6, we overview state-of-the-art de-anonymization techniques for inferring users' profile attributes and location information. In Section 7, privacy issues and solutions in recommendation systems are reviewed. Finally, we conclude this article in Section 8 by discussing open issues and future directions.

2 TRADITIONAL PRIVACY MODELS

Privacy-preserving techniques were first introduced for tabular and micro data. With the emergence of social media, the issue of online user privacy was raised, and researchers began to study privacy leakage issues as well as anonymization and privacy-preserving techniques specialized for social media data. Two types of information disclosure appear in the literature: identity disclosure and attribute disclosure attacks [51, 103, 107]. We can formally define these attacks as follows:

Definition 2.1 (Identity Disclosure Attack). Given $T = (\mathbf {G}, \mathbf {A}, \mathbf {B})$, a snapshot of a social media platform with a social graph $\mathbf {G} =(V,E)$, where $V$ is the set of users and $E$ represents the social relations between them, user behavior information $\mathbf {A}$, and attribute information $\mathbf {B}$, the identity disclosure attack is to map all users in a list of target users $V_t$ to their known identities. For each $v \in V_t$, the attacker has the information of her social friends and behavior.

Definition 2.2 (Attribute Disclosure Attack). Given $T = (\mathbf {G}, \mathbf {A}, \mathbf {B})$, a snapshot of a social media platform with a social graph $\mathbf {G} =(V,E)$, where $V$ is the set of users and $E$ represents the social relations between them, user behavior information $\mathbf {A}$, and attribute information $\mathbf {B}$, the attribute disclosure attack is to infer the attributes $a_v$ for all $v \in V_t$, where $V_t$ is a list of targeted users. For each $v \in V_t$, the attacker has the information of her social friends and behavior.

Network graph de-anonymization and author identification are examples of identity disclosure attacks that exist in social media. Examples of attribute disclosure attacks include the disclosure of users' profile attributes, location, and preference information in recommendation systems.

Before we discuss privacy leakage in social media, we overview the traditional privacy models for structured data: $k$-anonymity [171], $l$-diversity [119], $t$-closeness [107], and differential privacy [52]. These models are defined over structured databases and cannot be applied directly to unstructured user-generated data, because quasi-identifiers and sensitive attributes are not clear in the context of social media data. These techniques have been adapted for social media data, as we discuss in the next sections. Finally, we discuss related work and highlight the differences between this survey and others in the existing literature.

2.1 k-anonymity, l-diversity, and t-closeness

$k$-anonymity was one of the first techniques introduced for protecting data privacy [171]. The aim of $k$-anonymity is to anonymize each instance in the dataset so that it is indistinguishable from at least $k-1$ other instances with respect to certain identifying attributes. $k$-anonymity can be achieved through suppression or generalization of data instances. The goal is to anonymize the data such that $k$-anonymity is preserved for all instances with a minimum number of generalizations and suppressions, thereby maximizing the utility of the resulting data; this problem has been shown to be NP-hard [4]. $k$-anonymity was initially defined for tabular data, but researchers later adapted it to privacy problems in social media. In social media settings, $k$-anonymity ensures that a user cannot be identified because at least $k-1$ other users share the same set of features, making these $k$ users indistinguishable. These features may include users' attributes and structural properties.
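
As a concrete illustration, the following sketch checks whether a toy table is $k$-anonymous over its quasi-identifiers and applies one generalization step. The column names and the decade-wide age bins are invented for this example; real anonymizers search over generalization/suppression lattices to minimize information loss:

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs in at
    least k records, i.e., each record hides in a group of size >= k."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

def generalize_age(records, width=10):
    """One generalization step: replace exact ages with decade-wide bins."""
    out = []
    for r in records:
        r = dict(r)
        lo = (r["age"] // width) * width
        r["age"] = f"{lo}-{lo + width - 1}"
        out.append(r)
    return out

records = [
    {"age": 23, "zip": "10001", "disease": "flu"},
    {"age": 27, "zip": "10001", "disease": "cold"},
    {"age": 26, "zip": "10001", "disease": "flu"},
]
print(is_k_anonymous(records, ["age", "zip"], k=3))                  # False
print(is_k_anonymous(generalize_age(records), ["age", "zip"], k=3))  # True
```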

Although $k$-anonymity is among the first techniques proposed for protecting the privacy of datasets, it is vulnerable to specific types of privacy leakage. Machanavajjhala et al. [119] introduce two simple attacks that defeat $k$-anonymity. The first is the homogeneity attack, in which the adversary can infer an instance's (in this case, a social media user's) sensitive attributes when the sensitive values in an equivalence class lack diversity. In the second, known as the background knowledge attack, the adversary can infer an instance's sensitive attributes using background knowledge, even when the data are $k$-anonymized. Variations of background knowledge attacks have been proposed and used to infer social media users' attributes; the background knowledge may be information about users' friends or behavior. We discuss different types of attribute inference attacks in Sections 6 and 7.

To protect data against homogeneity and background knowledge attacks, Machanavajjhala et al. [119] introduce the concept of $l$-diversity, which ensures that the sensitive attribute values in each equivalence class are diverse. More formally, a set of records in an equivalence class is $l$-diverse if the class contains at least $l$ well-represented values for the sensitive attributes; the dataset is $l$-diverse if every class is $l$-diverse. Two instantiations of the $l$-diversity concept are entropy $l$-diversity and recursive $(c,l)$-diversity. With entropy $l$-diversity, each equivalence class must not only contain enough distinct sensitive values, but those values must also be distributed evenly enough: formally, the entropy of the distribution of sensitive values in each equivalence class must be at least $\log (l)$. With recursive $(c,l)$-diversity, the most frequent value must not appear too frequently relative to the remaining values. Interested readers can refer to Reference [119] for more details.
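
A minimal sketch of the entropy $l$-diversity check described above, applied to the sensitive values of a single equivalence class (the disease names are hypothetical):

```python
import math
from collections import Counter

def is_entropy_l_diverse(sensitive_values, l):
    """Entropy l-diversity for a single equivalence class: the entropy of
    its sensitive-value distribution must be at least log(l)."""
    counts = Counter(sensitive_values)
    total = len(sensitive_values)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy >= math.log(l)

# One dominant value fails even though three distinct values appear:
print(is_entropy_l_diverse(["flu"] * 8 + ["cold", "hiv"], 3))         # False
# Four evenly distributed values have entropy log(4) > log(3):
print(is_entropy_l_diverse(["flu", "cold", "hiv", "asthma"] * 2, 3))  # True
```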

Following $l$-diversity, Li et al. [107] study its vulnerabilities and introduce a new privacy concept, $t$-closeness. They show that $l$-diversity cannot protect the privacy of data when the distribution of sensitive attributes in an equivalence class differs from the distribution in the whole dataset. If the distribution of sensitive attributes is skewed, then $l$-diversity presents a serious privacy risk; this is known as the skewness attack. $l$-diversity is also vulnerable to similarity attacks, which can happen when the sensitive attributes in an equivalence class are distinct but semantically similar [107]. Li et al. [107] thus introduce $t$-closeness, which ensures that the distribution of a sensitive attribute in any equivalence class is close to its distribution in the overall table. More formally, an equivalence class satisfies $t$-closeness if the distance between the distribution of a sensitive attribute in the class and its distribution in the whole dataset is no more than a threshold $t$; the whole dataset has $t$-closeness if all equivalence classes do. It is worth noting that $t$-closeness protects the data against attribute disclosure but not identity disclosure.
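
The following sketch applies the same idea for $t$-closeness. Li et al. [107] measure the distance with the Earth Mover's Distance; for simplicity this example uses total variation distance over categorical values as a stand-in, so the numbers are illustrative only:

```python
from collections import Counter

def distribution(values):
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}

def total_variation(p, q):
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def satisfies_t_closeness(class_values, table_values, t):
    """One equivalence class satisfies t-closeness if the distance between
    its sensitive-value distribution and the whole table's is at most t."""
    return total_variation(distribution(class_values),
                           distribution(table_values)) <= t

table = ["flu"] * 50 + ["cold"] * 40 + ["hiv"] * 10
print(satisfies_t_closeness(["flu", "flu", "cold"], table, t=0.2))  # True
print(satisfies_t_closeness(["hiv", "hiv", "hiv"], table, t=0.2))   # False
```

The second class fails because, although it may be perfectly diverse within itself, its all-"hiv" distribution is far from the table-wide distribution, which is exactly the skew that $t$-closeness is designed to catch.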

$k$-anonymity, $l$-diversity, and $t$-closeness have been further adapted for unstructured social media data. Table 2 summarizes approaches that leverage adapted versions of these techniques for privacy problems in social media; these works are discussed in the following sections.

Technique and type of information anonymized:
  • $k$-degree anonymity: graph structure
  • $k$-neighborhood anonymity: graph structure
  • $k$-automorphism: graph structure
  • $k$-isomorphism: graph structure
  • $k$-anonymity: graph structure and attribute information
  • $(\theta ,k)$-matching anonymity: graph structure and attribute information
  • $(k,d)$-anonymity: graph structure and attribute information
  • $l$-diversity: attribute information
  • $t$-closeness: attribute information

2.2 Differential Privacy

Differential privacy is a powerful technique that protects a user's privacy during statistical queries over a database by minimizing the chance of privacy leakage while maximizing the accuracy of the query results. Introduced by Dwork et al. [52, 53], it provides a strong privacy guarantee. The intuition behind differential privacy is that the risk to a user's privacy should not increase as a result of participating in a database [52]. In particular, it imposes a guarantee on the data release mechanism rather than on the dataset itself. The privacy risk is evaluated according to the existence or absence of an instance in the database: differential privacy assumes that data instances are independent of each other and guarantees that the presence of an instance does not pose a threat to its privacy, since the statistical information in the data would not change significantly compared with the case where the instance is absent [52, 53]. This way, the adversary cannot infer whether an instance is in the database, or which record is associated with it [92].

Definition 2.3 (Differential Privacy). Given a query function $f(.)$, a mechanism $K(.)$ with an output range $\mathcal {R}$ satisfies $\epsilon$-differential privacy iff, for all datasets $\mathcal {D}_1$ and $\mathcal {D}_2$ differing in at most one element and for all $S \subseteq \mathcal {R}$:

$$\Pr [K(f(\mathcal {D}_1)) \in S] \le e^{\epsilon } \cdot \Pr [K(f(\mathcal {D}_2)) \in S]. \tag{1}$$

Here $\epsilon$ is called the privacy budget. Large values of $\epsilon$ (e.g., 10) result in a large $e^{\epsilon }$ and indicate that a large output difference can be tolerated, and hence a large privacy loss: the adversary can infer changes in the database from large changes in the query function $f(.)$. Small values of $\epsilon$ (e.g., 0.1), on the other hand, indicate that only a small privacy loss is tolerated. The query function $f(.)$ can be thought of as a request for the value of a random variable, and the mechanism $K(.)$ is a randomized function, i.e., an algorithm that returns the result of the query function, possibly with some noise. To make this concrete, assume we have a dataset containing every patient's information. An example query function $f(.)$ is the question, "How many people have disease $x$?" The mechanism $K(.)$ can be any algorithm that answers this question. The output range $\mathcal {R}$ for $K(.)$ in this example is $\mathcal {R} = \lbrace 1,2,\ldots,n\rbrace$, where $n$ is the total number of patients in the dataset.

Differential privacy models can be either interactive or non-interactive. Assume the data consumer executes a number of statistical queries on the same dataset. In interactive models, the data publisher responds to the consumer with $K(f(\mathcal {D}))$, where $K(.)$ perturbs the query results to achieve the privacy guarantees. In non-interactive models, the data publisher designs a mechanism $K(.)$ that transforms the original data $\mathcal {D}$ into a new anonymized dataset $\mathcal {D}^{\prime } = K(f(\mathcal {D}))$. The perturbed data $\mathcal {D}^{\prime }$ are then returned to the consumer, ready for arbitrary statistical queries.

A common way of achieving differential privacy is by adding random noise, e.g., Laplacian or exponential noise, to the query answers [52]. The Laplace mechanism is a popular technique for providing $\epsilon$-differential privacy by adding noise drawn from the Laplace distribution. Since $\epsilon$-differential privacy is defined over the query function and holds for all datasets according to Equation (1), the amount of added noise depends only on the sensitivity of the query function, defined as:

$$\Delta (f) = \max _{\mathcal {D}_1, \mathcal {D}_2} \Vert f(\mathcal {D}_1) - f(\mathcal {D}_2)\Vert _1, \tag{2}$$

where the maximum is taken over all pairs of datasets $\mathcal {D}_1, \mathcal {D}_2$ differing in at most one element.

The added Laplacian noise is drawn from $Lap(\Delta (f)/\epsilon)$, with density proportional to $e^{-|x|\epsilon /\Delta (f)}$, and the released result under the differential privacy constraint is $K(f(\mathcal {D})) = f(\mathcal {D}) + Y$, where $Y\sim Lap(\Delta (f)/\epsilon)$. The mechanism $K(.)$ works best when $\Delta (f)$ is small, as it then introduces the least noise. The larger the sensitivity of a query, the more noise must be added, since removing any single instance from the dataset changes the output of the query more. The sensitivity essentially captures how great a difference (between the values of $f(.)$ on two datasets differing in a single element) must be hidden by the additive noise generated by the data publisher. Note that recent studies show that dependency between instances in a dataset hurts the differential privacy guarantees [92, 113].
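
A minimal sketch of the Laplace mechanism for the counting-query example above ("How many people have disease $x$?"). The patient table is hypothetical, and NumPy's Laplace sampler stands in for a production-grade noise source:

```python
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon):
    """Release a numeric query answer with epsilon-differential privacy by
    adding Laplace(0, Delta(f)/epsilon) noise, with Delta(f) as in Eq. (2)."""
    return true_answer + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# Hypothetical patient table for the "how many have disease x?" query.
patients = ["flu", "cold", "flu", "hiv", "flu"]
true_count = sum(1 for d in patients if d == "flu")

# A counting query has sensitivity 1: adding or removing one patient
# changes the count by at most 1. Smaller epsilon => noisier answers.
print(laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5))
```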

There also exists a relaxed version of $\epsilon$-differential privacy, known as $(\epsilon , \delta)$-differential privacy, developed to deal with very unlikely outputs of $K(.)$ [52, 53]. It is defined as follows:

Definition 2.4 (($\epsilon , \delta$)-Differential Privacy). Given a query function $f(.)$, a mechanism $K(.)$ with an output range $\mathcal {R}$ satisfies $(\epsilon , \delta)$-differential privacy iff, for all datasets $\mathcal {D}_1$ and $\mathcal {D}_2$ differing in at most one element and for all $S \subseteq \mathcal {R}$:

$$\Pr [K(f(\mathcal {D}_1)) \in S] \le e^{\epsilon } \cdot \Pr [K(f(\mathcal {D}_2)) \in S] + \delta. \tag{3}$$
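
One standard way to achieve this relaxed guarantee, though not elaborated in this survey excerpt, is the Gaussian mechanism. The sketch below uses the classical calibration $\sigma = \Delta _2(f)\sqrt{2\ln (1.25/\delta)}/\epsilon$ (Dwork and Roth), which suffices for $0 < \epsilon < 1$ and uses the L2 rather than the L1 sensitivity:

```python
import math
import numpy as np

def gaussian_mechanism(true_answer, l2_sensitivity, epsilon, delta):
    """(epsilon, delta)-differentially private release via Gaussian noise,
    calibrated with sigma = Delta2(f) * sqrt(2 ln(1.25/delta)) / epsilon."""
    sigma = l2_sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon
    return true_answer + np.random.normal(loc=0.0, scale=sigma)

# The smaller the failure probability delta, the more noise is needed.
print(gaussian_mechanism(42.0, l2_sensitivity=1.0, epsilon=0.5, delta=1e-5))
```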

Table 3 summarizes works that utilize differential privacy for social media data; all are discussed later in this survey.

Type of information protected with differential privacy:
  • graph structure
  • recommender systems
  • textual data

2.3 Related Work

There are multiple surveys related to data privacy and privacy-preserving approaches [1, 5, 54, 59, 82, 86, 159, 165, 176, 193]. Fung et al. [59] review privacy-preserving data publishing methods for relational data, such as $k$-anonymity, $l$-diversity, $t$-closeness and their variations, comparing them in terms of privacy models, anonymization algorithms, and information metrics. Zheleva et al. [193] review the concepts of privacy issues in tabular data and introduce new privacy risks in graph data. Multiple surveys focus on graph data privacy risks [1, 82, 86, 165]. Sharma et al. [165] were among the first to review $k$-anonymity and randomization-based techniques for anonymizing graph data. Another overview, by Abawajy et al. [1], presents the threat model for graph data and classifies the background knowledge used by adversaries to breach users' privacy; it also reviews and classifies state-of-the-art approaches for anonymizing graph data. Ji et al. [82, 86] conducted surveys on graph data anonymization, de-anonymization attacks, and de-anonymizability quantification. Another way of sanitizing data is to provide algorithms that are provably privacy-preserving and ensure that no sensitive information leaks from the data [193]. A thorough survey [176] studies different privacy-preserving data mining approaches. Agrawal et al. [5] propose algorithms that perturb data values by adding random noise to them, and another set of works focuses on developing privacy-preserving association rule mining to minimize privacy loss [54, 159].

In this work, we go one step further and review all aspects of social media data that can lead to privacy leakage. Social media data are highly unstructured and noisy and inherently different from relational and tabular data; therefore, dedicated approaches have been designed to study privacy risks in the context of user-generated data on social media platforms. Unlike previous works, we not only review state-of-the-art and recent approaches to social graph anonymization and de-anonymization, but also survey other attribute and identity disclosure attacks that can be performed on other aspects of user-generated social media data. In addition, we overview and summarize approaches that leak users' profile attributes and location information by exploiting their other online activities, and we survey author identification techniques that incorporate various pieces of user-generated information, such as user profiles and textual posts, to re-identify users. We also cover recent works on privacy leakage in social media that are not covered by Zheleva et al. [193], and we include many new techniques related to the privacy of social graphs that are not included in previous surveys [1, 82, 86, 165].

In summary, to the best of our knowledge, this is the first and most comprehensive work that systematically surveys and analyzes the advances of research on privacy issues in social media.

3 SOCIAL GRAPHS AND PRIVACY

A large amount of the data generated by users in social media platforms has graph structure. Friendship and follower/followee relations, mobility traces (e.g., WiFi contacts, Instant Message contacts), and spatio-temporal data (latitude, longitude, timestamps) can all be modeled as graphs. This mandates paying attention to the privacy issues of graph data. We first overview graph de-anonymization works and then survey the proposed solutions for anonymizing graph data.

3.1 Graph De-anonymization

The work of Backstrom et al. [ 19 ] was among the first to study the privacy breach problem related to a social network's graph structure. De-anonymization attacks can be categorized as either seed-based or seed-free according to whether pre-annotated seed users exist. Seed users are those whose identities are known to the attacker. Backstrom et al. [ 19 ] is among the first seed-based approaches. This work introduces both active and passive attacks on anonymized social networks. In active attacks, the adversary creates $k$ new user accounts (a.k.a. Sybils) and links them to a set of predefined target nodes before the anonymized graph is produced. The adversary then links these new accounts together to create a subgraph $H$ . After the anonymized graph is published, the attacker looks for the subgraph $H$ and then locates and re-identifies the targeted nodes in the published graph. The main challenge here is that the subgraph $H$ should be unique enough to be found efficiently. In passive attacks, the adversary is an internal user of the system and no new account is created. The attacker then de-anonymizes the users connected to him after the graph data is released. This attack is susceptible to Sybil defense approaches [ 8 ] and wrongly assumes that attackers can always change the network before its release.

Another work, from Narayanan et al. [ 140 ], introduces an improved attack that needs neither compromised accounts nor Sybil users. This work assumes that the attacker has access to a different network whose membership overlaps with that of the original anonymized network. This auxiliary graph is also known as background or auxiliary graph knowledge. It also assumes that the attacker has the information of a small set of users, i.e., seed users, who are present in both networks. Narayanan et al. [ 140 ] discuss different ways of collecting background knowledge. For example, if the attacker is a friend of a portion of the targeted users, then he or she knows all the details about them [ 98 , 170 ]. Another approach is paying a set of users to reveal information about themselves and their friends [ 106 ]. Crawling data via social media APIs or using compromised accounts, as discussed for the active attack, are other approaches for gathering background knowledge. The social graph de-anonymization attack in social media can then be formally defined as:

Definition 3.1 (Social Graph De-anonymization Attack [ 57 , 140 ]). Given an auxiliary/background graph $G_1 = (V_1, E_1)$ and a target anonymized graph $G_2 = (V_2, E_2)$ , the goal of de-anonymization is to find as many accurate identity disclosures as possible in the form of one-to-one mappings. An identity disclosure indicates that two nodes $i \in V_1$ and $j \in V_2$ actually correspond to the same user.

3.1.1 Seed-based De-anonymization. Seed-based de-anonymization approaches have two main steps. In the first step, a set of seed users is mapped from the anonymized graph to the background/auxiliary knowledge graph and thereby re-identified. In the second step, the mapping and de-anonymization are propagated from the seed users to the remaining unidentified users. Following this template, the work of Narayanan et al. [ 140 ] starts by re-identifying seed users in the anonymized and auxiliary graphs. Then, other users are re-identified by propagating mappings based on the seed user pairs. Structural information such as a user's degree, a user's eccentricity, and edge directionality is used to heuristically measure the strength of a match between users. A straightforward application of this de-anonymization attack with fewer heuristics is predicting links between users [ 138 ].
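To make the propagation step concrete, the following Python sketch illustrates the general seed-propagation idea in the spirit of Narayanan et al. [ 140 ]; it is our simplified illustration, not the paper's code, and the eccentricity threshold theta and dict-based graph representation are assumptions.

import statistics

def eccentricity(scores):
    # Gap between the best and second-best score, in standard deviations.
    vals = sorted(scores.values(), reverse=True)
    if len(vals) == 1:
        return float("inf")  # a single candidate is unambiguous
    sd = statistics.pstdev(vals)
    return (vals[0] - vals[1]) / sd if sd > 0 else 0.0

def propagate(g_aux, g_anon, seed_map, theta=0.5):
    # g_aux, g_anon: adjacency dicts; seed_map: anonymized node -> auxiliary node.
    mapping = dict(seed_map)
    changed = True
    while changed:
        changed = False
        for v in g_anon:
            if v in mapping:
                continue
            # Score each auxiliary candidate by its number of mapped common neighbors.
            scores = {}
            for n in g_anon[v]:
                if n in mapping:
                    for cand in g_aux[mapping[n]]:
                        scores[cand] = scores.get(cand, 0) + 1
            # Accept the best candidate only if it clearly stands out.
            if scores and eccentricity(scores) >= theta:
                mapping[v] = max(scores, key=scores.get)
                changed = True
    return mapping

The eccentricity test mirrors the paper's intuition that a mapping should only be accepted when one candidate dominates the alternatives; otherwise the node is revisited in a later pass.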

Yartseva et al. [ 185 ] propose a percolation-based de-anonymization approach that maps every pair of users in the two graphs (the background knowledge graph and the anonymized graph) that has more than $k$ neighboring mapped pairs. The only parameter of this approach is $k$ , a predefined mapping threshold; the approach does not require a minimum number of users in the seed set. Another similar work, from Korula et al. [ 99 ], proposes a parallelizable percolation-based attack with provable guarantees. It again starts with a set of seed users who are previously mapped and then propagates the mapping to the remaining network. Two users are mapped if they have a specific number of mapped neighbors. Their approach is robust to malicious users and fake social relationships in the network.

In another work, Nilizadeh et al. [ 142 ] propose a community-based de-anonymization attack using the idea of divide-and-conquer. Community detection has been extensively studied in the literature on social network analysis [ 12 , 184 ] and has been used in a variety of tasks such as trust prediction [ 24 ] and guild membership prediction [ 13 , 69 ]. In this work, the attacker first partitions both graphs (i.e., the anonymized and knowledge graphs) into multiple communities. The attack then maps communities by creating a network of communities in both graphs. Users within mapped communities are then re-identified and matched together, and the mappings are propagated to re-identify the remaining users. This attack uses heuristics similar to those of [ 140 ] to measure the mapping strength between users.

Ji et al. [ 80 , 81 ] study the de-anonymizability of social media graph data for seed-based approaches under both the Erdos-Renyi model and a statistical model. Similarly to Reference [ 83 ], they specify the structural conditions for both perfect and partial de-anonymization. Chiasserini et al. [ 45 , 55 ] also study the problem of user de-anonymization from structural information under a scale-free user relation model. This assumption is more realistic, since the degree distribution of users in social media follows a power-law, a.k.a. scale-free, distribution. Their results show that the information of a large portion of users in the seed set is useless for re-identifying users. This is because of the large inhomogeneities in the users' degrees. Their analysis suggests that, given a network with $n$ users, on the order of $n^{\frac{1}{2}+\epsilon }$ (for any arbitrarily small $\epsilon$ ) seeds are needed to successfully de-anonymize all users when seeds are uniformly distributed among the vertices. Chiasserini et al. [ 45 , 46 ] also propose a two-phase percolation graph matching-based attack similar to that in Reference [ 185 ].

Bringmann et al. [ 38 ] also propose an approach that uses $n^\epsilon$ seed nodes (for an arbitrarily small $\epsilon$ ) for a graph with $n$ nodes. This is an improvement over state-of-the-art structure-based de-anonymization techniques that need $\Theta (n)$ seeds [ 99 ]. This approach finds a signature set for each node as the intersection of its neighbors and the previously re-identified nodes. It then defines a criterion that is used to decide whether two signatures originate from the same node with high probability, i.e., if the similarity of two nodes' signatures is more than $n^c$ ( $c\gt 0$ is a constant), then the two nodes are mapped together. Locality-sensitive hashing [ 78 ] is also used to reduce the number of comparisons needed for the de-anonymization attack. Theoretical and empirical analyses of their work show that the attack runs in quasilinear time.

Manasa et al. [ 150 ] propose another seed-based attack against anonymized social graphs that has two steps. In the first step, it identifies a seed sub-graph of users with known identities. As discussed earlier for Reference [ 19 ], this sub-graph could be injected by an attacker, or it could even be a small group of users that the attacker is able to re-identify. In the second step, it extends the seed set based on the users’ social relations and re-identifies the remaining users. In each mapping iteration, the algorithm re-examines previous mapping decisions given new evidence regarding re-identified nodes. This attack does not have any limitation on the size of the initial seed set or the number of links between seeds. Another recent work, by Chiasserini et al. [ 46 ], incorporates clustering into de-anonymization attacks. Their attack uses various levels of clustering, and their theoretical results highlight that clustering can potentially reduce the number of seeds needed in percolation-based de-anonymization attacks due to its wave-like propagation effect. This attack is a modified version of that in Reference [ 185 ]: it starts from a small set of seed users, expands the seed set to the closest neighbors of the users in the seed set, and repeats the re-identification procedure. In this version, two users are mapped if they have a sufficiently large number of neighbors among the mapped pairs.

To sum up, seed-based graph de-anonymization techniques can be categorized into three groups: percolation-based, clustering-based, and seed extension-based works. Table 4 summarizes existing works according to the utilized technique and their properties.

3.1.2 Seed-free De-anonymization. The efficiency of most seed-based approaches depends on the size of the seed set. Seed-free de-anonymization attacks have been developed to address this issue. Pedarsani et al. [ 149 ] present a Bayesian model that starts from the users with the highest degrees and iteratively solves a maximum weighted bipartite graph matching problem, updating the fingerprints of all users in each iteration. The goal in the maximum weighted bipartite graph matching problem is to find a maximum-weight matching between two parties such that each vertex is the endpoint of exactly one of the chosen edges.
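The matching step itself is standard and can be sketched as follows. This is our illustration, not code from [ 149 ]; the degree-based fingerprint similarity is a toy assumption, as real attacks use much richer structural fingerprints.

import numpy as np
from scipy.optimize import linear_sum_assignment

deg_aux  = np.array([9, 7, 4, 2])   # node degrees in the auxiliary graph
deg_anon = np.array([8, 7, 5, 1])   # node degrees in the anonymized graph

# Similarity = negative absolute degree difference (higher is better).
sim = -np.abs(deg_aux[:, None] - deg_anon[None, :])

# linear_sum_assignment minimizes cost, so negate the similarity matrix.
rows, cols = linear_sum_assignment(-sim)
print(list(zip(rows, cols)))  # one-to-one mapping: auxiliary node -> anonymized node

In an iterative attack, the fingerprints (here, degrees) would be recomputed after each round using the current mapping, and the assignment re-solved until it stabilizes.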

Moreover, Ji et al. [ 83 , 84 ] propose optimization-based methods that minimize an error function iteratively. More specifically, in each iteration of this attack, two candidate sets of users are selected from the anonymized and background graphs. Then users in the set from the anonymized graph are mapped (de-anonymized) to users in the background graph by minimizing an error function defined by the edge difference caused by a mapping scheme. In particular, Ji et al. [ 83 ] quantify structure-based de-anonymization under the Configuration model [ 141 ] and derive structural conditions for perfect and partial de-anonymization. The Configuration model generates a random graph from a given degree sequence by randomly assigning edges to match that degree sequence [ 141 ].

Another recently developed group of techniques leverages additional sources of information besides network structure to re-identify social media users in anonymized data. This information includes user interactions (e.g., commenting, tweeting) or non-personally identifiable information that is associated with users and shared publicly, such as gender, education, country, and interests [ 64 ]. This combination of structural and exogenous sources of information can increase the risk to user privacy. Zhang et al. [ 190 ] study the privacy breach problem in anonymized heterogeneous networks. They first introduce a privacy risk measure based on the potential loss of the user and the number of users who have the same value. They then propose a de-anonymization algorithm that incorporates the defined privacy risk measure. For each target user, this framework first finds a set of candidates based on entity attribute matches in the heterogeneous network and then narrows down this candidate set by comparing the neighbors (found via heterogeneous links) of the target user and each candidate.

Fu et al. [ 56 , 57 ] propose to use both structural and descriptive information. Descriptive information is defined as attribute information such as name, gender, and birth year. This work first proposes a new definition of user similarity, i.e., two users are similar if their neighbors match each other as well. However, the similarity of neighbors in turn depends on the similarity of the users. Therefore, Fu et al. model similarity as a recursive problem and solve it iteratively. They then reduce the de-anonymization problem to a complete weighted bipartite graph matching problem that is solved with the Hungarian algorithm [ 101 ], where the weights are calculated based on the user similarities.
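A toy sketch of this recursive-similarity idea is shown below; it is our illustration of the fixed-point computation, not the algorithm from [ 56 , 57 ], and the attribute-similarity matrix attr_sim and damping factor alpha are assumptions.

import numpy as np

def iterate_similarity(adj1, adj2, attr_sim, alpha=0.5, iters=20):
    # adj1, adj2: adjacency matrices; attr_sim[i, j]: attribute similarity
    # between user i in graph 1 and user j in graph 2.
    sim = attr_sim.copy()
    for _ in range(iters):
        nbr = np.zeros_like(sim)
        for i in range(adj1.shape[0]):
            for j in range(adj2.shape[0]):
                n1 = np.flatnonzero(adj1[i])
                n2 = np.flatnonzero(adj2[j])
                if len(n1) and len(n2):
                    # Average best-match similarity between the two neighborhoods.
                    nbr[i, j] = sim[np.ix_(n1, n2)].max(axis=1).mean()
        # User similarity mixes attribute similarity with neighbor similarity.
        sim = alpha * attr_sim + (1 - alpha) * nbr
    return sim

The resulting matrix would then serve as the weight matrix of the complete bipartite graph handed to the Hungarian algorithm.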

In another work, the effect of user attribute information as an exogenous source of information on de-anonymizing social networks is studied [ 154 ]. In particular, this work incorporates the semantic background knowledge of the adversary into the de-anonymization process and models it using knowledge graphs [ 79 ]. This approach simultaneously de-anonymizes users and infers their attributes (we discuss the user profile attribute inference attack later in Section 5 ). The adversary first models both the anonymized dataset and the background knowledge as two knowledge graphs. Then, she constructs a complete weighted bipartite graph in which each weight indicates the structural and attribute similarity between corresponding nodes in the anonymized and knowledge graphs. The de-anonymization problem is then reduced to a maximum weighted bipartite matching problem, which can be further reduced to a minimum cost maximum flow problem. The attacker's prior semantic knowledge can be obtained in different ways, such as common sense, statistical information, personal information, and network structural information.

Ji et al. [ 87 ] also study the same problem and show, theoretically and empirically, that using attribute information alongside structural information can result in a greater privacy loss, even in an anonymized dataset, compared to the case where the data consist only of structural information. They further propose the De-SAG de-anonymization framework, which incorporates both attribute and structural information. It first augments both types of information into a structure-attribute graph. De-SAG has two variants, i.e., user-based and set-based. In user-based De-SAG, the de-anonymization approach first selects the candidates most similar to the target user from the background/auxiliary knowledge graph based on the similarity of their attributes. Next, the target user is mapped to one of the selected candidates based on their structural similarity. In set-based De-SAG, in each iteration, two sets of users are selected from the anonymized graph and the knowledge graph, respectively. Then, the de-anonymization problem reduces to a maximum weighted bipartite graph matching problem, and users in these two sets are mapped to each other using the Hungarian algorithm [ 101 ]. Note that the similarity of users is again calculated according to their attribute and structural information.

In another work, by Lee et al. [ 105 ], a blind de-anonymization technique is proposed in which the adversary does not need any background information. Inspired by the idea of $dK$ -series for characterizing the structural characteristics of a graph, they propose $nK$ -series to describe the structural features of each user by exploiting his multi-hop neighbors' information. In particular, $nK_i$ captures the degree histogram of the user's $i$ -hop neighbors. Then, a structure score is calculated for each user (in both the anonymized graph and the background knowledge graph) based on his diversity score (calculated from the $nK$ -series scores) and his relationships with all other non-re-identified users in the network. The attack then uses this information to re-identify all users in the anonymized social graph by leveraging pseudo-relevance feedback support vector machines. Backes et al. [ 18 ] develop an attack that infers social links between users based on their mobility profiles without using any additional information about existing relations between users. Their approach first constructs a mobility profile for each user by obtaining random walk traces from the user-location bipartite graph and using skip-gram [ 131 ] to obtain features in a continuous vector space. It then infers links based on the similarity of the users' mobility profiles.
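The mobility-profile pipeline of Backes et al. [ 18 ] can be sketched as follows; this is our simplified illustration using gensim's Word2Vec as the skip-gram implementation, and the toy bipartite graph, walk lengths, and embedding size are all assumptions.

import random
from gensim.models import Word2Vec

def random_walks(bipartite_adj, num_walks=10, walk_len=8):
    # Collect fixed-length random walks starting from every node.
    walks = []
    for _ in range(num_walks):
        for start in bipartite_adj:
            walk, cur = [start], start
            for _ in range(walk_len - 1):
                cur = random.choice(bipartite_adj[cur])
                walk.append(cur)
            walks.append(walk)
    return walks

# Toy user-location bipartite graph: users u*, locations l*.
graph = {"u1": ["l1", "l2"], "u2": ["l1"], "l1": ["u1", "u2"], "l2": ["u1"]}
model = Word2Vec(random_walks(graph), vector_size=16, window=3,
                 min_count=1, sg=1)  # sg=1 selects the skip-gram model
print(model.wv.similarity("u1", "u2"))  # similar mobility profiles suggest a social link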

Beigi et al. [ 26 , 28 ] introduce a new adversarial attack that does not need any background information before initiating the attack. This attack is designed for heterogeneous social media data, which consist of different aspects (i.e., textual and structural), and shows that anonymizing all aspects of the data is not sufficient. The attack first extracts the most revealing information for each user in the anonymized dataset and accordingly finds a set of candidate users. Each user is finally mapped to the most probable candidate user. Sharad et al. [ 164 ] propose to formulate the problem of graph de-anonymization in social networks as a learning task. They use one-hop and two-hop neighborhood degree distributions to represent each user. The intuition behind this choice is that two nodes refer to the same user if their neighborhoods also match each other. These features are used to train a classifier that learns the degree deviation for identical and non-identical user pairs. In another work, Sharad et al. [ 163 ] go even further and propose a new generation of de-anonymization attacks that are heuristic-free, seedless, and formulated as a learning problem. They use the same set of structural features as proposed in Reference [ 164 ] and then de-anonymize the sanitized graph by re-identifying users with high degrees first and using them to attack low-degree nodes. Mappings are then frozen and propagated to the remaining nodes to discover a new set of mappings.

Table 5 categorizes the reviewed works based on the technique used and whether they are applicable to heterogeneous or homogeneous graph networks.

3.1.3 Theoretical Analysis and De-anonymization. Another set of works studies de-anonymization attacks from a theoretical perspective. For example, Liu et al. [ 113 ] theoretically study the vulnerability of differential privacy mechanisms to de-anonymization attacks. Differential privacy provides protection against even the strongest attacks, in which the adversary knows the entire dataset except one entry. However, differential privacy assumes independence between dataset entries, which does not hold in most real-world applications. This work introduces a new attack in which the probabilistic dependence between dataset entries is calculated and then leveraged to infer users’ sensitive information from differentially private queries. The attack is also tested on graph data in which users’ degree distributions are published in a differentially private manner.

Lee et al. [ 104 ] also study the theoretical quantification of anonymized graph data's vulnerability to de-anonymization attacks. In particular, they study the relation between application-specific anonymized data utility (i.e., the quality of the data) and the capability of de-anonymization attacks. They define a local neighborhood utility and a global structure utility and theoretically show that, under certain conditions on each of the defined utilities, the probability of successful de-anonymization approaches one as the number of users in the data increases. Their foundations can be used to evaluate the effectiveness of de-anonymization/anonymization techniques.

Recent research by Fu et al. [ 58 ] studies the conditions under which the adversary can perfectly de-anonymize user identities in social graphs. In particular, they theoretically study the cost of quantifying the quality of the mappings. Community structures are also parameterized and leveraged as side information for de-anonymization. They study two different cases, in which the community information is available either for both the background knowledge and anonymized graphs or for only one of them. They show that perfectly de-anonymizing graph data with community information is NP-hard. They further propose two algorithms with approximation guarantees and lower time complexity by relaxing the original optimization problem. The main drawback of this study is the assumption of disjoint communities, which fails to reflect real-world situations. Wu et al. [ 181 ] extend Fu et al.’s study by considering overlapping communities. In contrast to Fu et al.’s work [ 58 ], which uses Maximum a Posteriori estimation to find the correct mappings, Wu et al. introduce a new cost function, Minimum Mean Square Error, which minimizes the expected number of mismatched users by incorporating all possible true mappings.

There are different surveys [ 1 , 82 , 86 , 104 ] on the quantification and analysis of graph de-anonymization techniques that study a portion of the works covered here in terms of scalability, robustness, and practicability. Interested readers can refer to these surveys for further reading.

3.2 Graph Anonymization

Another research direction in protecting the privacy of users in graph data is the study of graph anonymization techniques. Existing anonymization approaches use different techniques and mechanisms and can be grouped into five main categories: $k$ -anonymity-based approaches [ 43 , 115 , 189 , 196 , 199 ], edge manipulation techniques [ 188 ], cluster-based techniques [ 31 , 70 , 114 , 134 , 174 ], random walk-based techniques [ 116 , 134 ], and differential privacy-based techniques [ 152 , 162 , 179 , 182 ]. We discuss each of these categories next.

3.2.1 K-anonymity-based Approaches. The aim of $k$ -anonymity methods is to anonymize each user/node in the graph so that it is indistinguishable from at least $k-1$ other users [ 171 ]. Liu et al. [ 115 ] propose a framework for $k$ -degree anonymization in which, for each user, there are at least $k-1$ other users with the same degree. The goal of this approach is to add/delete the minimum number of edges needed to achieve $k$ -degree anonymity. The algorithm has two steps. In the first step, given the degree sequence of the original graph, a $k$ -degree anonymized version of the degree sequence is constructed; in the second step, the anonymized graph is built from the anonymized degree sequence. In another work [ 196 ], Zhou et al. aim to achieve $k$ -neighborhood anonymity. They assume that the adversary knows the subgraph constructed by the immediate neighbors of a target node. In the first step of the anonymization, the one-hop neighborhoods of all users are extracted and encoded so that isomorphic neighborhoods can be easily identified. In the second step, users with similar/isomorphic neighborhoods are grouped together until the size of each group is at least $k$ . Then, each group is anonymized to satisfy $k$ -neighborhood anonymity, i.e., each neighborhood has at least $k-1$ isomorphic neighborhoods in the same group. This approach thus anonymizes the graph against neighborhood attacks.
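The first phase of $k$ -degree anonymization can be sketched as follows. This greedy grouping is our simplification of the dynamic program in [ 115 ]: degrees are only ever raised (realized by adding edges), and each group of at least $k$ users is lifted to its maximum degree.

def k_anonymize_degrees(degrees, k):
    # Make the sorted degree sequence k-anonymous by raising degrees in groups.
    d = sorted(degrees, reverse=True)
    out, i = [], 0
    while i < len(d):
        # Take at least k degrees; absorb a too-small tail into this group.
        j = min(i + k, len(d))
        if len(d) - j < k:
            j = len(d)
        out.extend([d[i]] * (j - i))  # raise the whole group to its max degree
        i = j
    return out

print(k_anonymize_degrees([5, 5, 4, 3, 2, 2, 1], k=2))
# [5, 5, 4, 4, 2, 2, 2] -- every degree value now appears at least twice

The second phase, building a graph that realizes the anonymized sequence, requires the edge-construction procedure of the original paper and is omitted here.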

Zou et al. [ 199 ] propose a $k$ -automorphism-based framework that protects the graph against multiple attacks, including the neighborhood attack [ 196 ], degree-based attack [ 115 ], hub-fingerprint attack [ 70 ], and subgraph attack [ 70 ]. A graph is $k$ -automorphic if there exist $k-1$ automorphic functions in the graph so that, for each user in the graph, the attacker cannot distinguish her from her $k-1$ symmetric vertices. The proposed approach first partitions the graph into $n$ blocks and then clusters the blocks into $m$ groups (graph partitioning step). In the second step, alignments of blocks are obtained and the original blocks are replaced with alignment blocks (block alignment step). In the last step, edge copy is performed to obtain the anonymized graph. Edge copy adds $k-1$ edges between the $k-1$ pairs $(F_a(u), F_a(v))$ $(a = 1,2,\ldots, k-1)$ , where $F_a(.)$ is the automorphic function and $u$ and $v$ are users in the social graph. The authors also propose the use of generalized vertex IDs for handling dynamic data releases. Another similar work, by Cheng et al. [ 43 ], proposes a $k$ -isomorphism anonymization approach. A graph is $k$ -isomorphic if it consists of $k$ disjoint subgraphs and all pairs of subgraphs are isomorphic. In the first step, the graph is partitioned into $k$ subgraphs with the same number of vertices. Then, edges are added or deleted so that these subgraphs become isomorphic. This approach protects the published graph against neighborhood attacks [ 196 ].

Yuan et al. [ 189 ] incorporate semantic and graph information together to achieve personalized privacy anonymization. In particular, they consider three different levels of attacker knowledge regarding the target user: (1) only attribute information, (2) both attribute and degree information, and (3) a combination of attribute, node degree, and neighborhood information. They accordingly propose three levels of protection to achieve $k$ -anonymity. For level 1 protection, their approach uses label generalization. For level 2 protection, it additionally adds nodes/edges. For level 3 protection, it uses edge label generalization as well.

3.2.2 Edge Manipulation-based Approaches. Edge manipulation and randomization algorithms for social graphs usually utilize edge-based randomization strategies to anonymize data, such as random edge addition/deletion and random edge switching [ 188 ]. Ying et al. [ 188 ] propose spectrum-preserving edge editing that either adds $k$ random edges to the graph and removes another $k$ edges at random or, alternatively, switches $k$ edges. In the switching technique, two random edges, $(i_1, j_1)$ and $(i_2, j_2),$ are selected from the original graph edge set $E$ such that $\lbrace (i_1,j_2) \notin E \wedge (i_2,j_1) \notin E \rbrace$ . Then edges $(i_1,j_1)$ and $(i_2,j_2)$ are removed, and new edges $(i_1,j_2)$ and $(i_2,j_1)$ are added instead. This method protects the graph against the edge inference attack. Backes et al. [ 18 ] also propose a randomization-based approach to preserve the privacy of social links between users in graph data and counteract link inference attacks. In this specific type of attack, the adversary exploits users' mobility traces to infer social links between users, with the intuition that friends have more similar mobility profiles than two strangers [ 18 ]. They utilize three privacy-preserving techniques: hiding, replacement, and generalization of user mobility information. Results show that data publishers need to hide 80% of the location points, or replace 50% of them, to prevent leakage of users' social links.
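A minimal sketch of the edge-switching strategy just described is shown below; it is our illustration, not code from [ 188 ], and for simplicity it treats edges as ordered pairs (an undirected graph would need a canonical edge ordering).

import random

def switch_edges(edges, k, max_tries=10000):
    # Repeatedly rewire (i1, j1), (i2, j2) into (i1, j2), (i2, j1),
    # which preserves every node's degree.
    E = set(edges)
    done = tries = 0
    while done < k and tries < max_tries:
        tries += 1
        (i1, j1), (i2, j2) = random.sample(list(E), 2)
        # Only switch if no self-loop is created and the new edges are absent.
        if i1 != j2 and i2 != j1 and (i1, j2) not in E and (i2, j1) not in E:
            E -= {(i1, j1), (i2, j2)}
            E |= {(i1, j2), (i2, j1)}
            done += 1
    return E

print(switch_edges([(1, 2), (3, 4), (5, 6)], k=1))

Because switching preserves the degree sequence exactly, it perturbs individual links while leaving degree-based graph statistics untouched.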

3.2.3 Clustering-based Techniques. Clustering-based approaches group users and edges and only reveal the density and size of each cluster so that individual attributes are protected. Hay et al. [ 70 ] propose an aggregation-based method for graph data anonymization that is robust against three types of attacks: neighborhood, subgraph, and hub fingerprint. It models the aggregate network structure by partitioning the original graph and describing it at the level of partitions. Partitions are treated as nodes, and the edges between them make up the edges of the generalized graph. A concrete graph can then be randomly sampled from this generalized structure and published as the anonymized graph data.

Another cluster-based work [ 31 ] proposes two approaches, label lists and partitioning, which consider user attributes (i.e., labels) in addition to structural information. In the label list approach, a list of labels is allocated to each user, which also includes her true label. This approach first clusters nodes into $m$ classes, and then a set of symmetric lists is built deterministically for each class from the set of nodes in the corresponding class. In the partitioning approach, nodes are divided into classes and, instead of releasing full edge information, only the number of edges between and within each class is released. This is similar to the generalization approach of Hay et al. [ 70 ]. Bhagat et al. also use a set of safety conditions to ensure that the released data do not leak information. The proposed partitioning approach is more robust than the label list technique when facing attacks with richer background knowledge. However, the partitioning approach has lower utility than the label list approach, as less information is revealed about the graph structure.

Thompson et al.’s approach [ 174 ] protects the graph information against the $i$ -hop degree-based attack. They present two clustering algorithms, bounded $t$ -means clustering and union-split clustering, which group users with similar social roles into clusters with a minimum size constraint. They then utilize the proposed inter-cluster matching anonymization method, which anonymizes the social graph by removing/adding edges according to the users’ inter-cluster connectivity. The numbers of nodes and edges between and within clusters are then released, similar to Hay et al.’s approach [ 70 ]. Mittal et al. [ 114 ] also propose another clustering-based anonymization technique that considers the evolutionary dynamics of social graphs, such as node/edge addition/deletion, and consistently anonymizes the graph over time. It first dynamically clusters nodes and then perturbs the intra-cluster and inter-cluster links of changed clusters in a way that preserves the structural properties of the social media graph. They leverage the static perturbation method of Reference [ 134 ] to modify intra-cluster links and randomly connect marginal nodes, according to their degrees, to create fake inter-cluster links. The obfuscated graph is robust against the edge inference attack and has higher indistinguishability, defined from an information-theoretic perspective.

3.2.4 Random Walk-based Approaches. Another group of works utilizes the random walk idea to anonymize graph data. Random walks have previously been used in many security applications such as Sybil defense [ 8 ]. Recent works also use this idea for anonymizing social graphs. The work of Mittal et al. [ 134 ] introduces a random walk-based edge perturbation algorithm. In this approach, for each node $u$ , a random walk of length $t$ is performed starting from one of $u$ ’s contacts, $v$ ; an edge $(u,z)$ between $u$ and the destination node $z$ is added with an assigned probability, and the edge $(u,v)$ is removed accordingly. This probability decreases as more random walks are performed from $u$ ’s contacts. Later, Liu et al. [ 116 ] improve this approach such that, instead of having a fixed random walk length $t$ , they utilize a smart adaptive random walk whose length is learned from local structural characteristics. This method first predicts the local mixing time for each node, which is the minimum random walk length for a starting node to get within a given distance of the stationary distribution. This mixing time is predicted based on the local structure and limited global knowledge of the graph and is further used to adjust the length of the random walk for social graph anonymization.
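A toy sketch of the random-walk perturbation of Mittal et al. [ 134 ] follows; it is our simplification, omitting the paper's probabilistic acceptance step and special cases, and it assumes every node has at least one neighbor.

import random

def perturb(adj, t=3):
    # adj: adjacency dict (node -> list of neighbors).
    new_adj = {u: set() for u in adj}
    for u in adj:
        for v in adj[u]:
            z = v
            for _ in range(t):           # t-hop random walk starting at v
                z = random.choice(list(adj[z]))
            if z != u:
                new_adj[u].add(z)        # replace edge (u, v) with (u, z)
    return new_adj

graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
print(perturb(graph, t=2))

Short walks keep replacement edges close to the originals (preserving utility), while longer walks push the perturbed graph toward a random one (stronger privacy); the adaptive variant of Liu et al. [ 116 ] tunes this length per node.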

3.2.5 Differential Privacy-based Approaches. Recently, many works have extended differential privacy [ 52 ] to social graph data. Sala et al. [ 162 ] first use the $dK$ -series to capture sufficient graph structure at multiple granularities. The $dK$ -series is the set of degree distributions of connected components of size $K$ within a target graph [ 50 , 122 ]. They then partition the statistical representation of the graph captured by the $dK$ -series into clusters and use an $\epsilon$ -differential privacy mechanism to add noise to the representation in each cluster. Another differentially private approach [ 152 ] scales down the magnitude of the added noise by reducing the contributions of challenging records.
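To illustrate the basic mechanism underlying these approaches, the sketch below adds Laplace noise to a plain degree histogram under $\epsilon$ -differential privacy; it is our illustration, not the clustering-based algorithm of [ 162 ]. The sensitivity value reflects that changing one edge moves two node degrees, altering at most four histogram cells by one each.

import numpy as np

def private_degree_histogram(degrees, epsilon, sensitivity=4.0):
    hist = np.bincount(degrees)                        # true degree counts
    noise = np.random.laplace(0.0, sensitivity / epsilon, size=hist.shape)
    # Post-process to valid (non-negative, integer) counts.
    return np.clip(np.round(hist + noise), 0, None)

print(private_degree_histogram([1, 2, 2, 3, 3, 3], epsilon=1.0))

Approaches such as [ 162 ] improve on this baseline by clustering the $dK$ -series so that the noise added per cluster is much smaller relative to the signal.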

In another work, Wang et al. [ 179 ] use $dK$ -graph generation models to generate sanitized graphs. In particular, their approach first extracts various pieces of information from the original social graph, such as degree correlations, then enforces differential privacy on the learned information and, finally, uses the perturbed information to generate an anonymized graph with $dK$ -graph models. Different from the approach of Sala et al. [ 162 ], in the specific case of $d=2$ , noise is generated based on the smooth sensitivity rather than the global sensitivity. The reason for this choice is to reduce the magnitude of the added noise; smooth sensitivity is a smooth upper bound on the local sensitivity used when deciding the noise magnitude [ 143 ]. Another work, Reference [ 182 ], proposes an anonymization approach that satisfies edge $\epsilon$ -differential privacy to hide each user's connections to other users. The authors propose to transform edges into connection probabilities via statistical Hierarchical Random Graphs (HRG) under differential privacy. In particular, their approach infers the HRG by exploring the HRG model space, sampling an HRG with a Markov Chain Monte Carlo method, and generating the sanitized graph according to the sampled HRG while satisfying differential privacy. Their results show that using edge probabilities can significantly reduce the noise scale in comparison to the case where the edges are used directly.

In another work, Liu et al. [ 113 ] show that differential privacy is not robust to de-anonymization attacks if there is dependence among dataset entries. Liu et al. [ 113 ] also propose a stronger privacy notion, dependent differential privacy, which incorporates the probabilistic dependence between the tuples in a statistical database. They then propose an effective perturbation framework that provides privacy guarantees. Their results show that more noise should be added when there is dependence between tuples; the added noise depends on the sensitivity of two tuples as well as the dependence relationship between them. They evaluate their proposed framework on graph data by sanitizing the degree distribution of the given graph.

Ji et al. [ 82 , 86 ] and Abawajy et al. [ 1 ] study the defense and attack performance of a portion of the existing social graph anonymization and de-anonymization techniques. Ji et al. [ 82 , 86 ] have also performed a thorough theoretical and empirical analysis of a portion of the existing related papers. The results demonstrate that anonymized social graphs are vulnerable to de-anonymization attacks.

To sum up, Table 6 categorizes the reviewed works with respect to the utilized technique, i.e., $k$ -anonymity, edge manipulation, cluster-based, random walk-based, and differential privacy-based. Each column in Table 6 refers to a type of graph de-anonymization attack and, correspondingly, the works that are robust against that attack.

4 AUTHORS IN SOCIAL MEDIA AND PRIVACY

People have the right to anonymous free speech on different topics such as politics. However, an author's identity can be unmasked by adversaries if her real name or IP address is provided to a service provider. Authors can use tools such as Tor to protect their identity at the network level, but manually generated content will always reflect some characteristics of the person who authored it. For example, an anonymous online author may be prone to several specific spelling errors or have other recognizable idiosyncrasies [ 137 ]. These characteristics can be enough to figure out whether the authors of two pieces of content are the same. Therefore, given material authored under the author's true identity, the adversary can discover the identity behind content posted online anonymously by the same author. Identifying the author of a text according to her writing style, a.k.a. stylometry, has been studied for a long time [ 135 , 169 ]. With the advent of machine learning techniques, researchers started to extract textual features and discriminate between 100 and 300 authors [ 2 ]. The applications of author identification include identifying the authors of terroristic threats and harassing messages [ 42 ], detecting fraud [ 3 ], and extracting demographic information [ 95 ].

The privacy implications of stylometry have been studied recently. For example, Rao et al. [ 156 ] investigate whether people who post under different pseudonyms to USENET newsgroups can be linked based on their writing style. They use a dataset of 117 people with 185 different pseudonyms and exploit function words and Principal Component Analysis (PCA) to perform matching between newsgroup postings and email domains. Another work, from Koppel et al. [ 96 , 97 ], studies author identification at the scale of over 10,000 blog authors. They use character 4-grams, which are context-specific features; the problem with this work is that it is not clear whether their approach is solving author recognition or context recognition. In another work, Koppel et al. [ 95 ] use both content-based and stylistic features to identify 10,000 authors in a blog corpus dataset. There are also several works on identifying the authors of academic papers under blind review based on the citations of the paper [ 37 , 73 ] or other sources from non-blind texts of potential authors [ 136 ].

Narayanan et al. [ 137 ] propose another author identification attack that exploits 1,188 real-valued features of each post, such as the frequency of characters, capitalization of words, syntactic structure (extracted by the Stanford Parser [ 93 ], e.g., noun phrases containing a personal pronoun, noun phrases containing a singular proper noun), and the distribution of word lengths. These features capture the writing style of the author regardless of the topic at hand and can re-identify a large number of authors. However, this approach does not work when authors anonymize their writing style. Almishari et al. [ 10 ] propose a new linkage attack that investigates the linkability of prolific reviews that users post on social media platforms. More specifically, given a subset of information on reviews made by an anonymous user, this approach seeks to map it to a known identified record. The approach first extracts four types of tokens: (i) unigrams, (ii) digrams, (iii) ratings, and (iv) the category of the reviewed entity. Then, it uses Naive Bayes and Kullback–Leibler (KL) divergence models to re-identify the anonymized information. This approach can also be used for identity disclosure attacks across multiple platforms using people's posts and reviews.
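A minimal sketch of stylometry-based author re-identification follows, using character 4-grams (in the spirit of Koppel et al. [ 96 , 97 ]) and a Naive Bayes classifier as in Almishari et al. [ 10 ]; the texts and author names are toy data, and the pipeline is our illustration rather than either paper's system.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = ["the food was great, loved it", "terrible service, never again",
               "awesome place, loved the staff", "awful food, never coming back"]
train_authors = ["alice", "bob", "alice", "bob"]

clf = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(4, 4)),  # character 4-gram features
    MultinomialNB())
clf.fit(train_texts, train_authors)
print(clf.predict(["loved the food, great place"]))  # links anonymous text to a known author

With thousands of authors and longer texts, the same pipeline scaled up captures the linkage risk these works demonstrate: writing style alone can act as a fingerprint.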

Bowers et al. [ 36 ] propose an anonymization approach that uses iterative language translation (ILT) to conceal one's writing style. This approach translates English text into a foreign language (e.g., Spanish or Chinese) and then back to English, for three iterations. Another work, from Nathan et al. [ 121 ], evaluates Bowers's work by introducing a feature selection approach, namely Generative and Evolutionary Feature Selection (GEFES), over the set of predefined features, which masks out non-salient previously extracted features. Both Reference [ 36 ] and Reference [ 121 ] are tested on a set of users' blog posts, and the results show the efficiency of ILT-based anonymization. A recent approach proposed by Zhang et al. [ 191 ] anonymizes users’ textual information before publishing user-generated data. It first introduces a variant of differential privacy tailored to textual data, namely $\epsilon$ -Text Indistinguishability, to overcome the curse-of-dimensionality problem that arises when the original differential privacy notion is deployed on high-dimensional textual data. It then proposes a framework that perturbs the user-keyword matrix by adding Laplacian noise to satisfy $\epsilon$ -Text Indistinguishability. Results confirm both the utility and privacy of the data.

5 SOCIAL MEDIA PROFILE ATTRIBUTES AND PRIVACY

A user's profile includes her self-disclosed demographic attributes such as age, gender, majors, cities she loved, and so on. To address the privacy of users, social networks usually offer the option for users to limit the access to their attributes, i.e., they are only visible to friends or friends of friends. A user could also create a profile without explicitly disclosing any attribute information. A social network thus is a mixture of both private and public user information. However, there exists one privacy attack that focuses on inferring users’ attributes. This attack is known as attribute inference attack and it leverages publicly available information of users in social networks to infer missing or incomplete attribute information [ 63 ].

The attacker could be any party interested in this information, such as social network service providers, cyber criminals, data brokers, and advertisers. Data brokers profit from selling individuals’ information to other parties such as banks, advertisers, and insurance companies. Social network providers and advertisers leverage users’ attribute information to provide more targeted services and advertisements. Cyber criminals exploit attribute information to perform targeted social engineering, spear phishing, and backup authentication attacks [ 68 ]. This attribute information can also be used for linking users across multiple sites [ 62 ] and records (e.g., voter registration records) [ 132 , 171 ]. Existing attacks can be categorized into three groups: friend-based, behavior-based, and friend-and-behavior-based.

5.1 Friend-based Profile Attribute Inference

Friend-based approaches build on homophily theory [ 127 ], which states that two friends are more likely to share similar attributes than two strangers. Following this intuition, if most of a user's friends study at Arizona State University, then she most likely studies at the same university. He et al. [ 71 ] first construct a Bayesian network from a user's social neighbors and then use it to model the causal relations among people in the network, thereby obtaining the probability that the user has a specific attribute. The main challenge for this approach is scalability, as Bayesian inference does not scale to the millions of users in social networks. Another work, by Lindamood et al. [ 111 ], uses the Naive Bayes classification algorithm to infer a user's attributes by exploiting features from her node traits (i.e., other available attribute information) and link structure (i.e., friends). However, this approach is not usable for a user who does not share any attributes. In another work, Reference [ 173 ], the authors propose an approach that leverages friends’ activities and information to infer a user's attributes. Features from friends and wall posts are exploited in a multi-label classifier. The authors then propose a multi-party privacy approach that defends against attribute inference attacks by enforcing mutual privacy requirements for all users to prevent the disclosure of users' attributes and sensitive information.
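The homophily intuition behind all of these attacks can be captured in a few lines: predict a hidden attribute as the majority value among friends who disclose it. This sketch illustrates the general idea only, not any specific paper's algorithm; the data structures are assumptions.

from collections import Counter

def infer_attribute(user, friends, public_attrs):
    # Majority vote over friends who publicly disclose the attribute.
    votes = [public_attrs[f] for f in friends[user] if f in public_attrs]
    return Counter(votes).most_common(1)[0][0] if votes else None

friends = {"u1": ["u2", "u3", "u4"]}
public_attrs = {"u2": "ASU", "u3": "ASU", "u4": "MIT"}
print(infer_attribute("u1", friends, public_attrs))  # -> 'ASU'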

Zhelva et al. [ 192 ] study how users' sensitive attribute information can be leaked through their social relations and group memberships. This friend-based attribute inference attack exploits social links and group information to infer sensitive attributes for each user. The authors propose various algorithms and find that LINK performs best among those that only use link information. This method models each user $u$ as a binary vector whose length is the size of the network (i.e., the number of users in the network), where the value of element $v$ is one if $u$ is connected to $v$ . Different classifiers are then trained on the users with public profiles, and the attributes of users with private profiles can then be inferred. The GROUP algorithm performs best among the methods that incorporate group information. This method first selects the groups that are relevant to the attribute inference problem, either with a feature selection approach (i.e., entropy) or manually. Next, the relevant groups are used as features for each node, and a classification model is trained. In the last step, the attributes of the targeted users are predicted using the classification model. Mislove et al. introduce a similar approach that leverages users' social links and community information [ 133 ]. Their approach takes some seed users with known attributes as input and then finds the local communities around this seed set using the available link information. It then uses the fact that users in the same community share similar attributes and infers the remaining users' attributes based on the communities they are members of. The limitation is that this approach cannot infer attributes for users who are not assigned to any local community.
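The LINK method is simple enough to sketch directly; the adjacency rows and labels below are toy values, and logistic regression stands in for the several classifiers evaluated in [ 192 ].

import numpy as np
from sklearn.linear_model import LogisticRegression

# Rows: users; columns: binary "is connected to user j" indicators.
X_public = np.array([[1, 0, 1, 1], [0, 1, 1, 0], [1, 0, 0, 1], [0, 1, 0, 0]])
y_public = ["female", "male", "female", "male"]   # attributes disclosed publicly
X_private = np.array([[1, 0, 1, 0]])              # a user with a private profile

clf = LogisticRegression().fit(X_public, y_public)
print(clf.predict(X_private))  # inferred attribute for the private profile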

Avello et al. [ 61 ] propose a semi-supervised profiling approach named McC-Splat. They treat attribute inference as a multiclass classification problem: the approach learns attribute weights according to the attributes of the user's friends, where the weights indicate the user's likelihood of belonging to a given attribute value class. Finally, McC-Splat assigns the class with the highest percentile to the target user, where the percentile is calculated from the information of labeled individuals. In another work, from Dey et al. [ 49 ], the authors focus on predicting Facebook users' ages from their friendship network information. Although a user's friend list is not fully available for all users, this work uses a reverse lookup approach to obtain a partial friend list for each user. The authors then design an iterative algorithm that estimates users' ages based on their friends' ages, friends of friends' ages, and so on. They also incorporate other public information in each user's profile, such as high school graduation year, to estimate the birth year. Another work, Reference [ 77 ], seeks to find a targeted user based on her social network connections and the similarity of attributes between friends. It starts from a source user and continues crawling until it reaches the target user. The navigation is based on the set of the target user's known attributes, the friendship links between users, and their attributes. Similarly, Labitzke et al. [ 102 ] study whether the profile information of Facebook users can still be leaked through their social relations. A recent work by Li et al. [ 110 ] uses a convolutional neural network (CNN) to infer multi-valued attributes for a target user from his ego network, i.e., the subgraph of the original social network induced by the user's friends and the social relations among them. The CNN can capture the latent relationship between users' attributes and social links.

Another set of works in this category focuses on simultaneously predicting the network structure (i.e., links) and inferring missing user attribute information [ 65 , 186 , 187 ]. The reason for solving these two problems together is that users with similar attributes tend to link to one another, and individuals who are friends are likely to adopt similar attributes. The work of Yin et al. [ 186 , 187 ] first creates a social-attribute network graph from the original social graph and user-attribute information, i.e., nodes in the graph are either users or attributes, and edges show the friendship between a pair of users or the relation between a user and an attribute. The authors then use the random walk with restart algorithm [ 175 ] to calculate link relevance and attribute relevance with regard to a given user. Similarly, Gong et al. [ 65 ] transform the attribute inference attack into a link prediction problem in the social-attribute network graph. They generalize several supervised and unsupervised link prediction algorithms to predict links between user-user and user-attribute pairs.
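Random walk with restart [ 175 ] itself is a standard primitive and can be sketched as follows; the toy adjacency matrix and restart probability alpha are illustrative assumptions, with nodes standing for users and attributes in a social-attribute network.

import numpy as np

def rwr(A, start, alpha=0.15, iters=100):
    # A: column-normalized adjacency matrix; start: index of the query node.
    n = A.shape[0]
    e = np.zeros(n); e[start] = 1.0           # restart vector
    p = e.copy()
    for _ in range(iters):
        p = (1 - alpha) * A @ p + alpha * e   # one walk step plus restart
    return p                                   # stationary relevance scores

adj = np.array([[0, 1, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1], [0, 1, 1, 0]], float)
A = adj / adj.sum(axis=0)                      # column-normalize
print(rwr(A, start=0))                         # relevance of every node to node 0

In the attribute inference setting, the scores that land on attribute nodes rank candidate attribute values for the query user, and the scores on user nodes rank candidate links.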

5.2 Behavior-based Profile Attribute Inference

Unlike friend-based approaches, behavior-based inference attacks infer a user's attributes from publicly available information about her behaviors and the public attributes of other, similar users. Weinsberg et al. [ 180 ] propose an approach that infers users' attributes (i.e., gender) from their behavior toward movies. In particular, each user is modeled as a vector whose size is the number of items; a non-zero value for a vector element indicates that the user has rated the item, and a zero value means that the user has not. They then use different classifiers, such as logistic regression, SVM, and Naive Bayes, to infer users' gender. Accordingly, the authors propose a gender obfuscation method that adds movies and corresponding ratings to a given user's profile such that it becomes hard to infer the gender of the user while minimally impacting the quality of the recommendations the user receives. They use three different strategies for movie selection: random, sampled, and greedy. The sampled strategy picks a movie based on the rating distribution associated with the movies of the opposite gender, and the greedy strategy selects the movie with the highest score in the list of movies for the opposite gender. Ratings are added for each movie based on either the average movie rating or the rating predicted using recommendation approaches such as matrix factorization. The greedy movie selection strategy with predicted ratings obtains the best results for user profile obfuscation. Kosinski et al. [ 100 ] follow an approach similar to Reference [ 180 ] and construct a feature vector for each user based on Facebook likes. The authors then use a logistic regression classifier to infer various attributes for each user.

Another work, from Bhagat et al. [ 32 ], proposes an active learning-based attack that infers users' attributes via interactive questions. In particular, their approach finds a set of movies and asks users to rate them, where each selection maximizes the attacker's confidence in inferring the users' attributes. The work of Reference [ 41 ] seeks to infer users' attributes based on the different types of music they like. This approach first extracts a user's interests and finds the semantic similarity among them. It uses an ontologized version of Wikipedia related to each type of music, exploits topic modeling techniques (i.e., Latent Dirichlet Allocation, LDA [ 34 ]), and learns semantic interest topics for each user. Then, a user is predicted to have attributes similar to those of users who like the same types of music. In another work, from Luo et al. [ 117 ], the authors infer the household structures of Internet Protocol Television subscribers based on their watching behavior. Their approach first extracts related features from log data, including TV program topics and viewing behavior, using LDA and a low-rank model, respectively. Then, it combines graph-based semi-supervised learning with non-parametric regression and uses the result to learn a classifier for inferring the household structure.

5.3 Friend and Behavior–based Profile Attribute Inference

Another category of approaches exploits both social link and user behavior information for inferring users' attributes. Gong et al. [ 63 , 64 ] first build a social-behavior-attribute (SBA) network in which social structures, user behaviors, and user attributes are integrated into a unified framework. The nodes of this graph are users, behaviors, or attributes, and the edges represent the relationships among them. The authors then infer a target user's attributes through a vote distribution attack (VIAL) model. VIAL performs a customized random walk from the target user to all other users in the augmented SBA network and assigns probabilities to users such that a user receives a higher probability if it is structurally more similar to the target node in the SBA network. The stationary probabilities of the attribute nodes are then used to infer the attributes of the target user, i.e., the attribute with the maximum probability is assigned to the target user. Unlike most existing approaches, which only use the information of users who have an attribute, a recent work from Ji et al. [ 88 ] also incorporates information from users who do not have the attribute into the training process, i.e., negative training samples. This work associates a binary random variable with each user, characterizing whether the user has an attribute or not. It then learns the prior probability of each user having a specified attribute by incorporating the user's behavior information. Next, it models the joint probability of users as a pairwise Markov Random Field according to their social relationships and uses this model to infer the posterior probability of the attributes for each target user.

5.4 Exploiting Other Sources of Information for Profile Attribute Inference

These approaches leverage sources of information other than social structures and behaviors, such as writing style [ 144 ], posted tweets [ 9 ], liked pages [ 68 ], purchasing behavior [ 178 ], and checked-in locations [ 195 ]. A recent work combines identity and attribute disclosure across multiple social network platforms [ 16 ]. It defines the concept of $(\theta ,k)$ -matching anonymity as a measure of identity disclosure risk. Given a user and her identity in a source social network, a matching anonymity set is defined as the set of identities in the target social network with a matching probability of more than $\theta$ ; the user is $(\theta ,k)$ -anonymous if the size of this matching set is $k$ . Another work, by Backes et al. [ 17 ], introduces a relative linkability measure that ranks identities within a social media site. In particular, it adapts the idea of $k$ -anonymity to define $(k,d)$ -anonymity for each user $u$ in social media, which captures the largest subset of $k$ identities (including $u$ ) who are within a similarity (or dissimilarity) threshold $d$ of $u$ considering their attributes. A recent work from Liu et al. [ 113 ] also studies the vulnerability of differential privacy mechanisms to the inference attack problem. As stated earlier, differential privacy provides protection against an adversary who knows the entire dataset except one entry; however, it assumes independence between dataset entries. Liu et al. introduce a new inference attack in which the probabilistic dependence between dataset entries is calculated and then leveraged to infer a user's location information from differentially private queries.

Different from all the works focusing on profile attribute inference, a recent work, Reference [ 11 ], brings evasion and poisoning attacks into this problem. It introduces five variants of evasion and poisoning attacks to interfere with the results of profile attribute inference:

  • Good/Bad Feature Attack (Evasion) : The adversary adds good features from one attribute class to another while removing bad features from each class to introduce false signals for the predictor.
  • Mimicry Attack (Evasion) : The adversary samples a set of users from one class and then finds the most similar users in the other class. Good (bad) features are added (removed) for users in the found subsets.
  • Class Altering Attack (Poisoning) : The adversary randomly chooses users from one class and then flips their class labels, resulting in a higher misclassification rate (see the sketch after this list).
  • Feature Altering Attack (Poisoning) : The goal is to increase the misclassification rate. The adversary poisons the training data by randomly adding good feature values of one class to another class.
  • Fake Users Addition Attack (Poisoning) : The attacker poisons the data by removing a set of real users and then injecting fake users into the training dataset.
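As a concrete example, the class altering (label flipping) attack from the list above can be sketched in a few lines; this is our toy illustration, not code from Reference [ 11 ], and the fraction and class names are assumptions.

import random

def flip_labels(y, fraction=0.2, classes=("male", "female"), seed=0):
    # Randomly flip the attribute label of a fraction of training users.
    rng = random.Random(seed)
    y = list(y)
    for i in rng.sample(range(len(y)), int(fraction * len(y))):
        y[i] = classes[0] if y[i] == classes[1] else classes[1]
    return y

print(flip_labels(["male", "female", "male", "female", "male"], fraction=0.4))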

Table 7 summarizes existing works based on the technique they use and the type of information leveraged for attribute inference attacks. The utilized techniques can be categorized into different groups: community- and clustering-based, random walk-based, graphical model-based, iterative, active learning-based, semi-supervised, and traditional supervised methods.

6 SOCIAL MEDIA USERS LOCATION AND PRIVACY

The location disclosure attack is a specific version of the attribute inference attack in which the adversary focuses on inferring geo-location information for a given user. The attack takes as input some geolocated data and produces additional knowledge about target users. More precisely, the objective may be to (1) predict the movement patterns of an individual, (2) learn the semantics of the target user's mobility behavior, (3) link records of the same individual, or (4) identify points of interest [ 60 ]. Existing works incorporate the known geo-locations of a given user's friends [ 20 , 47 , 90 , 91 , 94 , 125 , 126 , 160 ]. The work of Reference [ 20 ] introduces a probabilistic model representing the likelihood of the target user's location based on her friends' locations and the geographic distances between them. Reference [ 94 ] and Reference [ 126 ] extend Backstrom et al.'s work [ 20 ] and find the target user's friends who are strong predictors of her location.

In another work, McGee et al. [ 125 ] integrate social tie strength information to capture the uncertainty across multiple location granularities. The rationale is that not all relationships in social media are the same, and the locations of friends with strong ties are more revealing of a user's location. Rout et al. [ 160 ] deploy an SVM classifier on a given set of features to predict the target user's location. These features include the cities of the target user's friends, the number of friends in the same city as the target user, and the number of reciprocal relationships the target user has per city. Jurgens et al. [ 90 ] infer locations by proposing an iterative multi-pass label propagation approach. This approach calculates each target user's location as the geometric median of her friends' locations and seeks to overcome the problem of sparse ground-truth data. The work of Reference [ 47 ] extends Reference [ 90 ] and limits the propagation of noisy locations by weighting different locations using information such as the number of times the users have interacted.
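
For concreteness, a simplified one-pass sketch of the geometric-median step in the spirit of Jurgens et al. [ 90 ] (the full approach iterates over multiple passes; coordinates are hypothetical):

```python
import numpy as np

# An unlocated user's position is estimated as the geometric median of her
# friends' known (lat, lon) coordinates, via Weiszfeld's algorithm.
def geometric_median(points, n_iter=100, eps=1e-7):
    pts = np.asarray(points, dtype=float)
    median = pts.mean(axis=0)                  # start from the centroid
    for _ in range(n_iter):
        d = np.linalg.norm(pts - median, axis=1)
        d = np.where(d < eps, eps, d)          # avoid division by zero
        new = (pts / d[:, None]).sum(axis=0) / (1.0 / d).sum()
        if np.linalg.norm(new - median) < eps:
            break
        median = new
    return median

friend_locations = [(40.7, -74.0), (40.8, -73.9), (34.0, -118.2)]
print(geometric_median(friend_locations))  # pulled toward the denser cluster
```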

Another work, from Cheng et al. [ 44 ], proposes a probabilistic framework that infers Twitter users' city-level location based on the content of their tweets. The idea is that users' tweets include either implicit or explicit location-specific content, e.g., place names, or words and phrases more associated with certain locations (e.g., "howdy" for Texas). It uses a lattice-based neighborhood smoothing technique to even out the word probabilities and overcome the tweet sparsity challenge. Hecht et al. [ 72 ] also found that $34\%$ of Twitter users do not provide their real location information, or share fake locations or sarcastic comments to fool location inference approaches. They show that a user's location can nevertheless be inferred using machine learning techniques through the implicit user behavior reflected in her tweets. In another work, Ryoo et al. [ 161 ] refine Cheng et al.'s city-level location inference approach [ 44 ] to 500-m distance bins. Given GPS-tagged tweets for a set of users, their approach builds geographic distributions of words and computes a user's location as a weighted center of mass of the user's words. It then uses a probabilistic model and computes the foci and dispersions by binning the distance between GPS coordinates and each word's center by 500 m for computational scalability.
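
A toy sketch of this content-based inference, with hypothetical per-city word probabilities and a simple smoothing floor standing in for the lattice-based smoothing:

```python
import math

# Sketch in the spirit of Cheng et al. [44]: score each candidate city by
# accumulating log P(word | city) over the user's tweet words.
word_city_prob = {
    "howdy":  {"houston": 0.08, "new_york": 0.001},
    "subway": {"houston": 0.01, "new_york": 0.05},
}

def infer_city(tweet_words, cities):
    scores = {c: 0.0 for c in cities}
    for w in tweet_words:
        for c in cities:
            p = word_city_prob.get(w, {}).get(c, 1e-6)  # smoothing floor
            scores[c] += math.log(p)
    return max(scores, key=scores.get)

print(infer_city(["howdy", "howdy", "subway"], ["houston", "new_york"]))
```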

Li et al. [ 109 ] introduce a unified discriminative influence model that considers both users' social network and user-centric data (e.g., tweets) to address the challenge of scarce and noisy data in location inference. It first combines the social network and user data in a probabilistic framework, viewed as a heterogeneous graph with users and tweets as nodes and social and tweeting relations as edges. Every node in this graph is then associated with a location, and the proposed probabilistic influence model measures how likely an edge is to be generated between two nodes given their locations. A similar work, from Li et al. [ 108 ], exploits a user's tweets and social relations to build a complete location profile that infers a set of long-term geographic location scopes related to her, including not only her home location but also other related places, e.g., her workplace. Their approach captures the user's friends' locations as well.

Srivatsa et al. [ 168 ] propose a de-anonymization attack that exploits a user's friendship information in social media to de-anonymize users' mobility traces. The idea behind this approach is that people tend to meet those with whom they have relationships, and thus they can be identified by their social relationships. This approach models mobility traces as contact graphs and identifies a set of seed users in both graphs, i.e., the contact graph and the social network friendship graph. In the second step, it propagates the mapping from seed users to the remaining users in the graphs. This approach uses Distance Vector, Randomized Spanning Trees, and Recursive Subgraph Matching heuristics to measure the mapping strength and propagate the measured strength through the network.

Another work, from Ji et al. [ 85 ], improves on the work of Srivatsa et al. [ 168 ] in terms of accuracy and computational complexity. This work focuses on mapping anonymized users' mobility traces to social media accounts. In addition to users' local features, their approach incorporates users' global characteristics as well. Ji et al. define three similarity metrics: structural similarity, relative distance similarity, and inheritance similarity. These similarities are then combined into a unified similarity. Structural similarity considers features such as degree centrality, closeness centrality, and betweenness centrality, while relative distance similarity captures the distance between users and seed users. Inheritance similarity considers the number of common neighbors that have been mapped, as well as the degree similarity between users in the mobility-trace and social media network graphs. Next, Ji et al. [ 85 ] propose an adaptive de-anonymization framework that adaptively starts de-anonymizing from a core matching set that consists of a number of mapped users and their $k$ -hop mapping spanning set.
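
A simplified sketch of the seed-and-propagate paradigm shared by these attacks, matching unmapped trace nodes to social nodes by overlap with already-mapped neighbors (a stand-in for the papers' richer similarity heuristics; the graphs are hypothetical):

```python
import networkx as nx

# Starting from seed mappings, each unmapped node in the mobility-trace
# graph is matched to the unused social-network node that shares the most
# already-mapped neighbors with it.
def propagate(trace_g, social_g, seed_map):
    mapping = dict(seed_map)
    changed = True
    while changed:
        changed = False
        for u in trace_g.nodes():
            if u in mapping:
                continue
            mapped_nbrs = {mapping[n] for n in trace_g.neighbors(u) if n in mapping}
            if not mapped_nbrs:
                continue
            used = set(mapping.values())
            best = max(
                (v for v in social_g.nodes() if v not in used),
                key=lambda v: len(mapped_nbrs & set(social_g.neighbors(v))),
                default=None,
            )
            if best is not None and mapped_nbrs & set(social_g.neighbors(best)):
                mapping[u] = best
                changed = True
    return mapping

trace = nx.Graph([("t1", "t2"), ("t2", "t3")])
social = nx.Graph([("A", "B"), ("B", "C")])
print(propagate(trace, social, {"t1": "A"}))  # {'t1': 'A', 't2': 'B', 't3': 'C'}
```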

In another work, Reference [ 123 ], the locations of Twitter users are inferred at different granularities (e.g., city, state, time zone, geographical region) based on their tweeting behavior (frequency of tweets per time unit) and the content of their tweets. This approach exploits external location knowledge (e.g., a dictionary containing names of cities and states, and location-based services such as Foursquare) to find explicit references to locations in tweets. All features are then fed into a dynamically weighted ensemble of statistical and heuristic classifiers.

Another work, from Wang et al. [ 177 ], links users' identities across multiple services/social media platforms (even of different types) according to the spatial-temporal locality of their activities, i.e., users' mobility traces. This work also assumes that individuals can have multiple IDs/accounts. The motivation behind their algorithm is that IDs corresponding to the same person are online at the same time in the same location, and users' daily movement is predictable, with repeated patterns. Wang et al. model user information as a contact graph where nodes are IDs (regardless of the service) and an edge connects IDs that have visited the same location. The weight of an edge represents the number of co-locations of the two nodes. A Bayesian matching algorithm is then proposed to find the most probable matching candidates for a given target ID, and a Bayesian inference method generates confidence scores for ranking the candidates.
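
A simplified sketch of the contact-graph construction (visits, services, and time bins are hypothetical; the full method additionally performs Bayesian matching and confidence scoring):

```python
from collections import defaultdict
from itertools import combinations

# In the spirit of Wang et al. [177]: IDs from any service become nodes,
# and an edge weight counts the (location, time-slot) bins two IDs share.
visits = {
    "svcA:u1": {("mall", 9), ("cafe", 18)},
    "svcB:x7": {("mall", 9), ("cafe", 18)},
    "svcB:y2": {("gym", 7)},
}

colocation = defaultdict(int)
for a, b in combinations(visits, 2):
    shared = len(visits[a] & visits[b])
    if shared:
        colocation[(a, b)] = shared

def rank_candidates(target):
    """Rank other IDs by co-location count with the target ID."""
    scores = {b if a == target else a: w
              for (a, b), w in colocation.items() if target in (a, b)}
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(rank_candidates("svcA:u1"))  # [('svcB:x7', 2)]
```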

The work of Reference [ 91 ] compares different approaches to location inference attacks in social networks. There are also other surveys discussing location inference techniques specifically in Twitter [ 7 , 194 ], to which the reader can refer. Note that a large portion of research is dedicated to inference attacks on geolocated data, which is out of the scope of this survey [ 60 , 112 , 167 ]. A thorough survey is also available discussing geolocation data privacy, to which interested readers can refer [ 112 ]. Note that the scope of that survey differs from ours, in which we cover the location privacy issues of users based on their activities in social media.

In conclusion, location inference attacks use three types of information: (1) a user's network information, (2) a user's contextual information, and (3) both network and contextual information. A summary of the existing works is presented in Table 8 based on the type of leveraged information and the technique used.

7 RECOMMENDATION SYSTEMS AND PRIVACY

Recommendation systems help individuals find information that matches their interests by building user-interest profiles and recommending items to users based on those profiles. These profiles can be extracted from users' interactions as they express their preferences and interests, e.g., clicks, likes/dislikes, ratings, purchases, and so on [ 25 ]. While user profiles help recommender systems improve the quality of the services a user receives (a.k.a. utility), they also raise privacy concerns by reflecting the preferences of users [ 155 ]. Many works have studied the relationship between privacy and utility and have proposed solutions to handle the tradeoff. In general, these works focus on obfuscating users' interactions to hide their actual intentions and prevent accurate profiling [ 153 , 157 ]. Following this strategy, users need not trust any third parties or external entities to preserve their privacy. Existing approaches use different techniques and mechanisms and fall mainly into three categories: cryptographic techniques [ 6 , 21 , 40 , 74 , 172 ], differential privacy-based approaches [ 66 , 76 , 89 , 120 , 128 , 130 , 166 , 197 , 198 ], and perturbation-based techniques [ 75 , 118 , 146 , 147 , 148 , 151 , 153 , 158 , 183 ].

A group of works focuses on providing cryptographic solutions to the problem of secure recommender systems. These approaches do not let a single trusted party have access to everyone's data [ 6 , 21 , 40 , 74 , 172 ]. Instead, users' ratings are stored as encrypted vectors, and aggregates of the data are provided in the public domain. These approaches do not prevent privacy leaks through the output of recommendation systems (i.e., the recommendations themselves). Such techniques are not within the scope of this survey; interested readers can refer to the cited papers for more details.

7.1 Differential Privacy-based Solutions

Works in this group utilize a differential privacy strategy to either anonymize user data before sending it to the recommendation system or perturb the recommendation outputs. McSherry et al. [ 128 ] were the first to modify leading recommendation algorithms (i.e., SVD and $k$ -nearest neighbor) so that drawing inferences about original ratings is difficult. They utilize differential privacy to construct private covariance matrices and make the collaborative filtering algorithms that use them private without a significant loss in accuracy.
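
For concreteness, a minimal sketch of the covariance-perturbation idea (not the authors' exact algorithm: sensitivity handling and rating preprocessing are simplified, and the data is hypothetical):

```python
import numpy as np

# Add calibrated Laplace noise to the item-item covariance matrix before
# handing it to a collaborative filtering algorithm, in the spirit of
# McSherry et al. [128].
def private_covariance(ratings, epsilon, sensitivity=1.0, seed=0):
    """ratings: users x items matrix; returns a noisy covariance matrix."""
    rng = np.random.default_rng(seed)
    cov = np.cov(ratings, rowvar=False)
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=cov.shape)
    noise = (noise + noise.T) / 2.0  # keep the noisy matrix symmetric
    return cov + noise

ratings = np.array([[5, 3, 0], [4, 0, 1], [1, 5, 4]], dtype=float)
print(private_covariance(ratings, epsilon=1.0))
```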

In another work, Calandrino et al. [ 39 ] propose a new passive attack on recommender systems to infer a target user's transactions (i.e., item ratings). Their attack first monitors changes in the public outputs of a recommender system over a period of time. Public outputs may include related-items lists or an item-item covariance matrix. It then combines this information with a moderate amount of auxiliary information about the target user's transactions to infer many of the target user's unknown transactions. Calandrino et al. further introduce an active inference attack on $k$ -NN recommender systems. In this attack, $k$ sybil user accounts are created such that the $k$ nearest neighbors of each sybil consist of the $k-1$ other sybils and the target user. The attack can then infer the target user's transaction history from the items recommended to any of the sybils. The results confirm the existence of privacy risks in the public outputs of recommender systems. The work of McSherry et al. [ 128 ] is not effective in protecting users against this attack, as it does not consider updates to the covariance matrices and cannot provide a privacy guarantee in dynamic settings. Machanavajjhala et al. [ 120 ] then quantify the accuracy-privacy tradeoff. In particular, they prove lower bounds on the minimum loss in accuracy for recommendation systems that utilize differential privacy. Moreover, they adapt two differentially private algorithms, Laplace [ 53 ] and Exponential [ 129 ], to prevent disclosure of users' private attributes.

Previous works [ 120 , 128 ] are vulnerable to the $k$ -nearest neighbor attack, as they fail to hide similar neighbors [ 39 ]. Zhu et al. [ 197 ] therefore propose a private neighborhood-based collaborative filtering method that protects the information of both neighbors and user ratings. The proposed work assumes that the recommender system is trusted and introduces two operations: private neighbor selection and recommendation-aware sensitivity. The first operation seeks to protect neighbors' identities by privately selecting $k$ neighbors from a list of candidates, adopting the exponential mechanism [ 129 ] to assign a probability to each candidate. The second operation enhances utility by reducing the magnitude of added noise: after the $k$ neighbors are selected, the similarity of neighbors is perturbed by adding Laplace noise to mask the ratings given by a certain neighbor. Finally, neighborhood-based collaborative filtering is performed on the private data. In another work, Jorgensen et al. [ 89 ] assume that all users' item-rating attributes are sensitive. However, different from Machanavajjhala et al. [ 120 ], they assume that users' social relations are non-sensitive. They propose a differentially private recommendation method that incorporates social relations besides user-item ratings. To address the utility loss, this work first clusters users according to their social relations. Then, noisy averages of the user-item preferences are computed for each cluster using the differential privacy mechanism.
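
To illustrate the private neighbor selection step, a minimal sketch of the exponential mechanism [ 129 ] over hypothetical candidate similarities (selection is with replacement here for simplicity; Zhu et al.'s full protocol also calibrates recommendation-aware sensitivity):

```python
import numpy as np

# Candidates with higher similarity to the target user are exponentially
# more likely to be picked, yet any candidate can be chosen.
def exponential_mechanism(candidates, scores, epsilon, sensitivity=1.0, seed=0):
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    weights = np.exp(epsilon * scores / (2 * sensitivity))
    probs = weights / weights.sum()
    return rng.choice(candidates, p=probs)

candidates = ["u1", "u2", "u3", "u4"]
similarity = [0.9, 0.7, 0.3, 0.1]  # hypothetical similarities to the target
neighbors = [exponential_mechanism(candidates, similarity, epsilon=2.0, seed=s)
             for s in range(3)]
print(neighbors)
```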

Shen et al. [ 166 ] assume that the recommender system is untrusted. They propose a user perturbation framework that anonymizes user data under a novel mechanism for differential privacy: the relaxed admissible mechanism. The users' perturbed data is then used for recommendation. They provide mathematical bounds on the privacy and utility of the anonymized data. Hua et al. [ 76 ] also propose a differentially private matrix factorization-based recommender system. In particular, they solve this problem for two scenarios, a trusted recommender and an untrusted recommender. In the first scenario, user and item profile vectors are learned via regular and private versions of matrix factorization, respectively; the private version adds noise to item vectors to make them differentially private. In the second scenario, item profile vectors are first learned in a differentially private manner via the private matrix factorization problem, and a user's differentially private profile vector is then derived from the private item profiles. A novel and strong form of differential privacy, namely distance-based differential privacy, has been introduced by Guerraoui et al. [ 66 ]. Distance-based differential privacy ensures privacy for all the items rated by a user and those within a distance $\lambda$ from them. The distance parameter $\lambda$ controls the level of privacy and aids in tuning the recommendation privacy-utility tradeoff. The proposed protocol first finds a group of similar items for each given item. Then, it creates a manipulated user profile that preserves $(\epsilon , \lambda)$ -differential privacy by selecting an item and replacing it with another one.

Another differential privacy-based recommendation work, by Zhu et al. [ 198 ], proposes two approaches to the privacy problem in recommendation systems: item-based and user-based recommendation algorithms. In the item-based approach, the exponential mechanism [ 129 ] is applied to the selection of related items to guarantee differential privacy. The resulting differentially private item list is then used to generate recommendations for a given user. A similar private procedure is applied in the user-based recommendation algorithm. Another work differentiates sensitive and non-sensitive ratings to further improve the quality of recommendation systems in the long run [ 130 ]. Meng et al. [ 130 ] propose a personalized privacy-preserving recommender system. Given sets of sensitive and non-sensitive ratings for each user, their approach utilizes differential privacy [ 52 ] to perturb users' ratings. Smaller and larger privacy budgets are assigned to sensitive and non-sensitive ratings, respectively. This protects users' privacy while retaining recommendation effectiveness.

7.2 Perturbation-based Solutions

Perturbation-based techniques usually obfuscate users' item ratings by adding random noise to the user data. Rebollo et al. [ 158 ] propose an approach that first measures a user's privacy risk as the KL divergence [ 48 ] between the user's apparent profile and the average population distribution. The idea is that the more a user's profile diverges from the general population, the more information an attacker can learn about her. It then seeks the obfuscation rate for generating forged user profiles such that the privacy risk is minimized. A closed-form solution is also provided for perturbing users' interactions to optimize the privacy-risk function.
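
A minimal sketch of this privacy-risk measure, with hypothetical interest profiles:

```python
import numpy as np

# Privacy risk as in Rebollo et al. [158]: KL divergence between a user's
# apparent interest profile and the population's average profile (both
# distributions over the same interest categories).
def privacy_risk(user_profile, population_profile, eps=1e-12):
    p = np.asarray(user_profile, dtype=float)
    q = np.asarray(population_profile, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

population = [0.25, 0.25, 0.25, 0.25]
typical    = [0.30, 0.20, 0.25, 0.25]
atypical   = [0.80, 0.10, 0.05, 0.05]
print(privacy_risk(typical, population))   # small risk (~0.01): blends in
print(privacy_risk(atypical, population))  # large risk (~0.68): stands out
```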

Puglisi et al. [ 153 ] further extend Rebollo et al.'s work [ 158 ] to investigate the impact of this technique on content-based recommendation. This work measures a user's privacy risk similarly to the approach proposed in Reference [ 158 ]. Utility is measured by the prediction accuracy of the recommender system. The work evaluates three different strategies: optimized tag forgery [ 157 ], uniform tag forgery, and TrackMeNot (TMN) [ 75 ]. The uniform tag forgery method assigns forged tags according to a uniform distribution across all categories of the user profile. TMN constructs eleven categories from the Open Directory Project (ODP) classification scheme and selects tags uniformly from this set. According to this work, when larger obfuscation rates are used, users' profiles tend to mimic the population distribution, which lowers the privacy risk but also reduces utility. Moreover, the authors found that for a small forgery rate, it is possible to obtain an increase in privacy at the cost of a small degradation of utility.

Polat et al. [ 151 ] use a randomized perturbation technique [ 5 ] to obfuscate user-generated data. Each user generates a disguised z-score for each item she has rated. The z-score for each user-item pair is based on the original item rating, the user's average rating, and the total number of items she has rated. The proposed approach then passes the perturbed private data to the collaborative filtering-based recommender system. Another work, Reference [ 146 ], obfuscates user rating information and then passes the disguised information to the collaborative filtering system for recommendation. The proposed Nearest Neighbor Data Substitution (NeNDS) obfuscation method substitutes a user's data elements with those of one of her neighbors in the metric space [ 145 ]. However, one drawback of NeNDS is that the perturbed value can be close enough to the original value to leave the data vulnerable. A hybrid version of NeNDS is therefore proposed that provides stronger privacy by geometrically transforming the data before passing it to NeNDS.
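
A minimal sketch of the disguised z-score computation, with an illustrative (not the paper's) noise parameterization:

```python
import random
import statistics

# In the spirit of Polat et al. [151]: each user z-scores her own ratings
# locally and adds random noise before sending them to the server.
def disguised_zscores(ratings, noise_range=0.5, seed=0):
    rng = random.Random(seed)
    mean = statistics.mean(ratings)
    std = statistics.pstdev(ratings) or 1.0  # guard against zero variance
    return [(r - mean) / std + rng.uniform(-noise_range, noise_range)
            for r in ratings]

print(disguised_zscores([5, 3, 4, 1]))  # perturbed z-scores, one per item
```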

In contrast to McSherry et al. [ 128 ], Xin et al. [ 183 ] assume that the recommender is not trusted and the onus is on the users to protect their privacy. Their approach separates the computations that can be done by the users locally and privately from those that must be done by the recommender system. In particular, item features are learned by the system, while user features are obtained locally by the users and further used for recommendation. Their approach also divides users into two groups: users who publicly share their information, and those who keep their preferences private. It then uses information from users in the first group to estimate item features. Xin et al. show theoretically and empirically that the public information of a moderate number of users with a high number of ratings is enough for an accurate estimation. Moreover, they propose a new privacy mechanism that privately releases the second-order information needed for estimating item features; this information is extracted from users who keep their preferences private. The main assumption behind this work is not realistic, though, as in a real-world scenario it is not easy to collect ratings from a moderate number of people with a high number of ratings.

Luo et al. [ 118 ] propose a perturbation-based group recommendation method that assumes that similar users are grouped together and are not willing to expose their preferences to anybody other than the group members. Items are then recommended to users within the same group. In the first step, users exchange their rating data with other users in the same group given a secret key; this key varies across users. The output of this step is a fake preference vector for each user. The rating values are then obfuscated in the second step by a chaos-based scrambling method. Similarly to Polat et al. [ 151 ], randomness is added to the output of the previous step to make sure no sensitive information remains in the published data. This information is then sent to the recommender system, which iteratively extracts information about the aggregated ratings of the users. The extracted information is then used to estimate a group preference vector for collaborative filtering-based recommendation.

Parra-Arnau et al. [ 148 ] propose a privacy-enhancing technology framework, PET, which perturbs users' preference information by combining two techniques, namely the forgery and the suppression of ratings. In this scenario, users may avoid rating items they like and instead rate items that do not reflect their actual preferences. Therefore, the apparent profile of a user will differ from her actual profile. Similarly to Reference [ 158 ], the privacy risk of each user is measured as the KL divergence [ 48 ] between the user's apparent profile and the average population distribution. Utility is controlled via the forgery and suppression rates. The tradeoff among privacy, forgery rate, and suppression rate is then modeled as an optimization problem to infer which of each user's ratings should be forged and which should be suppressed to achieve the minimum privacy risk while keeping the utility of the data as high as possible. Similarly, Parra-Arnau et al. [ 147 ] propose a system that perturbs a user's rating profile according to her privacy preferences. The system has two components: (1) a profile-density model in which the user's profile is made more similar to the crowd's, and (2) a classification model in which the user is prevented from being identified as a member of a given group of users. The proposed model optimizes the tradeoff between privacy and utility and decides whether each service provider can have access to the user's profile or not.
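
As a toy illustration of the forgery-and-suppression idea in Reference [ 148 ] (a simplified mixture, not the paper's exact model or optimization; profiles and rates are hypothetical):

```python
import numpy as np

# The apparent profile mixes the genuine profile (partially suppressed)
# with forged ratings drawn from the population distribution, then
# renormalizes so it remains a probability distribution.
def apparent_profile(genuine, population, forgery_rate, suppression_rate):
    g = np.asarray(genuine, dtype=float)
    p = np.asarray(population, dtype=float)
    mixed = (1 - suppression_rate) * g + forgery_rate * p
    return mixed / mixed.sum()

genuine = [0.7, 0.2, 0.1]
population = [0.33, 0.33, 0.34]
# Higher forgery/suppression rates pull the profile toward the population.
print(apparent_profile(genuine, population, forgery_rate=0.3, suppression_rate=0.2))
```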

Recently, Biega et al. [ 33 ] proposed a framework that scrambles users' rating histories to preserve both their privacy and utility. The main assumption of this work is that recommender systems do not need complete and accurate user profiles. It therefore splits users' profiles (i.e., pairs of user-item interactions) across Mediator Accounts (MAs) such that coherent pieces of different users' profiles are kept intact in the MAs. The recommender then deals with the MAs rather than real user profiles. This preserves users' privacy by scrambling the user data across various proxy accounts while keeping user utility as high as possible. Another work, from Guerraoui et al. [ 67 ], introduces metrics for measuring the utility and privacy effects of a user's behaviors, such as clicks and likes/dislikes. It then shows that there is not always a tradeoff between utility and privacy. This work also proposes a click-advisor platform that warns users about the privacy and utility status of their clicks. Here, utility is defined as the difference in the commonality of a user's profile before and after a click; the privacy risk of a click is accordingly defined as the difference in the disclosure degree before and after the click.

Last, we summarize the reviewed state-of-the-art works in Table 9 . Columns show the properties of the proposed models. These works protect user-item data (1) before sharing, (2) while processing the data, or (3) after recommending items to the user.

8 SUMMARY AND FUTURE RESEARCH DIRECTIONS

The explosive growth of the Web has not only drastically changed the way people conduct activities and acquire information but has also raised security [ 8 , 14 , 15 ] and privacy [ 26 , 139 ] issues for them. Users are increasingly sharing their personal information on social media platforms. These platforms publish and share user-generated data with third parties, which risks exposing individuals' privacy. There are two general types of attacks: identity disclosure and attribute disclosure. Sanitizing user-generated social media data is more challenging than sanitizing structured data, as it is heterogeneous, highly unstructured, noisy, and inherently different from relational and tabular data. In this survey, we review the recent developments in the field of social media data privacy. We first review traditional privacy models for structured data. Then, we review, categorize, and compare existing methods in terms of privacy models, privacy leakage attacks, and anonymization algorithms. We also review privacy risks that exist in different aspects of social media, such as users' graph information, profile attributes, textual information, and preferences. We categorize relevant works into five groups: (1) social graphs and privacy, (2) authors in social media and privacy, (3) profile attributes and privacy, (4) location and privacy, and (5) recommendation systems and privacy. For each category, we discuss existing attacks and solutions (if any were proposed) and classify them based on the type of data and the technique used. We outline the privacy attacks/solutions in Figure 1. Figure 2 also depicts the relevant privacy issues w.r.t. the type of social media data.

Fig. 1. Outline of the privacy attacks and solutions in social media.

Detecting privacy issues and proposing techniques to protect users' privacy in social media is challenging. Most existing works focus on introducing new attacks, and thus the gap between protection and detection grows larger. Although a large body of work has emerged in recent years investigating privacy issues for social media data, the development of tasks in each category is highly imbalanced: some are well studied, whereas others need further investigation. We highlight these tasks in red in Figure 1 and Figure 2, based on privacy issues and user-generated data type, respectively. Below, we present some potential research directions:

  • Protecting privacy of textual information : Textual information is noisy, high-dimensional, and unstructured. It is rich in content and can reveal much sensitive information that the user did not originally expose, such as demographic information and location. This makes textual data a very important source of information for adversaries, one that can be exploited in many attacks. We thus need more research on anonymizing users' textual information to preserve users' privacy against various attacks such as author identification and profile attribute disclosure.
  • Protecting privacy of profile attribute information : We reviewed many works that introduce privacy risks w.r.t. profile attributes. To the best of our knowledge, there is no work on defense mechanisms against these attacks. One research direction is a privacy-preserving tool that warns users about activities that risk privacy leakage. Another direction is a privacy protection technique that anonymizes users' data before publication to protect them against private attribute leakage.
  • Privacy of spatiotemporal social media data : Social media platforms support space-time-indexed data, and users have created a large volume of time-stamped, geo-located data. Such spatiotemporal data has immense value for better understanding users' behavior. In this survey, we review the state-of-the-art re-identification attacks that incorporate this data to breach users' privacy. This information may be used to infer users' locations as well as their preferences and interests in the case of recommendation systems. One future research direction is investigating the role of temporal information in the privacy of online users. More research should be done to build anonymization frameworks for protecting users' temporal information.
  • Privacy of heterogeneous social media data : User-generated social media data is heterogeneous and consists of different aspects. Existing anonymization techniques assume that it is enough to anonymize each aspect of heterogeneous social media data independently. Beigi et al. [ 28 ] show that this assumption does not hold in practice due to the hidden relations between different aspects of the heterogeneous data. One potential research direction is to examine how different combinations of heterogeneous data (e.g., a combination of location and textual information) are vulnerable to de-anonymization attacks. Another potential direction is to improve anonymization techniques by considering the hidden relations between different aspects of the data.

ACKNOWLEDGMENTS

The authors thank Alexander Nou for his help throughout the article.

  • Jemal H. Abawajy, Mohd Izuan Hafez Ninggal, and Tutut Herawan. 2016. Privacy preserving social network data publication. IEEE Commun. Surv. Tutor. 18, 3 (2016), 1974–1997.
  • Ahmed Abbasi and Hsinchun Chen. 2008. Writeprints: A stylometric approach to identity-level identification and similarity detection in cyberspace. ACM Trans. Inf. Syst. 26, 2 (2008), 7.
  • Sadia Afroz, Michael Brennan, and Rachel Greenstadt. 2012. Detecting hoaxes, frauds, and deception in writing style online. In Proceedings of the 2012 IEEE Symposium on Security and Privacy (SP’12) . IEEE, 461–475.
  • Gagan Aggarwal, Tomas Feder, Krishnaram Kenthapadi, Rajeev Motwani, Rina Panigrahy, Dilys Thomas, and An Zhu. 2005. Approximation algorithms for k-anonymity. In Proceedings of the International Conference on Database Theory (ICDT'05).
  • Rakesh Agrawal and Ramakrishnan Srikant. 2000. Privacy-preserving data mining. In ACM SIGMOD Record , Vol. 29.
  • Esma Aimeur, Gilles Brassard, Jose M. Fernandez, Flavien Serge Mani Onana, and Zbigniew Rakowski. 2008. Experimental demonstration of a hybrid privacy-preserving recommender system. In Availability, Reliability and Security .
  • Oluwaseun Ajao, Jun Hong, and Weiru Liu. 2015. A survey of location inference techniques on Twitter. J. Inf. Sci. 41, 6 (2015), 855–864.
  • Muhammad Al-Qurishi, Mabrook Al-Rakhami, Atif Alamri, Majed Alrubaian, Sk Md Mizanur Rahman, and M. Shamim Hossain. 2017. Sybil defense techniques in online social networks: A survey. IEEE Access 5 (2017), 1200–1219.
  • Faiyaz Al Zamal, Wendy Liu, and Derek Ruths. 2012. Homophily and latent attribute inference: Inferring latent attributes of twitter users from neighbors. In Sixth International AAAI Conference on Weblogs and Social Media (ICWSM'12) .
  • Mishari Almishari and Gene Tsudik. 2012. Exploring linkability of user reviews. In Proceedings of the European Symposium on Research in Computer Security . Springer, 307–324.
  • Yasmeen Alufaisan, Yan Zhou, Murat Kantarcioglu, and Bhavani Thuraisingham. 2017. Hacking social network data mining. In Proceedings of the 2017 IEEE International Conference on Intelligence and Security Informatics (ISI’17) . IEEE, 54–59.
  • Hamidreza Alvari, Alireza Hajibagheri, Gita Sukthankar, and Kiran Lakkaraju. 2016. Identifying community structures in dynamic networks. Soc. Netw. Anal. Min. 6, 1 (2016), 77.
  • Hamidreza Alvari, Kiran Lakkaraju, Gita Sukthankar, and Jon Whetzel. 2014. Predicting guild membership in massively multiplayer online games. In Proceedings of the International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction . Springer, 215–222.
  • Hamidreza Alvari, Elham Shaabani, and Paulo Shakarian. 2018. Early identification of pathogenic social media accounts. In Proceedings of the IEEE Intelligence and Security Informatics (ISI’18) . IEEE.
  • Hamidreza Alvari, Paulo Shakarian, and J. E. Kelly Snyder. 2017. Semi-supervised learning for detecting human trafficking. Secur. Inf. 6, 1 (2017), 1.
  • Athanasios Andreou, Oana Goga, and Patrick Loiseau. 2017. Identity vs. attribute disclosure risks for users with multiple social profiles. In Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM’17) . ACM, 163–170.
  • Michael Backes, Pascal Berrang, Oana Goga, Krishna P. Gummadi, and Praveen Manoharan. 2016. On profile linkability despite anonymity in social media systems. In Proceedings of the ACM on Workshop on Privacy in the Electronic Society .
  • Michael Backes, Mathias Humbert, Jun Pang, and Yang Zhang. 2017. walk2friends: Inferring social links from mobility profiles. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security .
  • Lars Backstrom, Cynthia Dwork, and Jon Kleinberg. 2007. Wherefore art thou r3579x?: Anonymized social networks, hidden patterns, and structural steganography. In Proceedings of the 16th International Conference on the World Wide Web (WWW’07) .
  • Lars Backstrom, Eric Sun, and Cameron Marlow. 2010. Find me if you can: Improving geographical prediction with social and spatial proximity. In Proceedings of the 19th International Conference on the World Wide Web (WWW’10) .
  • Shahriar Badsha, Xun Yi, Ibrahim Khalil, and Elisa Bertino. 2017. Privacy preserving user-based recommender system. In Proceedings of the 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS’17) . IEEE, 1074–1083.
  • Ghazaleh Beigi. 2018. Social media and user privacy. arXiv preprint arXiv:1806.09786 (2018).
  • Ghazaleh Beigi, Ruocheng Guo, Alexander Nou, Yanchao Zhang, and Huan Liu. 2019. Protecting user privacy: An approach for untraceable web browsing history and unambiguous user profiles. In Proceedings of the 12th ACM International Conference on Web Search and Data Mining . ACM, 213–221.
  • Ghazaleh Beigi, Mahdi Jalili, Hamidreza Alvari, and Gita Sukthankar. 2014. Leveraging community detection for accurate trust prediction. In Proceedings of the ASE International Conference on Social Computing .
  • Ghazaleh Beigi and Huan Liu. 2018. Similar but different: Exploiting users’ congruity for recommendation systems. In Proceedings of the International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction . Springer.
  • Ghazaleh Beigi and Huan Liu. 2019. Identifying novel privacy issues of online users on social media platforms by Ghazaleh Beigi and Huan Liu with Martin Vesely as coordinator. ACM SIGWEB Newslett. Article 4 (Winter, 2019), 7 pages. http://doi.acm.org/10.1145/3293874.3293878
  • Ghazaleh Beigi, Suhas Ranganath, and Huan Liu. 2019. Signed link prediction with sparse data: The role of personality information. In Companion Proceedings of the Web Conference 2019 . International World Wide Web Conferences Steering Committee.
  • Ghazaleh Beigi, Kai Shu, Yanchao Zhang, and Huan Liu. 2018. Securing social media user data: An adversarial approach. In Proceedings of the 29th Conference on Hypertext and Social Media . ACM, 165–173.
  • Ghazaleh Beigi, Jiliang Tang, and Huan Liu. 2016. Signed link analysis in social media networks. In Proceedings of the 10th International Conference on Web and Social Media (ICWSM’16) . AAAI Press.
  • Ghazaleh Beigi, Jiliang Tang, Suhang Wang, and Huan Liu. 2016. Exploiting emotional information for trust/distrust prediction. In Proceedings of the 2016 SIAM International Conference on Data Mining . SIAM, 81–89.
  • Smriti Bhagat, Graham Cormode, Balachander Krishnamurthy, and Divesh Srivastava. 2009. Class-based graph anonymization for social network data. Proc. VLDB Endow. 2, 1 (2009), 766–777.
  • Smriti Bhagat, Udi Weinsberg, Stratis Ioannidis, and Nina Taft. 2014. Recommending with an agenda: Active learning of private attributes using matrix factorization. In Proceedings of the Recommender Systems Conference (RecSys’14) . ACM.
  • Asia J. Biega, Rishiraj Saha Roy, and Gerhard Weikum. 2017. Privacy through solidarity: A user-utility-preserving framework to counter profiling. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval . ACM, 665–674.
  • David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent dirichlet allocation. In Proceedings of Machine Learning Research (JMLR’03) .
  • Joseph Bonneau, Jonathan Anderson, and George Danezis. 2009. Prying data out of a social network. In Proceedings of the International Conference on Advances in Social Network Analysis and Mining 2009 (ASONAM’09) . IEEE, 249–254.
  • Jasmine Bowers, Henry Williams, Gerry Dozier, and R. Williams. 2015. Mitigating deanonymization attacks via language translation for anonymous social networks. In Proceedings of the International Conference on Machine Learning (ICML’15) (2015).
  • Joseph K. Bradley, Patrick Gage Kelley, and Aaron Roth. [n.d.]. Author identification from citations. ([n. d.]).
  • Karl Bringmann, Tobias Friedrich, and Anton Krohmer. 2014. De-anonymization of heterogeneous random graphs in quasilinear time. In Proceedings of the European Symposium on Algorithms . Springer, 197–208.
  • Joseph A. Calandrino, Ann Kilzer, Arvind Narayanan, Edward W. Felten, and Vitaly Shmatikov. 2011. "You might also like": Privacy risks of collaborative filtering. In Proceedings of the Conference on Security and Privacy (SP’11) . IEEE.
  • John Canny. 2002. Collaborative filtering with privacy via factor analysis. In Proceedings of the SIGIR Conference on Research and Development in Information Retrieval . ACM, 238–245.
  • Abdelberi Chaabane, Gergely Acs, Mohamed Ali Kaafar, et al. 2012. You are what you like! information leakage through users’ interests. In Proceedings of the 19th Annual Network & Distributed System Security Symposium (NDSS) .
  • Carole E. Chaski. 2005. Who is at the keyboard? Authorship attribution in digital evidence investigations. Int. J. Digit. Evid. 4, 1 (2005), 1–13.
  • James Cheng, Ada Wai-chee Fu, and Jia Liu. 2010. K-isomorphism: Privacy preserving network publication against structural attacks. In Proceedings of the ACM SIGMOD International Conference on Management of Data .
  • Zhiyuan Cheng, James Caverlee, and Kyumin Lee. 2010. You are where you tweet: A content-based approach to geo-locating twitter users. In Proceedings of the Conference on Information and Knowledge Management (CIKM’10) . ACM, 759–768.
  • Carla-Fabiana Chiasserini, Michele Garetto, and Emilio Leonardi. 2016. Social network de-anonymization under scale-free user relations. IEEE/ACM Trans. Netw. 24, 6 (2016), 3756–3769.
  • Carla-Fabiana Chiasserini, Michel Garetto, and Emili Leonardi. 2018. De-anonymizing clustered social networks by percolation graph matching. ACM Trans. Knowl. Discov. Data 12, 2 (2018), 21.
  • Ryan Compton, David Jurgens, and David Allen. 2014. Geotagging one hundred million twitter accounts with total variation minimization. In Proceedings of the 2014 IEEE International Conference on Big Data (Big Data’14) . IEEE, 393–401.
  • Thomas M. Cover and Joy A. Thomas. 2012. Elements of Information Theory . John Wiley & Sons.
  • Ratan Dey, Cong Tang, Keith Ross, and Nitesh Saxena. 2012. Estimating age privacy leakage in online social networks. In Proceedings of the 2012 Proceedings IEEE International Conference on Computer Communications (INFOCOM’12) . IEEE, 2836–2840.
  • Xenofontas Dimitropoulos, Dmitri Krioukov, Amin Vahdat, and George Riley. 2009. Graph annotations in modeling complex network topologies. ACM Trans. Model. Comput. Simul. 19, 4 (2009), 17.
  • George T. Duncan and Diane Lambert. 1986. Disclosure-limited data dissemination. J. Am. Stat. Assoc. 81, 393 (1986), 10–18.
  • Cynthia Dwork. 2008. Differential privacy: A survey of results. In Proceedings of the International Conference on Theory and Applications of Models of Computation . Springer, 1–19.
  • Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. 2006. Calibrating noise to sensitivity in private data analysis. In Proceedings of the Theory of Cryptography Conference . Springer, 265–284.
  • Alexandre Evfimievski, Ramakrishnan Srikant, Rakesh Agrawal, and Johannes Gehrke. 2004. Privacy preserving mining of association rules. Inf. Syst. 29, 4 (2004), 343–364.
  • Carla Fabiana, Michele Garetto, and Emilio Leonardi. 2015. De-anonymizing scale-free social networks by percolation graph matching. In Proceedings of the 2015 IEEE International Conference on Computer Communications (INFOCOM’15) . IEEE, 1571–1579.
  • Hao Fu, Aston Zhang, and Xing Xie. 2014. De-anonymizing social graphs via node similarity. In Proceedings of the Annual Conference on the World Wide Web (WWW’14) .
  • Hao Fu, Aston Zhang, and Xing Xie. 2015. Effective social graph deanonymization based on graph structure and descriptive information. ACM Trans. Intell. Syst. Technol. 6, 4 (2015), 49.
  • Xinzhe Fu, Zhongzhao Hu, Zhiying Xu, Luoyi Fu, and Xinbing Wang. 2017. De-anonymization of networks with communities: When quantifications meet algorithms. In Proceedings of the IEEE Global Communications Conference .
  • Benjamin C. M. Fung, K. Wang, R. Chen, and S. Yu Philip. 2010. Privacy-preserving data publishing: A survey on recent developments. ACM Comput. Surv. 42, 4 (2010), 1–53.
  • Sébastien Gambs, Marc-Olivier Killijian, and Miguel Núñez del Prado Cortez. 2010. Show me how you move and i will tell you who you are. In Proceedings of the SIGSPATIAL International Workshop on Security and Privacy in GIS and LBS .
  • Daniel Gayo Avello. 2011. All liaisons are dangerous when all your friends are known to us. In Proceedings of the 22nd ACM Conference on Hypertext and Hypermedia . ACM, 171–180.
  • Oana Goga, Howard Lei, Sree Hari Krishnan Parthasarathi, Gerald Friedland, Robin Sommer, and Renata Teixeira. 2013. Exploiting innocuous activity for correlating users across sites. In Proceedings of the Annual Conference on the World Wide Web (WWW’13) .
  • Neil Zhenqiang Gong and Bin Liu. 2016. You are who you know and how you behave: Attribute inference attacks via users’ social friends and behaviors. In Proceedings of the USENIX Security Symposium . 979–995.
  • Neil Zhenqiang Gong and Bin Liu. 2018. Attribute inference attacks in online social networks. ACM Trans. Priv. Secur. 21, 1 (2018), 3.
  • Neil Zhenqiang Gong, Ameet Talwalkar, Lester Mackey, Ling Huang, Eui Chul Richard Shin, Emil Stefanov, Elaine Runting Shi, and Dawn Song. 2014. Joint link prediction and attribute inference using a social-attribute network. ACM Trans. Intell. Syst. Technol. 5, 2 (2014), 27.
  • Rachid Guerraoui, Anne-Marie Kermarrec, Rhicheek Patra, and Mahsa Taziki. 2015. D 2 p: Distance-based differential privacy in recommenders. Proc. VLDB Endow. 8, 8 (2015), 862–873.
  • Rachid Guerraoui, Anne-Marie Kermarrec, and Mahsa Taziki. 2017. The utility and privacy effects of a click. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval . ACM.
  • Payas Gupta, Swapna Gottipati, Jing Jiang, and Debin Gao. 2013. Your love is public now: Questioning the use of personal information in authentication. In Proceedings of the ACM Special Interest Group on Security, Audit and Control Conference (SIGSAC’13) . ACM.
  • Alireza Hajibagheri, Gita Sukthankar, Kiran Lakkaraju, Hamidreza Alvari, Rolf T. Wigand, and Nitin Agarwal. 2018. Using massively multiplayer online game data to analyze the dynamics of social interactions. Social Interactions in Virtual Worlds: An Interdisciplinary Perspective (2018).
  • Michael Hay, Gerome Miklau, David Jensen, Don Towsley, and Philipp Weis. 2008. Resisting structural re-identification in anonymized social networks. Proc. VLDB Endow. 1, 1 (2008), 102–114.
  • Jianming He, Wesley W. Chu, and Zhenyu Victor Liu. 2006. Inferring privacy information from social networks. In Proceedings of the International Conference on Intelligence and Security Informatics . Springer, 154–165.
  • Brent Hecht, Lichan Hong, Bongwon Suh, and Ed H. Chi. 2011. Tweets from justin bieber's heart: The dynamics of the location field in user profiles. In Proceedings of the Conference of the Special Interest Group on Computer-Human Interaction (SIGCHI’11) . ACM, 237–246.
  • Shawndra Hill and Foster Provost. 2003. The myth of the double-blind review?: Author identification using only citations. ACM SIGKDD Explor. Newslett. 5, 2 (2003), 179–184.
  • T. Ryan Hoens, Marina Blanton, and Nitesh V. Chawla. 2010. A private and reliable recommendation system for social networks. In Proceedings of the 2010 IEEE Second International Conference on Social Computing (SocialCom’10) . IEEE, 816–825.
  • D. C. Howe and H. Nissenbaum. 2009. TrackMeNot: Resisting surveillance in web search. In Lessons from the Identity Trail: Privacy, Anonymity and Identity in a Networked Society . Oxford University Press, New York, 417–436.
  • Jingyu Hua, Chang Xia, and Sheng Zhong. 2015. Differentially private matrix factorization. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI’15) .
  • Mathias Humbert, Théophile Studer, Matthias Grossglauser, and Jean-Pierre Hubaux. 2013. Nowhere to hide: Navigating around privacy in online social networks. In Proceedings of the European Symposium on Research in Computer Security .
  • Piotr Indyk and Rajeev Motwani. 1998. Approximate nearest neighbors: Towards removing the curse of dimensionality. In Proceedings of the 30th Annual ACM Symposium on Theory of Computing . ACM, 604–613.
  • P. James. 1992. Knowledge graphs. Linguistic Instruments in Knowledge Engineering (1992), 97–117.
  • Shouling Ji, Weiqing Li, Neil Zhenqiang Gong, Prateek Mittal, and Raheem A. Beyah. 2015. On your social network de-anonymizablity: Quantification and large scale evaluation with seed knowledge. In Proceedings of the Network and Distributed System Security Symposium (NDSS’15) .
  • Shouling Ji, Weiqing Li, Neil Zhenqiang Gong, Prateek Mittal, and Raheem A. Beyah. 2016. Seed based deanonymizability quantification of social networks. IEEE Trans. Inf. Forens. Secur. 11, 7 (2016), 1398–1411.
  • Shouling Ji, Weiqing Li, Prateek Mittal, and Raheem Beyah. 2015. SecGraph: A uniform and open-source evaluation system for graph data anonymization and de-anonymization. In Proceedings of the USENIX Security Symposium . 303–318.
  • Shouling Ji, Weiqing Li, Mudhakar Srivatsa, and Raheem Beyah. 2014. Structural data de-anonymization: Quantification, practice, and implications. In Proceedings of the 2014 ACM Special Interest Group on Security, Audit and Control Conference (SIGSAC’14) . ACM, 1040–1053.
  • Shouling Ji, Weiqing Li, Mudhakar Srivatsa, and Raheem Beyah. 2016. Structural data de-anonymization: Theory and practice. IEEE/ACM Trans. Netw. 24, 6 (2016), 3523–3536.
  • Shouling Ji, Weiqing Li, Mudhakar Srivatsa, Jing Selena He, and Raheem Beyah. 2016. General graph data de-anonymization: From mobility traces to social networks. ACM Trans. Intell. Syst. Technol. 18, 4 (2016).
  • Shouling Ji, Prateek Mittal, and Raheem Beyah. 2016. Graph data anonymization, de-anonymization attacks, and de-anonymizability quantification: A survey. IEEE Commun. Surv. Tutor. 19, 2 (2016), 1305–1326.
  • Shouling Ji, Ting Wang, Jianhai Chen, Weiqing Li, Prateek Mittal, and Raheem Beyah. 2017. De-SAG: On the de-anonymization of structure-attribute graph data. IEEE Trans. Depend. Sec. Comput. 16, 4 (2017), 594–607.
  • Jinyuan Jia, Binghui Wang, Le Zhang, and Neil Zhenqiang Gong. 2017. AttriInfer: Inferring user attributes in online social networks using markov random fields. In Proceedings of the Annual Conference on the world wide web (WWW’17) . 1561–1569.
  • Zach Jorgensen and Ting Yu. 2014. A privacy-preserving framework for personalized, social recommendations. In Proceedings of the Extended Database Technology Conference (EDBT’14) . 582.
  • David Jurgens. 2013. That's what friends are for: Inferring location in online social media platforms based on social relationships. In Seventh International AAAI Conference on Weblogs and Social Media .
  • David Jurgens, Tyler Finethy, James McCorriston, Yi Tian Xu, and Derek Ruths. 2015. Geolocation prediction in twitter using social networks: A critical analysis and review of current practice. In Ninth International AAAI Conference on Web and Social Media .
  • Daniel Kifer and Ashwin Machanavajjhala. 2011. No free lunch in data privacy. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data . ACM, 193–204.
  • Dan Klein and Christopher D. Manning. 2003. Accurate unlexicalized parsing. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics .
  • Longbo Kong, Zhi Liu, and Yan Huang. 2014. Spot: Locating social media users based on social network context. Proc. VLDB Endow. 7, 13 (2014), 1681–1684.
  • Moshe Koppel, Jonathan Schler, and Shlomo Argamon. 2009. Computational methods in authorship attribution. J. Assoc. Inf. Sci. Technol. 60, 1 (2009), 9–26.
  • Moshe Koppel, Jonathan Schler, and Shlomo Argamon. 2011. Authorship attribution in the wild. Lang. Resourc. Eval. 45, 1 (2011), 83–94.
  • Moshe Koppel, Jonathan Schler, Shlomo Argamon, and Eran Messeri. 2006. Authorship attribution with thousands of candidate authors. In Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval . ACM, 659–660.
  • Aleksandra Korolova, Rajeev Motwani, Shubha U Nabar, and Ying Xu. 2008. Link privacy in social networks. In Proceedings of the 17th ACM Conference on Information and Knowledge Management . ACM, 289–298.
  • Nitish Korula and Silvio Lattanzi. 2014. An efficient reconciliation algorithm for social networks. Proc. VLDB Endow. 7, 5 (2014), 377–388.
  • Michal Kosinski, David Stillwell, and Thore Graepel. 2013. Private traits and attributes are predictable from digital records of human behavior. Proc. Natl. Acad. Sci. U.S.A. 110, 15 (2013), 5802–5805.
  • Harold W. Kuhn. 2010. The Hungarian method for the assignment problem. In 50 Years of Integer Programming 1958-2008 . Springer, 29–47.
  • Sebastian Labitzke, Florian Werling, Jens Mittag, and Hannes Hartenstein. 2013. Do online social network friends still threaten my privacy? In Proceedings of the ACM Conference on Data and Application Security and Privacy .
  • Diane Lambert. 1993. Measures of disclosure risk and harm. J. Off. Stat. 9, 2 (1993), 313.
  • Wei-Han Lee, Changchang Liu, Shouling Ji, Prateek Mittal, and Ruby B. Lee. 2017. How to quantify graph de-anonymization risks. In International Conference on Information Systems Security and Privacy . Springer, 84–104.
  • Wei-Han Lee, Changchang Liu, Shouling Ji, Prateek Mittal, and Ruby B. Lee. 2017. Blind de-anonymization attacks using social networks. In Proceedings of the 2017 on Workshop on Privacy in the Electronic Society . ACM, 1–4.
  • Kevin Lewis, Jason Kaufman, Marco Gonzalez, Andreas Wimmer, and Nicholas Christakis. 2008. Tastes, ties, and time: A new social network dataset using facebook.com. Soc. Netw. 30, 4 (2008), 330–342.
  • Ninghui Li, Tiancheng Li, and Suresh Venkatasubramanian. 2007. t-closeness: Privacy beyond k-anonymity and l-diversity. In Proceedings of the IEEE 23rd International Conference on Data Engineering 2007 (ICDE’07) . IEEE, 106–115.
  • Rui Li, Shengjie Wang, and Kevin Chen-Chuan Chang. 2012. Multiple location profiling for users and relationships from social network and content. Proc. VLDB Endow. 5, 11 (2012), 1603–1614.
  • Rui Li, Shengjie Wang, Hongbo Deng, Rui Wang, and Kevin Chen-Chuan Chang. 2012. Towards social user profiling: Unified and discriminative influence model for inferring home locations. In Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (SIGKDD’12) .
  • Xiaoxue Li, Yanan Cao, Yanmin Shang, Yanbing Liu, Jianlong Tan, and Li Guo. 2017. Inferring user profiles in online social networks based on convolutional neural network. In Proceedings of the International Conference on Knowledge Science, Engineering and Management . Springer.
  • Jack Lindamood, Raymond Heatherly, Murat Kantarcioglu, and Bhavani Thuraisingham. 2009. Inferring private information using social network data. In Proceedings of the Annual Conference of the World Wide Web (WWW’09) . ACM, 1145–1146.
  • Bo Liu, Wanlei Zhou, Tianqing Zhu, Longxiang Gao, and Yong Xiang. 2018. Location privacy and its applications: A systematic study. IEEE Access 6 (2018), 17606–17624.
  • Changchang Liu, Supriyo Chakraborty, and Prateek Mittal. 2016. Dependence makes you vulnerable: Differential privacy under dependent tuples. In Proceedings of the Network and Distributed System Security Symposium (NDSS’16) , Vol. 16. 21–24.
  • Changchang Liu and Prateek Mittal. 2016. LinkMirage: Enabling privacy-preserving analytics on social relationships. In Proceedings of the Network and Distributed System Security Symposium (NDSS’16) .




Ethics Case Studies


Invading Privacy

The ethics of “outing”: Breaking the silence code on homosexuality

“For personal reasons”: Balancing privacy with the right to know

Intruding on grief: Does the public really have a “need to know”?

Intruding on private pain: Emotional TV segment offers hard choice

Seeing both sides: A personal and professional dilemma

Two views on “outing”: When the media do it for you

Two views on “outing”: When you do it yourself

Unwanted spotlight: When private people become part of a public story

Whose right is it anyway?: Videotape of accident victim raises questions about rights to privacy


Teens, Social Media, and Privacy

Table of Contents

  • Acknowledgements
  • Introduction
  • Part 1: Teens and Social Media Use
  • Part 2: Information Sharing, Friending, and Privacy Settings on Social Media
  • Part 3: Reputation Management on Social Media
  • Part 4: Putting Privacy Practices in Context: A Portrait of Teens’ Experiences Online

Teens share a wide range of information about themselves on social media sites; 1 indeed the sites themselves are designed to encourage the sharing of information and the expansion of networks. However, few teens embrace a fully public approach to social media. Instead, they take an array of steps to restrict and prune their profiles, and their patterns of reputation management on social media vary greatly according to their gender and network size. These are among the key findings from a new report based on a survey of 802 teens that examines teens’ privacy management on social media sites:

  • Teens are sharing more information about themselves on social media sites than they did in the past. For the five different types of personal information that we measured in both 2006 and 2012, each is significantly more likely to be shared by teen social media users in our most recent survey.

Teen Twitter use has grown significantly: 24% of online teens use Twitter, up from 16% in 2011.

The typical (median) teen Facebook user has 300 friends, while the typical teen Twitter user has 79 followers.

  • Focus group discussions with teens show that they have waning enthusiasm for Facebook, disliking the increasing adult presence, people sharing excessively, and stressful “drama,” but they keep using it because participation is an important part of overall teenage socializing.

60% of teen Facebook users keep their profiles private, and most report high levels of confidence in their ability to manage their settings.

  • Teens take other steps to shape their reputation, manage their networks, and mask information they don’t want others to know; 74% of teen social media users have deleted people from their network or friends list.
  • Teen social media users do not express a high level of concern about third-party access to their data; just 9% say they are “very” concerned.
  • On Facebook, increasing network size goes hand in hand with network variety, information sharing, and personal information management.

  • In broad measures of online experience, teens are considerably more likely to report positive experiences than negative ones. For instance, 52% of online teens say they have had an experience online that made them feel good about themselves.

Teens are sharing more information about themselves on social media sites than they did in the past.

Teens are increasingly sharing personal information on social media sites, a trend that is likely driven by the evolution of the platforms teens use as well as changing norms around sharing. A typical teen’s MySpace profile from 2006 was quite different in form and function from the 2006 version of Facebook as well as the Facebook profiles that have become a hallmark of teenage life today. For the five different types of personal information that we measured in both 2006 and 2012, each is significantly more likely to be shared by teen social media users on the profile they use most often.

  • 91% post a photo of themselves , up from 79% in 2006.
  • 71% post their school name , up from 49%.
  • 71% post the city or town where they live , up from 61%.
  • 53% post their email address , up from 29%.
  • 20% post their cell phone number , up from 2%.

In addition to the trend questions, we also asked five new questions about the profile teens use most often and found that among teen social media users:

  • 92% post their real name to the profile they use most often. 2
  • 84% post their interests , such as movies, music, or books they like.
  • 82% post their birth date .
  • 62% post their relationship status .
  • 24% post videos of themselves .

[Figure 1: Teens and social media]

Older teens are more likely than younger teens to share certain types of information, but boys and girls tend to post the same kind of content.

Generally speaking, older teen social media users (ages 14-17), are more likely to share certain types of information on the profile they use most often when compared with younger teens (ages 12-13).

Older teens who are social media users more frequently share:

  • Photos of themselves on their profile (94% older teens vs. 82% of younger teens)
  • Their school name (76% vs. 56%)
  • Their relationship status (66% vs. 50%)
  • Their cell phone number (23% vs. 11%)

While boys and girls generally share personal information on social media profiles at the same rates, cell phone numbers are a key exception.  Boys are significantly more likely to share their numbers than girls (26% vs. 14%). This is a difference that is driven by older boys. Various differences between white and African-American social media-using teens are also significant, with the most notable being the lower likelihood that African-American teens will disclose their real names on a social media profile (95% of white social media-using teens do this vs. 77% of African-American teens). 3
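To see how a subgroup gap like the real-name figures above (95% vs. 77%) clears statistical significance, the sketch below runs a plain two-proportion z-test. This is a simplified, unweighted illustration: the African-American subsample size (n=95) comes from the report's footnote, while the white subsample size is an assumed placeholder, and the report's actual analysis accounts for survey weighting.

    # Hedged sketch: unweighted two-proportion z-test for a subgroup gap such
    # as 95% (white) vs. 77% (African-American) posting their real name.
    # n2=95 is the subsample size given in the report's footnote; n1=450 is
    # an assumed placeholder, NOT a figure from the report.
    from math import sqrt
    from scipy.stats import norm

    def two_prop_ztest(p1, n1, p2, n2):
        """Return (z, two-sided p-value) for H0: the two proportions are equal."""
        pooled = (p1 * n1 + p2 * n2) / (n1 + n2)              # pooled proportion
        se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))  # pooled standard error
        z = (p1 - p2) / se
        return z, 2 * norm.sf(abs(z))

    z, p = two_prop_ztest(0.95, 450, 0.77, 95)
    print(f"z = {z:.2f}, p = {p:.2g}")  # z is roughly 5.8, so the gap is significant

Even under these rough assumptions, the z-statistic is far past the conventional 1.96 cutoff, consistent with the footnote's claim that the subgroup differences are statistically significant despite the small subsample.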

16% of teen social media users have set up their profile to automatically include their location in posts.

Beyond basic profile information, some teens choose to enable the automatic inclusion of location information when they post. Some 16% of teen social media users said they set up their profile or account so that it automatically includes their location in posts. Boys and girls and teens of all ages and socioeconomic backgrounds are equally likely to say that they have set up their profile to include their location when they post. Focus group data suggests that many teens find sharing their location unnecessary and unsafe, while others appreciate the opportunity to signal their location to friends and parents.

Twitter draws a far smaller crowd than Facebook for teens, but its use is rising. One in four online teens uses Twitter in some way. While overall use of social networking sites among teens has hovered around 80%, Twitter grew in popularity; 24% of online teens use Twitter, up from 16% in 2011 and 8% the first time we asked this question in late 2009.

African-American teens are substantially more likely to report using Twitter when compared with white youth.

Continuing a pattern established early in the life of Twitter, African-American teens who are internet users are more likely to use the site when compared with their white counterparts. Two in five (39%) African-American teens use Twitter, while 23% of white teens use the service.

Public accounts are the norm for teen Twitter users.

While those with Facebook profiles most often choose private settings, Twitter users, by contrast, are much more likely to have a public account.

  • 64% of teens with Twitter accounts say that their tweets are public, while 24% say their tweets are private.
  • 12% of teens with Twitter accounts say that they “don’t know” if their tweets are public or private.
  • While boys and girls are equally likely to say their accounts are public, boys are significantly more likely than girls to say that they don’t know (21% of boys who have Twitter accounts report this, compared with 5% of girls).

Overall, teens have far fewer followers on Twitter when compared with Facebook friends; the typical (median) teen Facebook user has 300 friends, while the typical (median) teen Twitter user has 79 followers. Girls and older teens tend to have substantially larger Facebook friend networks compared with boys and younger teens.

Teens’ Facebook friendship networks largely mirror their offline networks. Seven in ten say they are friends with their parents on Facebook.

Teens, like other Facebook users, have different kinds of people in their online social networks. And how teens construct that network has implications for who can see the material they share in those digital social spaces:

  • 98% of Facebook-using teens are friends with people they know from school.
  • 91% of teen Facebook users are friends with members of their extended family.
  • 89% are connected to friends who do not attend the same school.
  • 76% are Facebook friends with brothers and sisters.
  • 70% are Facebook friends with their parents.
  • 33% are Facebook friends with other people they have not met in person.
  • 30% have teachers or coaches as friends in their network.
  • 30% have celebrities, musicians or athletes in their network.

Older teens tend to be Facebook friends with a larger variety of people, while younger teens are less likely to friend certain groups, including those they have never met in person.

Older teens are more likely than younger ones to have created broader friend networks on Facebook. Older teens (14-17) who use Facebook are more likely than younger teens (12-13) to be connected with:

  • Friends who go to different schools (92% vs. 82%)
  • People they have never met in person, not including celebrities (36% vs. 25%)
  • Teachers or coaches (34% vs. 19%)

Girls are also more likely than boys (37% vs. 23%) to be Facebook friends with coaches or teachers, the only category of Facebook friends where boys and girls differ.

African-American youth are nearly twice as likely as whites to be Facebook friends with celebrities, athletes, or musicians (48% vs. 25%).

Focus group discussions with teens show that they have waning enthusiasm for Facebook.

In focus groups, many teens expressed waning enthusiasm for Facebook. They dislike the increasing number of adults on the site, get annoyed when their Facebook friends share inane details, and are drained by the “drama” that they described as happening frequently on the site. The stress of needing to manage their reputation on Facebook also contributes to the lack of enthusiasm. Nevertheless, the site is still where a large amount of socializing takes place, and teens feel they need to stay on Facebook in order to not miss out.

Users of sites other than Facebook express greater enthusiasm for their choice.

Those teens who used sites like Twitter and Instagram reported feeling like they could better express themselves on these platforms, where they felt freed from the social expectations and constraints of Facebook. Some teens may migrate their activity and attention to other sites to escape the drama and pressures they find on Facebook, although most still remain active on Facebook as well.

Teens have a variety of ways to make available or limit access to their personal information on social media sites. Privacy settings are one of many tools in a teen’s personal data management arsenal. Among teen Facebook users, most choose private settings that allow only approved friends to view the content that they post.

Most keep their Facebook profile private. Girls are more likely than boys to restrict access to their profiles.

Some 60% of teens ages 12-17 who use Facebook say they have their profile set to private, so that only their friends can see it. Another 25% have a partially private profile, set so that friends of their friends can see what they post. And 14% of teens say that their profile is completely public. 4

  • Girls who use Facebook are substantially more likely than boys to have a private (friends only) profile (70% vs. 50%).
  • By contrast, boys are more likely than girls to have a fully public profile that everyone can see (20% vs. 8%).

Most teens express a high level of confidence in managing their Facebook privacy settings.

More than half (56%) of teen Facebook users say it’s “not difficult at all” to manage the privacy controls on their Facebook profile, while one in three (33%) say it’s “not too difficult.” Just 8% of teen Facebook users say that managing their privacy controls is “somewhat difficult,” while less than 1% describe the process as “very difficult.”

Teens’ feelings of efficacy increase with age:

  • 41% of Facebook users ages 12-13 say it is “not difficult at all” to manage their privacy controls, compared with 61% of users ages 14-17.
  • Boys and girls report similar levels of confidence in managing the privacy controls on their Facebook profile.

For most teen Facebook users, all friends and parents see the same information and updates on their profile.

Beyond general privacy settings, teen Facebook users have the option to place further limits on who can see the information and updates they post. However, few choose to customize in that way: Among teens who have a Facebook account, only 18% say that they limit what certain friends can see on their profile. The vast majority (81%) say that all of their friends see the same thing on their profile. 5 This approach also extends to parents; only 5% of teen Facebook users say they limit what their parents can see.

Teens are cognizant of their online reputations, and take steps to curate the content and appearance of their social media presence. For many teens who were interviewed in focus groups for this report, Facebook was seen as an extension of offline interactions and the social negotiation and maneuvering inherent to teenage life. “Likes” specifically seem to be a strong proxy for social status, such that teen Facebook users will manipulate their profile and timeline content in order to garner the maximum number of “likes,” and remove photos with too few “likes.”

Pruning and revising profile content is an important part of teens’ online identity management.

Teen management of their profiles can take a variety of forms – we asked teen social media users about five specific activities that relate to the content they post and found that:

  • 59% have deleted or edited something that they posted in the past.
  • 53% have deleted comments from others on their profile or account.
  • 45% have removed their name from photos that have been tagged to identify them.
  • 31% have deleted or deactivated an entire profile or account.
  • 19% have posted updates, comments, photos, or videos that they later regretted sharing.

74% of teen social media users have deleted people from their network or friends’ list; 58% have blocked people on social media sites.

Given the size and composition of teens’ networks, friend curation is also an integral part of privacy and reputation management for social media-using teens. The practices of friending, unfriending, and blocking serve as privacy management techniques for controlling who sees what and when. Among teen social media users:

  • Girls are more likely than boys to delete friends from their network (82% vs. 66%) and block people (67% vs. 48%).
  • Unfriending and blocking are equally common among teens of all ages and across all socioeconomic groups.
  • 58% of teen social media users say they share inside jokes or cloak their messages in some way.

As a way of creating a different sort of privacy, many teen social media users will obscure some of their updates and posts, sharing inside jokes and other coded messages that only certain friends will understand:

  • Older teens are considerably more likely than younger teens to say that they share inside jokes and coded messages that only some of their friends understand (62% vs. 46%).

26% say that they post false information like a fake name, age, or location to help protect their privacy.

One in four (26%) teen social media users say that they post fake information like a fake name, age or location to help protect their privacy.

  • African-American teens who use social media are more likely than white teens to say that they post fake information to their profiles (39% vs. 21%).

Overall, 40% of teen social media users say they are “very” or “somewhat” concerned that some of the information they share on social networking sites might be accessed by third parties like advertisers or businesses without their knowledge. However, few report a high level of concern; 31% say that they are “somewhat” concerned, while just 9% say that they are “very” concerned. 6 Another 60% in total report that they are “not too” concerned (38%) or “not at all” concerned (22%).

  • Younger teen social media users (12-13) are considerably more likely than older teens (14-17) to say that they are “very concerned” about third party access to the information they share (17% vs. 6%).

Insights from our focus groups suggest that some teens may not have a good sense of whether the information they share on a social media site is being used by third parties.

Parents, by contrast, express high levels of concern about how much information advertisers can learn about their children’s behavior online.

Parents of the surveyed teens were asked a related question: “How concerned are you about how much information advertisers can learn about your child’s online behavior?” A full 81% of parents report being “very” or “somewhat” concerned, with 46% reporting they are “very concerned.”  Just 19% report that they are not too concerned or not at all concerned about how much advertisers could learn about their child’s online activities.

Teens who are concerned about third party access to their personal information are also more likely to engage in online reputation management.

Teens who are somewhat or very concerned that some of the information they share on social network sites might be accessed by third parties like advertisers or businesses without their knowledge more frequently delete comments, untag themselves from photos or content, and deactivate or delete their entire account.  Among teen social media users, those who are “very” or “somewhat” concerned about third party access are more likely than less concerned teens to:

  • Delete comments that others have made on their profile (61% vs. 49%).
  • Untag themselves in photos (52% vs. 41%).
  • Delete or deactivate their profile or account (38% vs. 25%).
  • Post updates, comments, photos or videos that they later regret (26% vs. 14%).

Teens with larger Facebook networks are more frequent users of social networking sites and tend to have a greater variety of people in their friend networks. They also share a wider range of information on their profile when compared with those who have a smaller number of friends on the site. Yet even as they share more information with a wider range of people, they are also more actively engaged in maintaining their online profile or persona.

Teens with large Facebook friend networks are more frequent social media users and participate on a wider diversity of platforms in addition to Facebook.

Teens with larger Facebook networks are fervent social media users who exhibit a greater tendency to “diversify” their platform portfolio:

  • 65% of teens with more than 600 friends on Facebook say that they visit social networking sites several times a day, compared with 27% of teens with 150 or fewer Facebook friends.
  • Teens with more than 600 Facebook friends are more than three times as likely to also have a Twitter account when compared with those who have 150 or fewer Facebook friends (46% vs. 13%). They are six times as likely to use Instagram (12% vs. 2%).

Teens with larger Facebook networks tend to have more variety within those networks.

Almost all Facebook users (regardless of network size) are friends with their schoolmates and extended family members. However, other types of people begin to appear as the size of teens’ Facebook networks expand:

  • Teen Facebook users with more than 600 friends in their network are much more likely than those with smaller networks to be Facebook friends with peers who don’t attend their own school, with people they have never met in person (not including celebrities and other “public figures”), as well as with teachers or coaches.
  • On the other hand, teens with the largest friend networks are actually less likely to be friends with their parents on Facebook when compared with those with the smallest networks (79% vs. 60%).

Teens with large networks share a wider range of content, but are also more active in profile pruning and reputation management activities.

Teens with the largest networks (more than 600 friends) are more likely to include a photo of themselves, their school name, their relationship status, and their cell phone number on their profile when compared with teens who have a relatively small number of friends in their network (under 150 friends). However, teens with large friend networks are also more active reputation managers on social media.

  • Teens with larger friend networks are more likely than those with smaller networks to block other users, to delete people from their friend network entirely, to untag photos of themselves, or to delete comments others have made on their profile.
  • They are also substantially more likely to automatically include their location in updates and share inside jokes or coded messages with others.

In broad measures of online experience, teens are considerably more likely to report positive experiences than negative ones.

In the current survey, we wanted to understand the broader context of teens’ online lives beyond Facebook and Twitter. A majority of teens report positive experiences online, such as making friends and feeling closer to another person, but some do encounter unwanted content and contact from others.

  • 52% of online teens say they have had an experience online that made them feel good about themselves. Among teen social media users, 57% said they had an experience online that made them feel good, compared with 30% of teen internet users who do not use social media.
  • One in three online teens (33%) say they have had an experience online that made them feel closer to another person. Looking at teen social media users, 37% report having an experience somewhere online that made them feel closer to another person, compared with just 16% of online teens who do not use social media.

One in six online teens say they have been contacted online by someone they did not know in a way that made them feel scared or uncomfortable.

Unwanted contact from strangers is relatively uncommon, but 17% of online teens report some kind of contact that made them feel scared or uncomfortable. 7 Online girls are more than twice as likely as boys to report contact from someone they did not know that made them feel scared or uncomfortable (24% vs. 10%).

Few internet-using teens have posted something online that caused problems for them or a family member, or got them in trouble at school.

A small percentage of teens have engaged in online activities that had negative repercussions for them or their family; 4% of online teens say they have shared sensitive information online that later caused a problem for themselves or other members of their family. Another 4% have posted information online that got them in trouble at school.

More than half of internet-using teens have decided not to post content online over reputation concerns.

More than half of online teens (57%) say they have decided not to post something online because they were concerned it would reflect badly on them in the future. Teen social media users are more likely than other online teens who do not use social media to say they have refrained from sharing content due to reputation concerns (61% vs. 39%).

Large numbers of youth have lied about their age in order to gain access to websites and online accounts.

In 2011, we reported that close to half of online teens (44%) admitted to lying about their age at one time or another so they could access a website or sign up for an online account. In the latest survey, 39% of online teens admitted to falsifying their age in order to gain access to a website or account, a finding that is not significantly different from the previous survey.

Close to one in three online teens say they have received online advertising that was clearly inappropriate for their age.

Exposure to inappropriate advertising online is one of the many risks that parents, youth advocates, and policy makers are concerned about. Yet, little has been known until now about how often teens encounter online ads that they feel are intended for more (or less) mature audiences. In the latest survey, 30% of online teens say they have received online advertising that is “clearly inappropriate” for their age.

About the survey and focus groups

These findings are based on a nationally representative phone survey run by the Pew Research Center’s Internet & American Life Project of 802 parents and their 802 teens ages 12-17. It was conducted between July 26 and September 30, 2012. Interviews were conducted in English and Spanish and on landline and cell phones. The margin of error for the full sample is ± 4.5 percentage points.
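As a back-of-the-envelope check on that margin of error, the sketch below reproduces the arithmetic. Simple random sampling alone gives about ±3.5 points for n = 802 at a 95% confidence level; published figures for weighted phone surveys also fold in a design effect, and the value used below (~1.7) is chosen to match the reported ±4.5 points, not taken from the report itself.

    # Hedged sketch of the margin-of-error arithmetic behind "± 4.5 points".
    from math import sqrt

    n, p, z = 802, 0.5, 1.96             # sample size, worst-case proportion, 95% z
    srs_moe = z * sqrt(p * (1 - p) / n)  # margin under simple random sampling
    deff = 1.7                           # assumed design effect from weighting
    weighted_moe = srs_moe * sqrt(deff)
    print(f"SRS: ±{100 * srs_moe:.1f} pts; with design effect: ±{100 * weighted_moe:.1f} pts")
    # -> SRS: ±3.5 pts; with design effect: ±4.5 pts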

This report marries that data with insights and quotes from in-person focus groups conducted by the Youth and Media team at the Berkman Center for Internet & Society at Harvard University beginning in February 2013. The focus groups centered on privacy and digital media, with special emphasis on social media sites. The team conducted 24 focus group interviews with 156 students across the greater Boston area, Los Angeles (California), Santa Barbara (California), and Greensboro (North Carolina). Each focus group lasted 90 minutes, including a 15-minute questionnaire completed prior to starting the interview, consisting of 20 multiple-choice questions and 1 open-ended response. Although the sample was not designed to be representative of particular populations, it includes participants from diverse ethnic, racial, and economic backgrounds. Participants ranged in age from 11 to 19, with a mean age of 14.5.

In addition, two online focus groups of teenagers ages 12-17 were conducted by the Pew Internet Project from June 20-27, 2012 to help inform the survey design. The first focus group was with 11 middle schoolers ages 12-14, and the second group was with nine high schoolers ages 14-17. Each group was mixed gender, with some racial, socio-economic, and regional diversity. The groups were conducted as an asynchronous threaded discussion over three days using an online platform and the participants were asked to log in twice per day.

Throughout this report, this focus group material is highlighted in several ways. Pew’s online focus group quotes are interspersed with relevant statistics from the survey in order to illustrate findings that were echoed in the focus groups or to provide additional context to the data. In addition, at several points, there are extensive excerpts boxed off as standalone text boxes that elaborate on a number of important themes that emerged from the in-person focus groups conducted by the Berkman Center.

  • We use “social media site” as the umbrella term that refers to social networking sites (like Facebook, LinkedIn, and Google Plus) as well as to information- and media-sharing sites that users may not think of in terms of networking such as Twitter, Instagram, and Tumblr. “Teen social media users” are teens who use any social media site(s). When we use “social networking sites” or “social networking sites and Twitter,” it will be to maintain the original wording when reporting survey results. ↩
  • Given that Facebook is now the dominant platform for teens, and a first and last name is required when creating an account, this is undoubtedly driving the nearly universal trend among teen social media users to say they post their real name to the profile they use most often. Fake accounts with fake names can still be created on Facebook, but the practice is explicitly forbidden in Facebook’s Terms of Service. ↩
  • The sample size for African-American teens who use social media is relatively small (n=95), but all differences between white and African-American teen social media users noted throughout this section are statistically significant. ↩
  • In 2011, the privacy settings question was asked of all teen SNS or Twitter users, prompting them to think about the “profile they use most often.” Among this group 62% reported having a private profile, 19% said their profile was partially private, and 17% said their profile was public. At the time, almost all of these teen social media users (93%) said they had a Facebook account, but some respondents could have been reporting settings for other platforms. ↩
  • This behavior is consistent, regardless of the general privacy settings on a teen’s profile. ↩
  • Recent research has described a “control paradox” that may influence user behavior and attitudes toward information disclosures online. In spaces where users feel they have control over the publication of their private information, they may “give less importance to control (or lack thereof) of the accessibility and use of that information by others.” See, Laura Brandimarte, et al.: “Misplaced Confidences: Privacy and the Control Paradox.” ↩
  • This question does not reference sexual solicitations and could include an array of contact that made the teen feel scared or uncomfortable. ↩



Research on the influence mechanism of privacy invasion experiences with privacy protection intentions in social media contexts: Regulatory focus as the moderator

1 School of Journalism and Communication, Xiamen University, Xiamen, China

2 Research Center for Intelligent Society and Social Governance, Interdisciplinary Research Institute, Zhejiang Lab, Hangzhou, China

Associated Data

The original contributions presented in the study are included in the article/ Supplementary material , further inquiries can be directed to the corresponding author.

Introduction

In recent years, numerous online privacy violation incidents have been caused by the leakage of social media users' personal information, yet users seem to burn out when it comes to privacy protection, which leads to more privacy invasions and forms a vicious circle. Few studies have examined the impact of social media users' privacy invasion experiences on their privacy protection intentions. Protection motivation theory has often been applied to privacy protection research; however, it has been suggested that the theory could be improved by introducing individual emotional factors, and empirical research in this area is lacking.

Methods

To fill these gaps, the current study constructs a moderated chain mediation model based on protection motivation theory and regulatory focus theory, and introduces privacy fatigue as an emotional variable.

Results and discussion

An analysis of a sample of 4,800 respondents from China finds that: (1) Social media users' previous privacy invasion experiences can increase their privacy protection intention; this process is mediated by response costs and privacy fatigue. (2) Privacy fatigue has a masking effect: more privacy invasion experiences and higher response costs increase individuals' privacy fatigue, and privacy fatigue in turn significantly reduces their willingness to protect their privacy. (3) Promotion-focused individuals are less likely to experience privacy fatigue than prevention-focused individuals. In summary, the “lying flat” tendency in social media users' privacy protection is driven by the key factor of privacy fatigue, and the psychological trait of regulatory focus can be used to intervene in the development of privacy fatigue. This study extends the scope of research on privacy protection and regulatory focus theory, refines protection motivation theory, and expands the empirical study of privacy fatigue; the findings also inform the practical governance of privacy on social networks.

1. Introduction

Nowadays, people communicate and share information through social networking sites (SNS), which have become an integral part of the daily lives of network users worldwide (Hsu et al., 2013). SNS make people's lives highly convenient; however, they also pose increasingly serious privacy issues. For instance, British media reported that the profiles of 87 million Facebook users were illegally leaked to a political consulting firm, Cambridge Analytica (Revell, 2019). In addition, Equifax, one of the three major US credit bureaus, reported a large-scale data leak in 2017 involving 146 million records of personal information (Zhou and Schaub, 2018). These incidents provoked a wave of discussion on personal privacy and information security issues.

Individuals' proactive protection of their online privacy information is an effective way to reduce the occurrence of privacy violations; scholars have therefore explored how to enhance individuals' willingness to protect privacy. In terms of theoretical models, the Health Belief Model (HBM) (Kisekka and Giboney, 2018), Technology Threat Avoidance Theory (TTAT) (McLeod and Dolezel, 2022), the Technology Acceptance Model (TAM) (Baby and Kannammal, 2020), and the Theory of Planned Behavior (TPB) (Xu et al., 2013) have all been applied to online privacy protection behavior. By contrast, Protection Motivation Theory (PMT) is particularly well suited to studying privacy protection behavior on SNS because it focuses on threat assessment and coping mechanisms for privacy issues. A limitation of prior applications of PMT, however, is that they ignore the influence of individual emotions on protective behavior (Mousavi et al., 2020). Therefore, this study introduces privacy fatigue as an emotional variable to extend PMT in the context of social media privacy protection research. Moreover, regarding the antecedents of privacy protection, existing research suggests that factors such as perceived benefits and perceived risks (Price et al., 2005), privacy concerns (Youn and Kim, 2019), self-efficacy (Baruh et al., 2017), and trust (Wang et al., 2017) can affect individuals' privacy-protective behaviors.

Along with the increased frequency of data breaches on the Internet, people find that they have less control over their data and feel overwhelmed by having to protect their privacy alone. Moreover, the complexity of the measures required to protect personal information aggravates users' sense of futility, leading to exhaustion among online users. This phenomenon, defined as "privacy fatigue," is regarded as a factor leading to the avoidance of privacy issues. Privacy fatigue has recently become prevalent among network users, yet empirical studies of the phenomenon are still insufficient (Choi et al., 2018). Therefore, this study explores the role privacy fatigue plays in users' privacy protection behaviors. Previous studies discovered that the impact of varying degrees of privacy invasion on privacy protection differs across individuals and could be moderated by psychological differences (Lai and Hui, 2006). Clarifying the role of psychological traits is beneficial to the hierarchical governance of privacy protection. Regulatory focus is a psychological trait based on different regulatory orientations that can affect social media users' behavioral preferences and privacy protection decisions (Cho et al., 2019); however, to date, the relationship between regulatory focus, privacy fatigue, and privacy protection intentions has not been sufficiently examined. For this reason, it is necessary to explore this question empirically.

Based on the PMT framework, this study built a moderated mediation model to examine the mechanism through which privacy invasion experiences influence privacy protection intentions, introducing three factors: response costs, privacy fatigue, and regulatory focus. Data from an online survey of 4,800 network users demonstrated that, first, social media users' experiences of privacy invasion increase their willingness to protect privacy. Second, privacy fatigue has a masking effect: the more privacy invasion experiences and response costs there are, the greater the privacy fatigue, which in turn reduces users' privacy protection intentions. Third, promotion-focused individuals are less likely to experience fatigue from protecting personal information alone. The significance of this study lies in the fact that it fills a gap in research on the effects of privacy violation experiences on individuals' protective willingness.

Meanwhile, this study verified the practicality of combining PMT with emotion-related variables. Additionally, it complemented the study of privacy fatigue and expanded the scope of regulatory focus theory in privacy research. From a practical perspective, this study offers a reference for the hierarchical governance of privacy in social networks. Finally, this study reveals a vicious cycle (negative experiences → privacy fatigue → low willingness to protect → new negative experiences) and provides a theoretical reference for breaking it.

2. Theoretical framework

2.1. Privacy invasion experiences, response costs, and privacy protection intentions

Protection motivation theory (PMT) is commonly used in online privacy studies (Chen et al., 2015 ). According to Rogers ( 1975 ), individuals cognitively evaluate a risk before adopting behaviors, develop protection motivation, and eventually modify their behaviors to avoid the risk. People's response assessments draw on two sources: environmental and interpersonal sources of information, and prior experience. After combing through the past literature, we found that many scholars have verified the influence of environmental (Wu et al., 2019 ) and interpersonal (Hsu et al., 2013 ) factors on individual privacy protection; however, only a few scholars have explored the effect of privacy violation experiences on privacy protection intentions. Some studies proved that individuals' prior privacy violation experiences are an antecedent to their information privacy concerns, including in the mobile context and in online marketplaces (Pavlou and Gefen, 2005 ; Belanger and Crossler, 2019 ). Privacy concerns, in turn, have been widely demonstrated to be a significant antecedent of privacy protection intentions and protective behaviors. In addition, a meta-analysis found that users who worried about privacy were less likely to use internet services and more likely to adopt privacy-protective actions (Baruh et al., 2017 ).

People make sense of the world based on their prior experiences (Floyd et al., 2000 ), and network users who have had privacy-invasive experiences tend to believe that privacy risks are closely related to themselves (Li, 2008 ). They tend to be more aware of the seriousness and vulnerability of privacy issues (Mohamed and Ahmad, 2012 ). The effect of previous negative experiences on perceived vulnerability can also be explained by the availability heuristic, which assumes that the easier it is to retrieve experienced cases from memory, the higher the perceived frequency of the event; conversely, when fewer cases are retrieved, people may estimate that the event is less likely to occur than it objectively is. Therefore, people's accumulated experiences of negative events might influence their perception of future vulnerability to risk (Tversky and Kahneman, 1974 ). Moreover, in accordance with PMT, seriousness and vulnerability affect protective behavior in the context of social media privacy issues. We can therefore assume that the more memories of privacy violations people have, the more likely they are to believe that their privacy will be violated again, thereby increasing their motivation to protect privacy, that is, their willingness to protect it. Accordingly, this study proposed the following hypothesis:

  • H1: Privacy invasion experiences positively affect privacy protection intentions.

PMT suggests that cognitive evaluation includes the assessment of response costs (Rogers, 1975 ), where response costs refer to any costs of a protective response, such as money, time, and effort (Floyd et al., 2000 ). According to findings from health psychology, when faced with the threat of skin cancer, people prefer to use sunscreen rather than avoid the sun (Jones and Leary, 1994 ; Wichstrom, 1994 ), possibly because using sunscreen carries lower response costs. These findings suggest that individuals weigh response costs before taking protective action. Privacy protection studies also indicate that prior experiences of personal information violation may significantly increase consumers' concerns about both offline and online privacy, and that privacy concerns are related to perceived risks (Okazaki et al., 2009 ; Bansal et al., 2010 ). It has also been shown that individuals who have experienced privacy invasion perceive a greater severity of risk (Petronio, 2002 ). Individuals' risk perceptions affect their assessment of costs, which is part of the trade-off between risks and benefits; in other words, a stronger risk perception implies that higher response costs must be paid. Thus, this study assumed that people with more privacy violation experiences perceive higher response costs and tend to take protective actions to avoid paying more. Consequently, this study made the following hypotheses:

  • H2a: A higher level of privacy-invasive experiences results in a higher perception of response costs.
  • H2b: A higher level of perception of response costs will result in higher privacy protection intentions.
  • H2c: Response cost mediates the effect of privacy-invasive experiences on privacy protection intentions.

2.2. Privacy invasion experiences, privacy fatigue, and privacy protection intentions

The concept of fatigue was first introduced by the medical community, which referred to it as a subjective, unpleasant feeling of tiredness (Piper et al., 1987 ). The concept has since been used in many research fields, such as clinical medicine (Mao et al., 2018 ) and psychology (Ong et al., 2006 ). In recent years, scholars have also applied "fatigue" to the study of social media, regarding it as an important antecedent of individual behaviors (Ravindran et al., 2014 ). Choi et al. ( 2018 ) defined "privacy fatigue" as a psychological state of fatigue caused by privacy issues. Specifically, privacy fatigue manifests itself as an unwillingness to actively manage and protect one's personal information and privacy (Hargittai and Marwick, 2016 ).

With the increasing severity of social network and personal information issues, research on privacy fatigue, especially the examination of its antecedents and effects, has developed considerably. Regarding antecedents, scholars found that privacy concerns, self-disclosure, learning about privacy statements and information security, and the complexity of privacy protection practices can influence individuals' levels of privacy fatigue (Dhir et al., 2019 ; Oh et al., 2019 ). In terms of effects, privacy fatigue can not only cause people to reduce the frequency of using social media or even withdraw from the Internet (Ravindran et al., 2014 ), but also motivate individuals to resist disclosing personal information (Keith et al., 2014 ). However, only a few studies have examined privacy invasion experiences, privacy fatigue, and privacy protection intentions under one theoretical framework.

Furnell and Thomson ( 2009 ) pointed out that privacy fatigue is triggered by an individual's experience with privacy problems. Additionally, privacy fatigue has a threshold: when it is crossed, social network users become weary of privacy management, leading them to abandon social network services. It has also been suggested that privacy data breaches can cause individuals to feel "disappointed." In a study of medical data protection, breaches of patients' medical data had a cumulative effect on patients' behavioral decisions by causing them to perceive that their requests for privacy protection were being ignored (Juhee and Eric, 2018 ). The relationship between privacy invasion experiences and privacy fatigue has been widely demonstrated: social media characteristics such as internet privacy threat experience and privacy invasion can lead to users' emotional exhaustion and privacy cynicism, which are further associated with social media privacy fatigue (Xiao and Mou, 2019 ; Sheng et al., 2022 ). In terms of outcomes, studies on the privacy paradox found that emotional exhaustion and powerlessness (a concept akin to exhaustion) weaken the positive relationship between privacy concerns and the willingness to protect personal information (Tian et al., 2022 ). Based on the above review, it is reasonable to infer that an individual's privacy invasion experiences in the context of social media use can exacerbate their perception of privacy fatigue and that, in the social media privacy context, privacy fatigue may lead network users to abandon privacy protection behaviors, creating opportunities for further privacy invasion. Based on the above discussion, we proposed the following hypotheses:

  • H3a: Privacy invasion experiences positively affect privacy fatigue.
  • H3b: Privacy fatigue negatively affects privacy protection intentions.
  • H3c: Privacy fatigue plays a masking role (a form of mediating effect) in the effect of individuals' social media privacy invasion experiences on their privacy protection intentions.

As discussed above, we hypothesized that both response costs and privacy fatigue mediate the effect of social media users' privacy invasion experiences on their privacy protection intentions. If so, what is the association between response costs and privacy fatigue? It has been argued that a common shortcoming of current research applying PMT is that it ignores the role emotions play in this mechanism (Mousavi et al., 2020 ). This view is supported by Li's research, which argues that most research on privacy topics is conducted from a risk assessment perspective and tends to ignore the impact of emotions on privacy protection behaviors (Li et al., 2016 ). Emotions are believed to change an individual's attention and beliefs (Friestad and Thorson, 1985 ), both of which are related to behavioral intentions.

It has also been suggested that emotions play a mediating role in behavioral decision-making (Tanner et al., 1991 ); however, to date, only a few studies have explored this mechanism. Zhang et al. ( 2022 ) found a positive influence of response costs on privacy fatigue. Their research was based on the Stressor-Strain-Outcome (S-S-O) framework and explored which factors (stressors) cause privacy fatigue intentions (strain) and related behaviors (outcome); the results showed that time cost and several other stressors significantly and positively affect social media fatigue intention. As Floyd et al. ( 2000 ) note, response costs encompass any costs, time costs included. Although the above results provide an important reference for this study, time cost is only one component of response costs; the present research therefore focuses on general response costs to better understand this mechanism. Based on this, we proposed the following hypotheses:

  • H4a: Privacy response costs are positively associated with privacy fatigue.
  • H4b: Response costs and privacy fatigue play chain mediating roles in the effect of privacy invasion experiences on privacy protection intentions.

2.3. Regulatory focus as the moderator

Differences in individual psychological traits can lead to significant differences in individuals' cognition and behaviors (Benbasat and Dexter, 1982 ), and it has been shown that personal psychological traits can influence individuals' perceptions of fatigue (Dhir et al., 2019 ). A recent study also found that neuroticism has positive effects on privacy fatigue, whereas traits like agreeableness and extraversion have negative effects (Tang et al., 2021 ). However, previous research on social media privacy fatigue is relatively limited. Given the critical role of privacy fatigue in research models, it is necessary to explore differences in perceived fatigue among individuals with different psychological traits. This study introduced individuals' level of regulatory focus as a moderator of the effect of privacy invasion experiences on privacy fatigue. Regulatory focus, as a psychological trait, has been applied to explain social media users' privacy management and privacy protection problems (Wirtz and Lwin, 2009 ; Li et al., 2019 ).

Regulatory Focus Theory (RFT) classifies individuals into two levels based on psychological traits: promotion focus, which attends more to benefits and ignores potential risks, and prevention focus, which tends to avoid risks and ignore benefits when making decisions (Higgins, 1997 ). Research has demonstrated that perceived benefits tend to reduce fatigue, while perceived risks can exacerbate it (Boksem and Tops, 2008 ). By analogy, promotion-focused individuals are more inclined to notice the benefits of using social media (Jin, 2012 ) and thus may experience less fatigue and lower response costs when experiencing privacy violations; in contrast, prevention-focused individuals are more aware of the risks associated with privacy invasion and have more concerns about privacy issues, which can lead to greater fatigue and higher perceived response costs. Combined with H4, we can reason that the path from social media privacy invasion experiences to privacy protection intentions may be affected by an individual's level of regulatory focus: the effect of privacy invasion experiences on privacy fatigue and response costs should be stronger for prevention-focused individuals than for promotion-focused ones, and the mediating effects of privacy fatigue and response costs correspondingly stronger. In summary, this study proposed the following hypotheses:

  • H5a: Compared to promotion-focused users, the effect of privacy invasion experiences on privacy fatigue is greater for prevention-focused users.
  • H5b: Compared to promotion-focused users, the effect of privacy invasion experiences on response costs is greater for prevention-focused users.

2.4. Current study

In summary, the current study proposed that, in the social media context, users' experiences of privacy invasion increase their perception of response costs and result in privacy fatigue, and that privacy fatigue decreases individuals' privacy protection intentions. However, this process differs for individuals with different regulatory focuses: individuals with a promotion focus are less likely to experience privacy fatigue than individuals with a prevention focus. Based on the above logic, the conceptual model constructed in this study is shown in Figure 1 .

Figure 1. Conceptual model.
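For readers who prefer an explicit specification, the paths in Figure 1 can be written as three regression equations. The coefficient labels below are our own notation rather than the authors', and the demographic covariates are omitted for brevity:

```latex
\begin{aligned}
RC  &= a_1\,PIE + w_1\,RF + m_1\,(PIE \times RF) + \varepsilon_1 \\
PF  &= a_2\,PIE + d\,RC + w_2\,RF + m_2\,(PIE \times RF) + \varepsilon_2 \\
PPI &= c'\,PIE + b_1\,RC + b_2\,PF + \varepsilon_3
\end{aligned}
```

Under this notation, the hypothesized indirect effects are a1·b1 (via response costs, H2c), a2·b2 (via privacy fatigue, H3c), and the chain a1·d·b2 (H4b), while H5a and H5b concern the interaction terms m2 and m1, respectively.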

3. Materials and methods

3.1. Participants and procedures

This survey was conducted in December 2021, and Zhejiang Lab collected the data. The questionnaire was pretested with a small group of participants to ensure the questions were clearly phrased. Participants were informed of their right to withdraw and were assured of confidentiality and anonymity before participating in the survey. Computers, tablets, and mobile phones were all used to complete the cross-sectional survey. After giving their consent, participants were asked to complete the following scales. After screening, 4,800 valid questionnaires were retained. Invalid questionnaires were excluded mainly for failing the screening questions or for careless responding (e.g., identical answers across several consecutive variables, or more than 70% repeated options).

To guarantee data quality and reduce possible interference from gender and geographical factors, the survey used a quota sampling method, as shown in Table 1 , with a sample gender ratio of 1:1 and samples from 16 cities in China (300 valid samples per city). Because previous privacy invasion experience may be related to years of Internet usage and is meaningful to this study, it is worth noting that 34.5% of the final sample had used the Internet for 5–10 years and 57.3% for more than 10 years, which met the requirements of the study. In terms of education level, college and bachelor's degrees accounted for the largest proportion at 62.0%, followed by high school, junior high school, and vocational high school at 27.3%. In terms of age, the ratio of participants younger than 46 years old to those 46 and older was 59.7:40.3, with a balanced distribution across age groups. The basic demographic variables are tabulated in Table 1 .

Table 1. Statistical table of basic information on effective samples (N = 4,800).

Variable                 Category             n       %
Gender                   Men                  2,400   50.0
                         Women                2,400   50.0
Age                      18–25                357     7.4
                         26–35                1,573   32.8
                         36–45                936     19.4
                         Over 46              1,934   40.3
Educational background   Under high school    356     7.4
                         High school          1,308   27.3
                         Undergraduate        2,975   62.0
                         Master and doctor    161     3.4
Internet life time       Less than 3 years    34      0.7
                         3–5 years            356     7.4
                         5–10 years           1,658   34.5
                         Over 10 years        2,752   57.3

3.2. Measurements

Based on the model and hypotheses of this study, the instruments included measures of privacy invasion experiences, response costs, privacy fatigue, privacy protection intentions, and regulatory focus (promotion focus and prevention focus). The questionnaire was built from previously validated scales. All scales were adapted to social media contexts, and all responses were graded on a Likert scale ranging from 0 (strongly disagree) to 6 (strongly agree), with higher scores indicating stronger endorsement of the construct. Sub-items within each scale were averaged into composite scores.

The privacy invasion experiences scale was referenced from Su's study (Su et al., 2018 ). It is a 3-item self-reported scale (e.g., "My personal information, such as my phone number, shopping history, and more, has been shared by intelligent media with third-party platforms."). The response cost scale was developed from the scale in Yoon et al. ( 2012 ), which included three measurement questions (e.g., "When personal information security is at risk on social media, I consider that taking practical action will take too much time and effort."). The privacy fatigue scale was derived from Choi et al. ( 2018 ), and the current study applied this 4-item scale to measure privacy fatigue on social media (e.g., "Dealing with personal information protection issues on social media makes me tired."). The privacy protection intention scale was based on the scale developed by Liang and Xue ( 2010 ), which contains three measurement items (e.g., "When my personal information security is threatened on social media, I am willing to make efforts to protect it."). The regulatory focus scale was derived from the original scale developed by Higgins ( 2002 ) and later adapted by Chinese scholars for use with Chinese samples (Cui et al., 2014 ). The scale contains six items measuring promotion focus (e.g., "For what I want to do, I can do it all well") and four items measuring prevention focus (e.g., "While growing up, I often did things that my parents didn't agree were right"). Regulatory focus was scored by subtracting the average prevention score from the average promotion score, with higher differences indicating a greater tendency toward promotion focus and lower differences indicating a greater tendency toward prevention focus (Cui et al., 2014 ).
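To make the scoring procedure concrete, the sketch below shows how the composites and the regulatory focus difference score could be computed in Python. It is illustrative only: the item column names are hypothetical, since the section reports item counts but not the raw variable labels.

```python
import pandas as pd

# Hypothetical item-column names: the paper reports item counts (3/3/4/3/6/4)
# but not the raw column labels, so these identifiers are illustrative only.
ITEM_SETS = {
    "PIE": ["pie_1", "pie_2", "pie_3"],        # privacy invasion experiences
    "RC":  ["rc_1", "rc_2", "rc_3"],           # response costs
    "PF":  ["pf_1", "pf_2", "pf_3", "pf_4"],   # privacy fatigue
    "PPI": ["ppi_1", "ppi_2", "ppi_3"],        # privacy protection intentions
    "PRO": [f"pro_{i}" for i in range(1, 7)],  # promotion focus (6 items)
    "PRE": [f"pre_{i}" for i in range(1, 5)],  # prevention focus (4 items)
}

def score_composites(items: pd.DataFrame) -> pd.DataFrame:
    """Average each scale's 0-6 Likert items into a composite score,
    then derive regulatory focus as promotion minus prevention."""
    out = pd.DataFrame(index=items.index)
    for scale, cols in ITEM_SETS.items():
        out[scale] = items[cols].mean(axis=1)
    # Higher RF = stronger promotion focus; lower RF = stronger prevention focus.
    out["RF"] = out["PRO"] - out["PRE"]
    return out
```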

3.3. Data analysis

The validity and reliability of the questionnaire were tested using Mplus 8. The PROCESS macro for SPSS was used to evaluate the moderated chain mediation model with the bootstrapping method (95% CI, 5,000 samples). Gender (1 = men, 0 = women), age, highest degree obtained, and Internet lifetime were included as covariates in the model.
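PROCESS itself is an SPSS macro, but the percentile-bootstrap logic behind its chain mediation model (Model 6) can be sketched in Python with statsmodels. This is a simplified re-implementation under our own assumptions, using the composite columns from the scoring sketch above and omitting the covariates:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def indirect_effects(df: pd.DataFrame) -> dict:
    """OLS estimates of the three indirect paths in the PIE -> RC -> PF -> PPI chain."""
    a = smf.ols("RC ~ PIE", df).fit().params             # PIE -> RC
    m = smf.ols("PF ~ PIE + RC", df).fit().params        # PIE, RC -> PF
    b = smf.ols("PPI ~ PIE + RC + PF", df).fit().params  # full outcome model
    return {
        "PIE->RC->PPI":     a["PIE"] * b["RC"],
        "PIE->PF->PPI":     m["PIE"] * b["PF"],
        "PIE->RC->PF->PPI": a["PIE"] * m["RC"] * b["PF"],
    }

def percentile_bootstrap(df: pd.DataFrame, n_boot: int = 5000, seed: int = 0) -> dict:
    """95% percentile-bootstrap CIs for each indirect effect, as PROCESS computes them."""
    rng = np.random.default_rng(seed)
    draws = {path: [] for path in indirect_effects(df)}
    for _ in range(n_boot):
        resample = df.sample(n=len(df), replace=True,
                             random_state=int(rng.integers(2**31)))
        for path, effect in indirect_effects(resample).items():
            draws[path].append(effect)
    return {path: np.percentile(vals, [2.5, 97.5]) for path, vals in draws.items()}
```

An indirect effect is judged significant when its 95% bootstrap interval excludes zero, which is the criterion applied to Table 5 below.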

4. Results

4.1. Measurement of the model

As shown in Table 2 , the Cronbach's α and composite reliability of the privacy invasion experiences, response costs, privacy fatigue, and privacy protection intentions scales are all higher than the acceptable value (0.70). Although the Cronbach's α values for promotion and prevention focus were slightly below 0.70, they were above 0.60 and close to 0.70, which is considered permissible given the large sample size of this study; the measurement model therefore passed the reliability test (Hair et al., 2019 ).

Table 2. Results of the validity and reliability.

Construct      1        2        3        4        5        6        CR      α
1. PIE         0.724                                                 0.773   0.767
2. RC          0.468    0.594                                        0.862   0.862
3. PF          0.457    0.538    0.784                               0.857   0.856
4. PPI         0.106    0.075    −0.153   0.518                      0.751   0.750
5. Pro Focus   0.051    0.020    −0.093   0.451    0.420             0.683   0.693
6. Pre Focus   0.338    0.287    0.449    −0.030   −0.002   0.442    0.703   0.697

PIE, privacy invasion experiences; RC, response costs; PF, privacy fatigue; PPI, privacy protection intentions; CR, composite reliability. Diagonal values (bold in the original) are the square root of the AVE.

Since the measurement instruments in this study were derived from validated scales, the average variance extracted (AVE) would ideally exceed 0.5, but values above 0.4 can be accepted: according to Fornell and Larcker ( 1981 ), if the AVE is below 0.5 but the composite reliability is above 0.6, the construct's convergent validity is still adequate, a view that Lam ( 2012 ) further explained and confirmed. Discriminant validity was tested by comparing the square root of the AVE with the correlations among the researched variables; the square root of the AVE was higher than the correlations, indicating good discriminant validity.
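The reliability and validity statistics discussed above follow standard psychometric formulas; a minimal sketch is given below. The function names are ours, `items` is the item-level data for one scale, and `loadings` stands for standardized CFA loadings (the paper obtained these with Mplus 8):

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the sum score)."""
    k = items.shape[1]
    return k / (k - 1) * (1 - items.var(ddof=1).sum() / items.sum(axis=1).var(ddof=1))

def composite_reliability(loadings: np.ndarray) -> float:
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    s = loadings.sum()
    return s**2 / (s**2 + (1 - loadings**2).sum())

def average_variance_extracted(loadings: np.ndarray) -> float:
    """AVE = mean of the squared standardized loadings."""
    return float((loadings**2).mean())

def fornell_larcker_holds(ave_by_construct: dict, corr: pd.DataFrame) -> bool:
    """Discriminant validity: sqrt(AVE) of each construct must exceed its
    absolute correlations with all other constructs."""
    for construct, ave in ave_by_construct.items():
        if (corr.loc[construct].drop(construct).abs() >= np.sqrt(ave)).any():
            return False
    return True
```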

We then tested the goodness-of-fit indices. Confirmatory factor analysis (CFA) of the questionnaire produced acceptable fit values (RMSEA = 0.048 < 0.05, SRMR = 0.042 < 0.05, GFI = 0.955 > 0.9, CFI = 0.947 > 0.9, NFI = 0.943 > 0.9, and a further incremental fit index of 0.945 > 0.9) after introducing the error covariances into the model. In summary, the current study passed the reliability and validity tests.

4.2. Descriptive statistics

Table 3 shows the descriptive statistics and correlation analysis results. Response costs, privacy fatigue, and privacy protection intentions were all positively correlated with privacy invasion experiences. Privacy fatigue and privacy protection intentions were both positively correlated with response costs. Privacy fatigue was negatively related to privacy protection intentions.

Table 3. Means, standard deviations, and correlations among research variables.

Variable   M       SD      1          2          3          4         5
1. PIE     3.525   1.304   1
2. RC      3.797   1.441   0.468**    1
3. PF      2.807   1.477   0.457**    0.538**    1
4. PPI     4.636   0.882   0.106**    0.075**    −0.153**   1
5. RF      1.637   1.476   −0.265**   −0.239**   −0.440**   0.271**   1

PIE, privacy invasion experiences; RC, response costs; PF, privacy fatigue; PPI, privacy protection intentions; RF, regulatory focus. **p < 0.01.

4.3. Relationship between privacy invasion experience and privacy protection intentions

Table 4 shows the results of the multiple regression analysis. Privacy invasion experiences significantly influenced response costs (β = 0.466, SE = 0.023, t = 11.936, p = 0.000), privacy fatigue (β = 0.297, SE = 0.022, t = 13.722, p = 0.000), and privacy protection intentions (β = 0.133, SE = 0.011, t = 12.382, p = 0.000) after controlling for gender, highest degree obtained, age, and Internet lifetime. Response costs positively predicted privacy fatigue (β = 0.382, SE = 0.013, t = 29.793, p = 0.000) and privacy protection intentions (β = 0.098, SE = 0.010, t = 9.495, p = 0.000). By contrast, privacy fatigue negatively predicted privacy protection intentions (β = −0.130, SE = 0.011, t = −12.303, p = 0.000) in this model. In conclusion, H1, H2a, H2b, H3a, H3b, and H4a were supported.

Table 4. Multiple regression results of the moderated mediation model.

Outcome: PPI (R² = 0.134, F = 106.295)
  PIE        β = 0.133    SE = 0.011   t = 12.382    p = 0.000
  PF         β = −0.130   SE = 0.011   t = −12.303   p = 0.000
  RC         β = 0.098    SE = 0.010   t = 9.495     p = 0.000

Outcome: PF (R² = 0.427, F = 446.246)
  PIE        β = 0.297    SE = 0.022   t = 13.722    p = 0.000
  RC         β = 0.382    SE = 0.013   t = 29.793    p = 0.000
  RF         β = −0.101   SE = 0.040   t = −2.510    p = 0.0121
  PIE × RF   β = −0.031   SE = 0.008   t = −4.103    p = 0.000

Outcome: RC (R² = 0.234, F = 209.354)
  PIE        β = 0.466    SE = 0.023   t = 11.936    p = 0.000
  RF         β = −0.143   SE = 0.046   t = −3.138    p = 0.0017
  PIE × RF   β = 0.007    SE = 0.009   t = 0.840     p = 0.401

PIE, privacy invasion experiences; RC, response costs; PF, privacy fatigue; PPI, privacy protection intentions; RF, regulatory focus; β, unstandardized regression weight; SE, standard error for the unstandardized regression weight; t, t-test statistic; F, F-test statistic. *p < 0.05; **p < 0.01; ***p < 0.001.

Then, we used Model 6 of PROCESS to test the mediating effects in our model. As the results in Table 5 show, H2c, H3c, and H4b were supported.

Table 5. Results of the mediating effect test.

Path                    Effect    Boot LLCI   Boot ULCI
PIE → RC → PPI          0.053     0.041       0.065
PIE → PF → PPI          −0.057    −0.065      −0.049
PIE → RC → PF → PPI     −0.042    −0.048      −0.037
Total indirect effect   −0.047    −0.059      −0.035

PIE, privacy invasion experiences; RC, response costs; PF, privacy fatigue; PPI, privacy protection intentions. LLCI/ULCI, lower/upper limits of the 95% bootstrap confidence interval.

Model 84 of the SPSS PROCESS macro was applied to carry out the bootstrapping test of the moderation effect of regulatory focus. Privacy invasion experiences, response costs, privacy fatigue, and regulatory focus were mean-centered before constructing the interaction term. The results showed that regulatory focus significantly moderated the effect of privacy invasion experiences on privacy fatigue [95% Boot CI = (0.002, 0.006)], and H5a was supported. In addition, the mediating effect was significant at a low level of regulatory focus [−1 SD; Effect = −0.038; 95% Boot CI = (−0.046, −0.030)], a medium level [Effect = −0.032; 95% Boot CI = (−0.039, −0.026)], and a high level [+1 SD; Effect = −0.026; 95% Boot CI = (−0.032, −0.020)]. Specifically, the mediating effect of privacy fatigue decreased as individuals tended increasingly toward promotion focus. However, regulatory focus did not significantly moderate the effect of privacy invasion experiences on response costs [95% Boot CI = (−0.001, 0.003)], and H5b was rejected.

Meanwhile, the privacy invasion experiences × regulatory focus interaction significantly predicted privacy fatigue (β = −0.046, SE = 0.008, t = −3.694, p = 0.000; see Figure 2 ). The influence of privacy invasion experiences on privacy fatigue was significant when the level of regulatory focus was high (β = 0.385, SE = 0.016, t = 23.981, p = 0.000), medium (β = 0.430, SE = 0.015, t = 29.415, p = 0.000), and low (β = 0.475, SE = 0.022, t = 22.061, p = 0.000). Specifically, the more individuals tended toward promotion focus (high regulatory focus scores), the less fatigue privacy invasion caused; the more they tended toward prevention focus (low regulatory focus scores), the more fatigue it caused.

Figure 2. Simple slope test of the interaction between PIE and RF on PF.
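The simple slope test in Figure 2 and the conditional indirect effects reported above can be reproduced along the following lines. As before, this is a sketch under our assumptions (composite columns PIE, RC, PF, PPI, RF; covariates omitted), not the authors' SPSS code:

```python
import pandas as pd
import statsmodels.formula.api as smf

def simple_slopes_and_conditional_effects(df: pd.DataFrame) -> tuple:
    """Slope of PIE on PF at -1 SD, mean, and +1 SD of RF (mean-centered),
    and the corresponding conditional indirect effects through PF."""
    d = df.assign(PIEc=df["PIE"] - df["PIE"].mean(),
                  RFc=df["RF"] - df["RF"].mean())
    moderated = smf.ols("PF ~ PIEc * RFc", d).fit()
    a1 = moderated.params["PIEc"]      # slope of PIE at the mean of RF
    a3 = moderated.params["PIEc:RFc"]  # interaction term
    sd_rf = d["RFc"].std(ddof=1)
    slopes = {level: a1 + a3 * w * sd_rf
              for level, w in (("low (-1 SD)", -1), ("medium", 0), ("high (+1 SD)", 1))}
    # Conditional indirect effect = (slope at that RF level) x (PF -> PPI path).
    b2 = smf.ols("PPI ~ PIEc + RC + PF", d).fit().params["PF"]
    conditional = {level: slope * b2 for level, slope in slopes.items()}
    return slopes, conditional
```

In PROCESS, the same quantities would be bootstrapped to obtain the Boot CIs reported in the text.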

5. Discussion

The purpose of the present study was to explore the relationship among privacy invasion experiences, response costs, privacy fatigue, privacy protection intentions, and regulatory focus. This study showed that response costs and privacy fatigue play mediating roles, whereas regulatory focus plays a moderating role in this process (as shown in Figure 3 ). These findings help clarify how and under which circumstances social media users' privacy invasion experiences affect their privacy protection intentions, thereby providing a means to improve people's privacy situation on social media platforms.

Figure 3. The moderated chain mediation model. Dashed lines represent nonsignificant relations. ***p < 0.001.

5.1. A chain mediation of response costs and privacy fatigue

The current study found that social media users' privacy invasion experiences have a significant positive effect on their response costs, and that an increase in response costs in turn increases individuals' privacy protection intentions. This finding is consistent with previous health psychology literature, which found that individuals calculate the response costs of different actions before making decisions: the higher the perceived response costs, the greater the likelihood of strengthened protective intention (Jones and Leary, 1994 ; Wichstrom, 1994 ). Compared with users who experienced less privacy invasion on social media, those who experienced more privacy violations perceived a higher level of response costs, which further increased their protective intention so as to avoid dealing with the negative outcomes that follow privacy invasion.

The study also found that social media users' privacy invasion experiences had a significant positive effect on privacy fatigue, which is consistent with prior research on social media use (Xiao and Mou, 2019 ; Sheng et al., 2022 ). At the same time, response costs also positively affected privacy fatigue, a mechanism indicated by past research on social media fatigue behaviors (Zhang et al., 2022 ). This study additionally found that response costs partially mediate the effect of privacy invasion experiences on privacy fatigue. Although both increased privacy invasion experiences and increased response costs improve social media users' privacy protection intentions, privacy fatigue can mask this process: increased privacy fatigue reduces individuals' privacy protection intentions.

Moreover, this study revealed that response costs and privacy fatigue play chain mediating roles in the effect of social media privacy invasion experiences on privacy protection intentions and further explained the mechanism. The masking effect of privacy fatigue also explains why privacy invasion experiences do not have a strong overall effect on privacy protection intentions. In other words, privacy fatigue is an important reason why people currently "lie flat" (adopt passive protection) in the face of privacy-invasive issues online.

5.2. Regulatory focus as moderator

The relationship between social media privacy invasion experiences and privacy fatigue was moderated by regulatory focus. To be more specific, the more promotion-focused individuals were, the less privacy fatigue they felt; the more prevention-focused they were, the more privacy fatigue they felt. In other words, promotion focus has a buffering effect in this process. To some extent, this result verifies that individuals with different regulatory orientations sense different levels of fatigue because they pursue benefits or avoid risks when making decisions (Boksem and Tops, 2008 ; Jin, 2012 ). On the other hand, regulatory focus did not moderate the relationship between privacy invasion experiences and response costs. One possible explanation is that, compared with privacy fatigue, response costs to privacy violations are based on concrete experiences in users' memories: individuals who have had more privacy invasions have more experience dealing with their negative consequences. Thus, regardless of psychological traits, the effect of privacy-invasive experiences on response costs is neither strengthened nor weakened.

Meanwhile, this study tested a moderated mediation model investigating the moderating role of regulatory focus in the "privacy invasion experiences → privacy fatigue → privacy protection intentions" pathway. The results indicated that, as individuals tend toward prevention focus, privacy invasion experiences affect individuals' privacy protection intentions through the mediating role of privacy fatigue; specifically, the more prevention-focused they are, the stronger their privacy fatigue and the weaker their privacy protection intentions. Therefore, interventions targeting privacy fatigue (e.g., improving media literacy, creating a better online environment, and more) can be used to enhance social media users' privacy protection intentions (Bucher et al., 2013 ; Agozie and Kaya, 2021 ). Focusing on prevention-focused social media users is particularly crucial.

5.3. Implications

From a theoretical perspective, our study identified a mechanism influencing privacy-protective behavior based on an extension of protection motivation theory. Protection motivation theory is a fear-based theory, and we treated users' experiences of social media privacy invasion as a source of fear. On this basis, we found that these experiences were associated with individuals' privacy protection intentions and explained the mechanism through the mediating variable of response costs, which is consistent with previous findings (Chen et al., 2016 ).

More importantly, in response to previous researchers' argument that traditional protection motivation theory ignores emotional factors (Mousavi et al., 2020 ), our study extended the theory to include privacy fatigue and verified that fatigue significantly reduces social media users' privacy protection intentions. The introduction of privacy fatigue better explains why occasional privacy invasion experiences do not trigger privacy-protective behaviors, offering another possible explanation for the privacy paradox in addition to traditional privacy calculus theory, and it encourages researchers to pay attention to individual emotions in privacy research. This study also compared differences in privacy protection intentions among social media users of different regulatory focus types, differences that are mainly caused by fatigue rather than response costs. By combining privacy fatigue and regulatory focus, it was found that not all subjects felt the same level of privacy fatigue after experiencing privacy invasion. This study thus expanded the application of both privacy fatigue and regulatory focus theories and built a bridge between online privacy research and regulatory focus theory.

In addition to the aforementioned implications for research and theory, the findings also have some useful practical implications. First of all, the findings call for measures to reduce privacy invasion on social media. (a) Reducing the incidence of privacy violations at their root requires improving the current online privacy environment on social media platforms. We call on the government to strengthen the regulation of online privacy and on social media platforms to reinforce the protection of users' privacy so that users' personal information is not misused. (b) From the social media agent perspective, relevant studies mention that the content relevance perceived by online users can mitigate the negative relationship between privacy invasion and continuous use intention (Zhu and Chang, 2016 ). Social media agents should improve their efficiency in using legitimately collected personal information, giving users a smoother experience on online platforms.

Second, the results show that privacy fatigue affects users' privacy protection intentions. (c) According to Choi et al. ( 2018 ), users have a tolerance threshold for privacy fatigue, so policy should keep the burden of privacy protection within an acceptable level. Other scholars suggested that online service providers should avoid excessively or unnecessarily collecting personal information and strictly forbid sharing or selling users' personal information to any third party without their permission (Tang et al., 2021 ). (d) Another effective approach is to reduce response costs, that is, the costs of protecting one's privacy; for example, social media platforms can optimize privacy interfaces and management tools or provide more effective feedback mechanisms for users. (e) In addition, improving users' privacy literacy (especially for prevention-focused individuals) can also be effective in reducing privacy fatigue (Bucher et al., 2013 ).

Finally, different measures should be applied to users with different regulatory focuses. (f) Social media managers could classify users into groups based on their psychological characteristics and manage them in accordance with their required level of privacy protection, giving social media users a wider range of choices. Specifically, because prevention-focused individuals tend to feel more privacy fatigue after privacy invasion experiences, additional privacy protection features should be provided for them. For example, social media platforms could offer specific explanations of privacy protection technologies to increase prevention-focused individuals' trust in those technologies.

5.4. Limitations and future directions

There are still some limitations in this article. First, this study selected only response costs from the cognitive process, whereas protection motivation theory also includes threat appraisal, which focuses on the potential outcomes of risky behaviors, including perceived vulnerability, perceived severity of the risk, and rewards associated with risky behavior (Prentice-Dunn et al., 2009 ). Future studies could systematically consider the association between these factors and privacy protection intentions. Second, users' perceptions of privacy invasion differ across social media platforms (e.g., Instagram and Facebook), and this study only applies to a generalized social media context; future research could pay more attention to the differences among users on different social media platforms (with different functions). Finally, this study did not focus on specific privacy invasion experiences, although studies have pointed out that different types of privacy invasion affect people differently. Moreover, people with different demographic backgrounds, such as cultural backgrounds and gender, react differently when faced with the same situation (Klein and Helweg-Larsen, 2002 ). Future research can investigate this in more depth through experiments.

6. Conclusion

In conclusion, our findings suggest that social media privacy invasion experiences increase individuals' privacy protection intentions by increasing their response costs, but the accompanying increase in privacy fatigue masks this effect. Privacy fatigue is a barrier to increasing social media users' willingness to protect their privacy, which explains why users do not show a stronger willingness to protect their privacy even as privacy invasion becomes a growing problem on social networks. Our study also revealed that individuals with different levels of regulatory focus exhibit different levels of fatigue when faced with the same privacy invasion experiences; in particular, prevention-focused social media users are more likely to become fatigued. Therefore, social media agents should pay special attention to these individuals because they may be particularly vulnerable to privacy violations. Furthermore, the current research on privacy fatigue has yet to be expanded, and future researchers can add to it.

Our theoretical analysis and empirical results further emphasize the distinction between individuals, a differentiation that allows researchers to align their analyses with theoretical hypotheses more tightly. This applies not only to research on the effects of privacy invasion experiences on privacy behavior but also to exploring other privacy topics. Therefore, we recommend that future privacy research be more human-oriented, which will also benefit the current “hierarchical governance” of the Internet privacy issue.

Data availability statement

Ethics statement

This study was approved by the Academic Committee of the School of Journalism and Communication at Xiamen University, and we carefully verified that we complied strictly with the ethical guidelines.

Author contributions

CG is responsible for the overall research design, thesis writing, collation of the questionnaire, and data analysis. SC and ML are responsible for the guidance. JW is responsible for the proofreading and article touch-up. All authors contributed to the article and approved the submitted version.

Acknowledgments

The authors thank all the participants of this study. The participants were all informed about the purpose and content of the study and voluntarily agreed to participate. The participants were able to stop participating at any time without penalty.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2022.1031592/full#supplementary-material

  • Agozie D. Q., Kaya T. (2021). Discerning the effect of privacy information transparency on privacy fatigue in e-government. Govern. Inf. Q. 38, 101601. 10.1016/j.giq.2021.101601
  • Baby A., Kannammal A. (2020). Network path analysis for developing an enhanced TAM model: a user-centric e-learning perspective. Comput. Hum. Behav. 107, 24. 10.1016/j.chb.2019.07.024
  • Bansal G., Zahedi F. M., Gefen D. (2010). The impact of personal dispositions on information sensitivity, privacy concern and trust in disclosing health information online. Decision Support Syst. 49, 138–150. 10.1016/j.dss.2010.01.010
  • Baruh L., Secinti E., Cemalcilar Z. (2017). Online privacy concerns and privacy management: a meta-analytical review. J. Commun. 67, 26–53. 10.1111/jcom.12276
  • Belanger F., Crossler R. E. (2019). Dealing with digital traces: understanding protective behaviors on mobile devices. J. Strat. Inf. Syst. 28, 34–49. 10.1016/j.jsis.2018.11.002
  • Benbasat I., Dexter A. S. (1982). Individual differences in the use of decision support aids. J. Account. Res. 20, 1–11. 10.2307/2490759
  • Boksem M. A. S., Tops M. (2008). Mental fatigue: costs and benefits. Brain Res. Rev. 59, 125–139. 10.1016/j.brainresrev.2008.07.001
  • Bucher E., Fieseler C., Suphan A. (2013). The stress potential of social media in the workplace. Inf. Commun. Soc. 16, 1639–1667. 10.1080/1369118X.2012.710245
  • Chen H., Beaudoin C. E., Hong T. (2015). Teen online information disclosure: empirical testing of a protection motivation and social capital model. J. Assoc. Inf. Sci. Technol. 67, 2871–2881. 10.1002/asi.23567
  • Chen H., Beaudoin C. E., Hong T. (2016). Protecting oneself online: the effects of negative privacy experiences on privacy protective behaviors. J. Mass Commun. Q. 93, 409–429. 10.1177/1077699016640224
  • Cho H., Roh S., Park B. (2019). Of promoting networking and protecting privacy: effects of defaults and regulatory focus on social media users' preference settings. Comput. Hum. Behav. 101, 1–13. 10.1016/j.chb.2019.07.001
  • Choi H., Park J., Jung Y. (2018). The role of privacy fatigue in online privacy behavior. Comput. Hum. Behav. 81, 42–51. 10.1016/j.chb.2017.12.001
  • Cui Q., Yin C. Y., Lu H. L. (2014). The reaction of consumers to others' assessments under different social distance. Chin. J. Manage. 11, 1396–1402.
  • Dhir A., Kaur P., Chen S., Pallesen S. (2019). Antecedents and consequences of social media fatigue. Int. J. Inf. Manage. 48, 193–202. 10.1016/j.ijinfomgt.2019.05.021
  • Floyd D. L., Prentice-Dunn S., Rogers R. W. (2000). A meta-analysis of research on protection motivation theory. J. Appl. Soc. Psychol. 30, 407–429. 10.1111/j.1559-1816.2000.tb02323.x
  • Fornell C., Larcker D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. J. Market. Res. 18, 39–50. 10.1177/002224378101800104
  • Friestad M., Thorson E. (1985). The Role of Emotion in Memory for Television Commercials. Washington, DC: Educational Resources Information Center.
  • Furnell S., Thomson K. L. (2009). Recognizing and addressing "security fatigue". Comput. Fraud Secur. 11, 7–11. 10.1016/S1361-3723(09)70139-3
  • Hair J. F., Ringle C. M., Gudergan S. P. (2019). Partial least squares structural equation modeling-based discrete choice modeling: an illustration in modeling retailer choice. Bus. Res. 12, 115–142. 10.1007/s40685-018-0072-4
  • Hargittai E., Marwick A. (2016). "What can I really do?" Explaining the privacy paradox with online apathy. Int. J. Commun. 10, 21.
  • Higgins E. T. (1997). Beyond pleasure and pain. Am. Psychol. 52, 1280–1300. 10.1037/0003-066X.52.12.1280
  • Higgins E. T. (2002). How self-regulation creates distinct values: the case of promotion and prevention decision making. J. Consum. Psychol. 12, 177–191. 10.1207/S15327663JCP1203_01
  • Hsu C. L., Park S. J., Park H. W. (2013). Political discourse among key Twitter users: the case of Sejong city in South Korea. J. Contemp. Eastern Asia 12, 65–79. 10.17477/jcea.2013.12.1.065
  • Jin S. A. A. (2012). To disclose or not to disclose, that is the question: a structural equation modeling approach to communication privacy management in e-health. Comput. Hum. Behav. 28, 69–77. 10.1016/j.chb.2011.08.012
  • Jones J. L., Leary M. R. (1994). Effects of appearance-based admonitions against sun exposure on tanning intentions in young adults. Health Psychol. 13, 86–90. 10.1037/0278-6133.13.1.86
  • Juhee K., Eric J. (2018). The Market Effect of Healthcare Security: Do Patients Care About Data Breaches? Available online at: https://www.econinfosec.org/archive/weis2015/papers/WEIS_2015_kwon.pdf (accessed October 30, 2018).
  • Keith M. J., Maynes C., Lowry P. B., Babb J. (2014). "Privacy fatigue: the effect of privacy control complexity on consumer electronic information disclosure," in International Conference on Information Systems (ICIS 2014), Auckland, 14–17.
  • Kisekka V., Giboney J. S. (2018). The effectiveness of health care information technologies: evaluation of trust, security beliefs, and privacy as determinants of health care outcomes. J. Med. Int. Res. 20, 9014. 10.2196/jmir.9014
  • Klein C. T., Helweg-Larsen M. (2002). Perceived control and the optimistic bias: a meta-analytic review. Psychol. Health 17, 437–446. 10.1080/0887044022000004920
  • Lai Y. L., Hui K. L. (2006). "Internet opt-in and opt-out: investigating the roles of frames, defaults and privacy concerns," in Proceedings of the 2006 ACM SIGMIS CPR Conference on Computer Personnel Research. New York, NY: ACM, 253–263.
  • Lam L. W. (2012). Impact of competitiveness on salespeople's commitment and performance. J. Bus. Res. 65, 1328–1334. 10.1016/j.jbusres.2011.10.026
  • Li H., Wu J., Gao Y., Shi Y. (2016). Examining individuals' adoption of healthcare wearable devices: an empirical study from privacy calculus perspective. Int. J. Med. Inf. 88, 8–17. 10.1016/j.ijmedinf.2015.12.010
  • Li P., Cho H., Goh Z. H. (2019). Unpacking the process of privacy management and self-disclosure from the perspectives of regulatory focus and privacy calculus. Telematic. Inf. 41, 114–125. 10.1016/j.tele.2019.04.006
  • Li X. (2008). Third-person effect, optimistic bias, and sufficiency resource in Internet use. J. Commun. 58, 568–587. 10.1111/j.1460-2466.2008.00400.x
  • Liang H., Xue Y. L. (2010). Understanding security behaviors in personal computer usage: a threat avoidance perspective. J. Assoc. Inf. Syst. 11, 394–413. 10.17705/1jais.00232
  • Mao H., Bao T., Shen X., Li Q., Seluzicki C., Im E. O., et al. (2018). Prevalence and risk factors for fatigue among breast cancer survivors on aromatase inhibitors. Eur. J. Cancer 101, 47–54. 10.1016/j.ejca.2018.06.009
  • McLeod A., Dolezel D. (2022). Information security policy non-compliance: can capitulation theory explain user behaviors? Comput. Secur. 112, 102526. 10.1016/j.cose.2021.102526
  • Mohamed N., Ahmad I. H. (2012). Information privacy concerns, antecedents and privacy measure use in social networking sites: evidence from Malaysia. Comput. Hum. Behav. 28, 2366–2375. 10.1016/j.chb.2012.07.008
  • Mousavi R., Chen R., Kim D. J., Chen K. (2020). Effectiveness of privacy assurance mechanisms in users' privacy protection on social networking sites from the perspective of protection motivation theory. Decision Supp. Syst. 135, 113323. 10.1016/j.dss.2020.113323
  • Oh J., Lee U., Lee K. (2019). Privacy fatigue in the internet of things (IoT) environment. INPRA 6, 21–34.
  • Okazaki S., Li H., Hirose M. (2009). Consumer privacy concerns and preference for degree of regulatory control. J. Adv. 38, 63–77. 10.2753/JOA0091-3367380405
  • Ong A. D., Bergeman C. S., Bisconti T. L., Wallace K. A. (2006). Psychological resilience, positive emotions, and successful adaptation to stress in later life. J. Pers. Soc. Psychol. 91, 730. 10.1037/0022-3514.91.4.730
  • Pavlou P. A., Gefen D. (2005). Psychological contract violation in online marketplaces: antecedents, consequences, and moderating role. Inf. Syst. Res. 16, 372–399. 10.1287/isre.1050.0065
  • Petronio S. (2002). Boundaries of Privacy: Dialectics of Disclosure. Albany, NY: State University of New York Press.
  • Piper B. F., Lindsey A. M., Dodd M. J. (1987). Fatigue mechanisms in cancer patients: developing nursing theory. Oncol. Nurs. Forum 14, 17.
  • Prentice-Dunn S., Mcmath B. F., Cramer R. J. (2009). Protection motivation theory and stages of change in sun protective behavior. J. Health Psychol. 14, 297–305. 10.1177/1359105308100214
  • Price B. A., Adam K., Nuseibeh B. (2005). Keeping ubiquitous computing to yourself: a practical model for user control of privacy. Int. J. Hum. Comput. Stu. 63, 228–253. 10.1016/j.ijhcs.2005.04.008
  • Ravindran T., Yeow Kuan A. C., Hoe Lian D. G. (2014). Antecedents and effects of social network fatigue. J. Assoc. Inf. Sci. Technol. 65, 2306–2320. 10.1002/asi.23122
  • Revell T. (2019). Facebook must come clean and hand over election campaign data. New Scientist. Available online at: https://www.newscientist.com/article/mg24332472-300-face-book-must-come-clean-and-hand-over-election-campaign-data/ (accessed September 11, 2019).
  • Rogers R. W. (1975). A protection motivation theory of fear appeals and attitude change. J. Psychol. 91, 93–114. 10.1080/00223980.1975.9915803
  • Sheng N., Yang C., Han L., Jou M. (2022). Too much overload and concerns: antecedents of social media fatigue and the mediating role of emotional exhaustion. Comput. Hum. Behav. 139, 107500. 10.1016/j.chb.2022.107500
  • Su P., Wang L., Yan J. (2018). How users' internet experience affects the adoption of mobile payment: a mediation model. Technol. Anal. Strat. Manage. 30, 186–197. 10.1080/09537325.2017.1297788
  • Tang J., Akram U., Shi W. (2021). Why people need privacy? The role of privacy fatigue in app users' intention to disclose privacy: based on personality traits. J. Ent. Inf. Manage. 34, 1097–1120. 10.1108/JEIM-03-2020-0088
  • Tanner J. F., Hunt J. B., Eppright D. R. (1991). The protection motivation model: a normative model of fear appeals. J. Market. 55, 36–45. 10.1177/002224299105500304
  • Tian X., Chen L., Zhang X. (2022). The role of privacy fatigue in privacy paradox: a PSM and heterogeneity analysis. Appl. Sci. 12, 9702. 10.3390/app12199702
  • Tversky A., Kahneman D. (1974). Judgment under uncertainty: heuristics and biases. Science 185, 1124–1131. 10.1126/science.185.4157.1124
  • Wang L., Yan J., Lin J., Cui W. (2017). Let the users tell the truth: self-disclosure intention and self-disclosure honesty in mobile social networking. Int. J. Inf. Manage. 37, 1428–1440. 10.1016/j.ijinfomgt.2016.10.006
  • Wichstrom L. (1994). Predictors of Norwegian adolescents' sunbathing and use of sunscreen. Health Psychol. 13, 412–420. 10.1037/0278-6133.13.5.412
  • Wirtz J., Lwin M. O. (2009). Regulatory focus theory, trust, and privacy concern. J. Serv. Res. 12, 190–207. 10.1177/1094670509335772
  • Wu Z., Xie J., Lian X., Pan J. (2019). A privacy protection approach for XML-based archives management in a cloud environment. Electr. Lib. 37, 970–983. 10.1108/EL-05-2019-0127
  • Xiao L., Mou J. (2019). Social media fatigue - technological antecedents and the moderating roles of personality traits: the case of WeChat. Comput. Hum. Behav. 101, 297–310. 10.1016/j.chb.2019.08.001
  • Xu F., Michael K., Chen X. (2013). Factors affecting privacy disclosure on social network sites: an integrated model. Electr. Comm. Res. 13, 151–168. 10.1007/s10660-013-9111-6
  • Yoon C., Hwang J. W., Kim R. (2012). Exploring factors that influence students' behaviors in information security. J. Inf. Syst. Educ. 23, 407–415.
  • Youn S., Kim S. (2019). Newsfeed native advertising on Facebook: young millennials' knowledge, pet peeves, reactance and ad avoidance. Int. J. Adv. 38, 651–683. 10.1080/026504872019.1575109
  • Zhang Y., He W., Peng L. (2022). How perceived pressure affects users' social media fatigue behavior: a case on WeChat. J. Comput. Inf. Syst. 62, 337–348. 10.1080/08874417.2020.1824596
  • Zhou Y., Schaub F. (2018). "Concern but no action: consumers' reactions to the Equifax data breach," in Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems, Montreal, QC, 22–26.
  • Zhu Y. Q., Chang J. H. (2016). The key role of relevance in personalized advertisement: examining its impact on perceptions of privacy invasion, self-awareness, and continuous use intentions. Comput. Hum. Behav. 65, 442–447. 10.1016/j.chb.2016.08.048


Social Media, Privacy and Scams - 3 Recent Cases That Highlight the Need to Take Care


Arizona Facebook Scammers

In the first case, an Arizona woman was jailed for six years for masterminding a tax rebate scheme in which she and her accomplices used Facebook data to find and target people for identity theft.

It works like this:

  • Scammers search through Facebook for data on likely targets - in this case, they targeted unemployed people in their local region
  • Scammers then contact the targets using information they've gleaned from those Facebook accounts, saying they're from a government agency or similar that's seeking to help them out (in this case, they claimed to be working on an economic stimulus program)
  • The scammers then push for more personal information from the targets, details they can then use to make a tax claim on their behalf, netting all profits in the process

In the Arizona scamming case, the perpetrators were able to obtain 'sensitive information' from several dozen people, which they then used to make false tax claims. They targeted unemployed people to reduce the risk of detection - in many cases, the victim isn't aware their information has been stolen until they go to submit their own tax return.

This type of identity theft is on the rise. By using the personal information available via Facebook profiles, clever scammers are able to present very authentic dialogues, suggesting they are, in fact, officials who have access to your personal records and can be trusted with your data. They'll often use qualifying questions the way official outlets would - something like "can I just ask you a few questions to confirm your identity?" - then run through a list of questions whose answers they've already established from your social profiles. With the specific data points they need in hand, and an offer on the table like an economic stimulus program, you can see how unemployed people who might need a few extra bucks could be duped into handing over their data (then again, who doesn't need a few extra bucks?).

Key lesson: The strict rule of thumb highlighted by this case is that you should not give out detailed personal information to anyone without a thorough understanding of who they are and where they're from. Approach any such request with a high degree of skepticism - as with Nigerian email scams, if it sounds too good to be true, it most probably is.

F1 Driver Robbed

A big news story circulating earlier this week was that Formula One driver Jenson Button had been robbed in St Tropez. Button and his wife, Jessica Michibata, were staying in a rented holiday villa, which thieves broke into and cleaned out, taking, amongst other things, Michibata's $388k wedding ring.

While speculation around the case suggested that Button and Michibata were in the villa when it was robbed (some reports claimed the thieves had flooded the room with sleeping gas, though that suggestion has since been debunked), Amanda Connolly at The Next Web offered a different perspective on the theft: maybe the robbers knew exactly where Button and Michibata were, and used that knowledge to decide when to strike.

How could they possibly have known the couple's location? Michibata was broadcasting it via her Instagram account, with all her images tagged to locations via the photomap feature.


" A lot of people aren't aware that even if you're not opting for the 'Name this Location' option when you upload to Instagram, it's still possible that the app has added your snaps to the stalkers dream that is the photomap."

While it's unlikely that this is how the thieves determined the best time to target the couple, it again highlights an important privacy concern all social media users should be aware of - anything you post publicly that includes your location also, inadvertently, shows anyone who might want to target you or your belongings where you are at any given time. The same goes for images posted inside your home - if you post a picture of your latest DIY project and there's a glimpse of your expensive home studio in the background, you might just be making yourself a target.
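To make that risk concrete: location doesn't only leak through in-app tagging features like Instagram's photomap - a photo file itself often carries GPS coordinates in its EXIF metadata (many platforms strip this on upload, but originals shared via email, cloud links or some forums frequently retain it). The following is a minimal, illustrative sketch - not drawn from any of these cases - showing how easily anyone can read those coordinates from a shared image, using Python with the Pillow library; the filename holiday_snap.jpg is hypothetical, and the snippet assumes a JPEG taken with location services enabled.

# Minimal sketch: read GPS coordinates embedded in a photo's EXIF data.
# Assumes a JPEG taken with location services on; requires the Pillow library.
from PIL import Image
from PIL.ExifTags import GPSTAGS

def exif_gps(path):
    exif = Image.open(path).getexif()
    gps_ifd = exif.get_ifd(0x8825)  # 0x8825 is the standard GPSInfo tag
    return {GPSTAGS.get(tag, tag): value for tag, value in gps_ifd.items()}

def to_degrees(dms, ref):
    # EXIF stores coordinates as (degrees, minutes, seconds) rationals;
    # a south or west reference flips the sign.
    deg = float(dms[0]) + float(dms[1]) / 60 + float(dms[2]) / 3600
    return -deg if ref in ("S", "W") else deg

gps = exif_gps("holiday_snap.jpg")  # hypothetical filename
if gps:
    lat = to_degrees(gps["GPSLatitude"], gps["GPSLatitudeRef"])
    lon = to_degrees(gps["GPSLongitude"], gps["GPSLongitudeRef"])
    print(f"This photo was taken at roughly {lat:.5f}, {lon:.5f}")

If the snippet prints coordinates, the photo is quietly broadcasting where it was taken - stripping EXIF data before sharing, or disabling location services for your camera, closes that channel.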

While you don't want to be too paranoid about such risks - ideally, we'd all have the freedom to post what we want to our social accounts - it's worth taking the time to consider what information you're presenting in a public forum.

Key lesson: Review all your social media privacy settings - especially location tagging - and set them to your personal requirements, so you're only sharing potentially sensitive information with people you know and trust. Also avoid posting content that signals you're out of the house for an extended period of time.

30 Years Jail for Facebook Posts

The last issue of note was the sentencing of a man in Thailand to 30 years in jail for insulting the Thai monarchy via Facebook. The 48-year-old was sentenced in Bangkok's military court under a rule known as lèse-majesté ('injured majesty'), under which anyone convicted of insulting the king, queen, heir or regent faces up to 15 years in prison on each count. The man admitted to defaming the monarchy in six separate Facebook posts - the original sentence of 60 years was reduced to 30 on his admission of guilt.

While few nations have such strict punishments for content posted to social media, the lesson remains - always take a moment to consider everything you're posting to your social media accounts. Whether it's a joke meme about being drunk or a naked selfie uploaded to Snapchat, all such actions are recorded and tracked, and can be attributed back to you, which may count against you in certain cases. Social media data is being used in an increasing number of applications - just this week, Facebook obtained a patent that could help lenders discriminate against certain borrowers based on the borrowers' social network connections. While that may seem like overstepping the privacy mark, given the availability of such information, it only makes sense that businesses would use it to improve their processes.

This is particularly relevant for those looking for work - while employers need to apply some level of common sense in assessing candidates based on their social media activity, recruiters are, most definitely, judging candidates on that content. If you're posting about how you 'hate your job', about getting wasted every night, or about engaging in criminal activity, that content will be used in any assessment of you - and while you might think not all employers are doing this, research shows a growing majority are.

Key lesson: Always be aware of what you're posting on social media. If you're posting offensive, incriminating or otherwise poor-taste content, that material could be used against you in later assessments. This is going to become an increasingly prominent factor in employment decisions and recruitment strategies - in one sense, this is a good thing, as it means people will be matched to jobs based on more data points than gut feel and on-paper assessment alone. But it also means you need to think before you post - which is a great rule of thumb either way.

As our world becomes more connected through social media and online communication, so too are we more exposed to misuses of our data and to judgements based on the content we publish. For the most part, this shouldn't be a major concern - the benefits of social media connectivity far outweigh the negative implications on most fronts - but it is something we all need to be aware of. Stories like these remind us of the need to be vigilant and responsive to potential issues as we go about our daily digital interactions.



Regulator battles for data privacy: major cases against SARS, SAPS and IEC

From the State Security Agency (SSA) to the South African Revenue Service (SARS) and the Independent Electoral Commission (IEC), the Information Regulator (IR) had its hands full this past year, monitoring and enforcing compliance by public and private bodies with the provisions of the Promotion of Access to Information Act (PAIA) and the Protection of Personal Information Act (POPIA).

The IR operates as an independent body, accountable to the National Assembly and governed by the law and the Constitution. Its primary mandate is to ensure that public and private entities comply with PAIA and POPIA.

PAIA grants individuals the legal right to access information held by public and private bodies. It aims to balance the right of access to information with the need to protect sensitive information, including personal privacy and national security.

POPIA, designed to protect personal information processed by organisations, sets out specific conditions for lawful data-handling. The aim is to establish minimum standards for processing personal information, ensuring that individuals’ privacy rights are respected and safeguarded.

The IR plays a critical role in enforcing these standards, overseeing public and private sector compliance with these laws.

At a media briefing last week, Advocate Pansy Tlakula, the chairperson of the IR, provided updates on investigations into PAIA- and POPIA-related complaints from organised groups and individuals since the beginning of this financial year (April 2024).

State Security Agency

One prominent case involves the SSA. On 2 August, the IR issued an enforcement notice directing the SSA to release certain records, following a complaint lodged by an investigative journalist from Daily Maverick.

The journalist had requested information in June 2022 about SSA’s expenditure from 2015 to 2019, specifically relating to services procured from the African News Agency. The request sought detailed descriptions of the goods and services rendered, along with proof of deliverables.

However, the SSA failed to respond to the request within the legally required timeframe. As a result, the lack of response was deemed a refusal under PAIA.

“SSA attended to the matter after the prescribed time frame when it issued a refusal to grant access to the records, a response that was also deemed too late,” Tlakula explained.

Following an investigation and review by the IR’s Enforcement Committee, the Regulator concluded that the SSA had not provided sufficient grounds for withholding the records.

Tlakula said the agency failed to prove that releasing the information would impede justice, expose a confidential source, or compromise national security. Consequently, the IR issued an enforcement notice instructing the SSA to disclose the requested records.

The SSA has decided to challenge the Regulator’s decision in court.

Social media giants – X, Meta and Google

Still focused on PAIA infringements, Tlakula highlighted an ongoing investigation involving social media giants X, Meta, and Google. The case stems from a complaint requesting access to records on the classification of elections, risk assessments concerning South Africa’s electoral integrity, and how global policies are applied locally within these companies.

“The entities’ refusal of access to the records is based on the general presumption that PAIA does not apply extraterritorially to these private bodies, despite them conducting business in South Africa,” Tlakula explained.

The Regulator has accepted the complaints, and all three cases are under investigation.

South African Revenue Service

Another high-profile case under review involves SARS. A complaint was lodged following the SARS Commissioner's refusal to grant access to former president Jacob Zuma's tax returns for the years 2010 to 2018.

Tlakula confirmed that “the investigation into this matter is at an advanced stage”.

Sibanye-Stillwater

The primary purpose of the IR’s Enforcement Committee is to investigate complaints, assess breaches of the law, and recommend appropriate actions, such as issuing enforcement notices or fines.

Among the cases referred to the Enforcement Committee is one involving Sibanye-Stillwater, a global precious metals mining company, and the Department of Mineral Resources and Energy.

The IR received a complaint from a human rights organisation against the mining company.

The complaint concerned a request for access to the annual compliance reports submitted by Sibanye-Stillwater to the department in respect of social and labour plans for the Eastern and Western platinum mines.

“The annual compliance report to which the complainant requested access related to progress on community projects as part of their licensing requirements. The investigation report has been finalised and is being considered by the Enforcement Committee,” Tlakula said.

Gauteng Department of Health

Among the matters settled through mediation is a complaint the IR received from an investigative journalist who had written several articles about allegations of fraud and corruption at Gauteng hospitals.

Tlakula said it is alleged that the fraud and corruption had resulted in the assassination of the whistleblower Babita Deokaran. The complaint lodged with the regulator was against the decision of the head of the Gauteng Department of Health to refuse access to records relating to scheduled payments to suppliers.

After receiving the complaint, the regulator set up a meeting with the head of the Gauteng Department of Health, who agreed during the meeting to release the requested records.

Blouberg Municipality

Turning to enforcement notices related to POPIA matters, the IR said it had issued four since April this year - against Blouberg Municipality, Lancet Laboratory, the IEC, and WhatsApp LLC.

In the case of Blouberg Municipality, Tlakula explained that the notice related to the unlawful processing of a former employee's personal information, which was exposed on the internet after she submitted a declaration of interest containing it.

Lancet Laboratory

The enforcement notice against Lancet Laboratory, a diagnostic service provider, was issued as a result of a compliance assessment, “which was necessitated by the number of security compromises that they had experienced”, Tlakula said.

According to Tlakula, the company failed to comply with the notification requirements in terms of POPIA.

“The company had also failed to notify the data subject affected by the security compromise within a reasonable time,” she said.

Independent Electoral Commission

An enforcement notice was issued to the IEC as a result of a security compromise that occurred just before the 2024 national and provincial elections.

Tlakula said this resulted in the candidate nomination lists of the African National Congress (ANC) and the Umkhonto we Sizwe (MK) party being shared on various social media platforms.

“We initiated an assessment of their security systems on the safeguarding of personal information that they processed, and we found that they did not have adequate access control measures to protect the confidentiality of personal information in their possession,” she said.

In addition, Tlakula said, the IEC's section 22 notification, which is required to inform affected data subjects of a security compromise, was found to be inadequate.

WhatsApp LLC

The IR’s preliminary assessment in terms of POPIA has revealed notable discrepancies in WhatsApp’s terms of service and privacy policies, with distinct differences between those applied to users in the European region and those for users outside Europe, including South Africans.

“The privacy safeguards for users in the European region appeared to be better than those for users in South Africa, even though the General Data Protection Regulation (GDPR) and POPIA have similar standards and protections,” remarked Tlakula.

In response, the IR has issued an enforcement notice directing WhatsApp LLC to adhere to all conditions for the lawful processing of personal information. This includes updating their privacy policy, conducting a personal information impact assessment, and complying with the provisions of PAIA, particularly regarding the maintenance of documentation for all processing operations under its responsibility.

“In this regard, the regulator dismissed WhatsApp’s argument that PAIA does not apply to it as a social network which is extraterritorial,” Tlakula added.

Read: Regulator pushes for stronger PAIA enforcement amid low compliance by public bodies

South African Police Service

The IR is currently investigating a complaint regarding alleged interference with the protection of personal information by the South African Police Service (SAPS).

Tlakula explained that the personal information in question was processed by SAPS during a criminal investigation and was subsequently disseminated through WhatsApp messages.

“Due to the sensitivity of the case and considering that this is a similar matter where personal information was leaked, the Regulator has embarked on an own-initiative investigation into the alleged interference with personal information of data subject,” Tlakula stated.

This issue has now been referred to the Enforcement Committee for further action.

