
VR Motion Data Privacy and Re-identification Risks

Julia A. Scott, Ph.D., Bhanujeet Chaudhary, Aryan Bagade

1. Study Description

This study, titled “Measuring Distractions in Extended Reality,” investigates the interaction between virtual and real objects in a mixed-reality environment and its impact on cognitive load and task performance. The study explores privacy risks, ethical concerns, and technical challenges associated with motion data collection and participant re-identification in virtual reality (VR) environments.

More background information on motion data may be found in the Appendix.

Research Objective

The main objective is to assess how users manage real and virtual objects simultaneously in the context of solving a virtual puzzle. By analyzing these interactions, the study aims to:

  1. Measure cognitive load, task performance, and levels of distraction.
  2. Determine privacy risks associated with motion tracking data and its potential for re-identification.
  3. Investigate the implications of VR data usage in research, focusing on ethical standards, informed consent, and participant data security.

Key research questions include:

  • How does interaction with real and virtual objects affect cognitive load and task efficiency?
  • Can VR motion data be used to re-identify users, despite anonymization efforts?
  • What are the ethical and privacy challenges associated with reanalyzing VR motion data?

Findings from Previous Research

Despite anonymization efforts, machine learning models demonstrate a high probability of re-identifying individuals based solely on their motion data. Prior studies demonstrated that motion tracking data collected in VR environments can be used to identify individuals with over 95% accuracy using less than five minutes of tracking data. This high accuracy of re-identification was observed across different tasks, whether participants were engaged in simple viewing tasks or more interactive tasks involving hand controllers.

Ethical Challenges

The open-ended query for this case study seeks to explore the potential risks associated with body data, such as motion tracking, in research settings, including safety protocols, user experiences, and data management.

  1. How does this research pose ethical challenges, particularly in terms of participant anonymity and machine learning?
  2. Where do you suspect there are vulnerabilities in protecting participants, given the data collected in the study?

Collect your group's thoughts on the Open Inquiry Notes

2. Technical Description

Participant Data from Surveys

  • Presence: Survey data on ‘presence’ reflects participants’ perception of realism and immersion within the VR environment. Higher presence scores typically indicate a deep engagement, where participants might interact more intuitively with virtual objects. This data helps assess whether a heightened sense of immersion correlates with unique motion behaviors, potentially impacting re-identification risks. 
  • Task Load: Task load data offers insights into the cognitive effort required for each VR task, helping to identify patterns in user movements associated with high or low task demands. Higher task load might correlate with specific motion patterns, such as increased head movements or changes in hand gestures, that could impact privacy and re-identification likelihood. 
  • User Experience: User experience data, including enjoyment, frustration, and comfort, helps gauge the overall effectiveness of VR interactions and identify any usability issues that could influence motion patterns. Behavioral insights from user experience can reveal if specific emotional or comfort levels lead to identifiable physical responses, such as changes in posture or gesture frequency. 
  • Demographics: Demographic information, such as age, gender, and VR experience, provides a basis for analyzing differences in motion patterns across diverse participant groups. For example, prior VR experience may make some participants more efficient in VR interactions, potentially reducing identifiable gestures. 

Participant Data from Motion Tracking

The study utilized VR headsets (Meta Quest 3) and controllers to gather motion data from participants, capturing six degrees of freedom (6DOF), which includes both positional (X, Y, Z) and rotational (yaw, pitch, roll) data for head and hand movements (Figure 1). For instance, five minutes of motion data results in approximately 0.43 MB (20 data points per second × 72 bytes per point × 300 seconds = 432,000 bytes), allowing for a precise and comprehensive understanding of user interactions within the virtual space. The coordinates are used to generate a path reconstruction, as demonstrated here in a handwriting task.


Figure 1. Two-dimensional representation of VR controllers in space, based on position and orientation from inertial measurement units (IMUs). The IMU data is essential for the functionality of user interactions in VR. Source: Mark Miller
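To make these numbers concrete, the sketch below estimates the storage footprint of one session and shows how positional samples could be projected into the kind of 2D path shown in Figure 1. It is illustrative only: the per-frame layout (roughly three tracked devices × 6 values × 4-byte floats) is an assumption, while the 20 Hz rate, 72 bytes per point, and 300-second duration come from the description above.

```python
from dataclasses import dataclass
from typing import List, Tuple

# One 6DOF sample for a single tracked device (head or a controller).
# Assumption: 72 bytes per data point corresponds roughly to three tracked
# devices x 6 values x 4-byte floats; the actual on-disk encoding may differ.
@dataclass
class Pose:
    x: float
    y: float
    z: float
    yaw: float
    pitch: float
    roll: float

SAMPLE_RATE_HZ = 20      # data points per second (from the study description)
BYTES_PER_POINT = 72     # per-point size cited above
SESSION_SECONDS = 300    # five-minute session

session_bytes = SAMPLE_RATE_HZ * BYTES_PER_POINT * SESSION_SECONDS
print(f"Estimated session size: {session_bytes / 1e6:.2f} MB")  # ~0.43 MB

def path_2d(samples: List[Pose]) -> List[Tuple[float, float]]:
    """Project positions onto the X-Y plane for a 2D path reconstruction."""
    return [(p.x, p.y) for p in samples]
```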

Data Analysis

By integrating survey insights with motion data in deep learning models, the study seeks to understand how factors like task complexity or emotional response influence physical interactions in VR. This behavioral data link provides a deeper understanding of how typical VR experiences might inadvertently reveal identifiable characteristics. For example, characteristics of object interactions in the puzzle task may distinguish between right- and left-handed participants or between male and female participants. The analysis can reveal fundamental vulnerabilities that would inform data processing strategies to reduce the risk of re-identification.
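As a hedged illustration of how such a behavioral link (e.g., handedness) might be probed, the sketch below fits a simple classifier to invented per-participant summary features. The feature names, synthetic data, and linear model are assumptions for illustration only; they are not the study's actual deep learning pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical per-participant summary features derived from 6DOF traces:
# [right-/left-hand path length ratio, mean head height (m), mean hand speed (m/s)]
# -- invented for illustration only.
rng = np.random.default_rng(0)
n = 38  # matches the study's participant count
X = rng.normal(loc=[1.2, 1.6, 0.4], scale=[0.3, 0.1, 0.1], size=(n, 3))
y = (X[:, 0] > 1.2).astype(int)  # stand-in label: 1 = right-hand dominance

# A linear model is enough to show how readily such labels can separate;
# the study itself describes deep learning models on richer motion data.
clf = LogisticRegression()
scores = cross_val_score(clf, X, y, cv=5)
print(f"Cross-validated accuracy: {scores.mean():.2f}")
```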

Technical Analysis

  1. Which data types pose privacy concerns? Are these conventionally considered sensitive data?
  2. Does applying machine learning to neutral activity (e.g., puzzles) change the nature of the data?

Collect your group's thoughts on the Technical Analysis Notes

3. Risk Analysis

Ethical Review Considerations

The study protocol was reviewed by the university's Institutional Review Board (IRB), which categorized the study under Behavioral Science. This determined the scope of the review and the committee composition. The IRB protocols included:

  1. Expedited Review: Given the minimal safety risk of the VR study and the non-personal data collected, the IRB applied an expedited review.
  2. Informed Consent: Participants were provided with detailed informed consent forms, which included options for audio and video recording during the study. These forms were designed to ensure that participants fully understood the nature of the study, the risks involved, and how their data would be used and stored.
  3. Participant Welfare: In the study, the welfare of participants was monitored by verbal questions by the experimenter. Participants were reminded they could withdraw at any time.
  4. Compensation: Participants were given $15 at the completion of the study.
  5. Data Management Plan: All collected data, including video recordings of participant interactions, were securely stored on a password-protected university server. The protocol mandated that all local copies of the data be deleted after successful upload to ensure data security.

Consent Process

All participants provided informed consent explicitly permitting the collection and use of their motion tracking data for this study. The consent form was designed to communicate the scope of data collection and potential risks transparently. It detailed how the data would be used within the study and outlined the possibility of future data repurposing, as advancements in analysis techniques may allow additional insights to be drawn from the data. Participants were informed of the potential risks of re-identification through machine learning, especially as VR motion data can contain unique movement patterns that are challenging to fully anonymize.

In compliance with ethical guidelines, participants were given the choice to opt out of future data analysis, ensuring they could retain control over their data beyond the initial study. Of the 38 participants, none opted out of future use of data and three declined video recordings.

(Image: itemized consent form with separate checkboxes, including "I give consent to be video recorded during this study"; "I give consent for video recordings resulting from this study to be used for behavioral coding"; and "I give consent for motion data recordings (not video) resulting from this study to be used for future research by the research team." Source: Mark Miller)

Figure 2. Itemized informed consent. Participants reserved the right to agree to each of the terms separately and still participate in the study. Source: Mark Miller

Data Management Protocols

The study group described their privacy protections as follows:

"Motion data collected by VR has the potential to be re-identified like someone's face or hand. The motion data will not be shared outside of the research team. No minors will have the option to allow re-use of their data; adult participants that do not consent to future use of the same data will have their data destroyed once the research is published. Additionally, the tracking telemetry programs enabled by default on the development platform will be disabled."

To address the privacy concerns associated with VR motion tracking data, the study implemented data management protocols designed to anonymize and secure all collected information. Key measures included:

  • Data Anonymization: Motion data underwent an anonymization process to remove identifiable links, such as participant IDs or timestamps. However, the study acknowledged that current anonymization techniques may not fully prevent re-identification, particularly when patterns unique to an individual’s movement are present. Therefore, additional steps were taken to minimize identifiable data properties (a minimal recoding sketch appears after this list).
  • Encrypted Storage: All motion tracking data was stored on encrypted servers with restricted access, ensuring that only authorized personnel could view or analyze the data. Data encryption helps protect against unauthorized access, adding a layer of security against data breaches or leaks.
  • Periodic Data Audits: The study conducted regular data audits to monitor data integrity, review access logs, and ensure adherence to privacy protocols. Audits help maintain compliance with security policies and identify any potential vulnerabilities in data storage or access.
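The sketch below illustrates the kind of recoding step referenced in the anonymization bullet above. The salted-hash study ID scheme and field names are assumptions, not the study's documented procedure, and this step alone does not address re-identification from movement patterns themselves.

```python
import hashlib
import secrets

# A per-study secret salt kept separately from the data; without it, the
# mapping from participant ID to study ID cannot be reversed by simple lookup.
STUDY_SALT = secrets.token_hex(16)

def to_study_id(participant_id: str) -> str:
    """Recode a direct identifier to an opaque study ID (assumed scheme)."""
    digest = hashlib.sha256((STUDY_SALT + participant_id).encode()).hexdigest()
    return f"P{digest[:8]}"

def strip_identifiers(record: dict) -> dict:
    """Drop direct identifiers and absolute timestamps from a motion record."""
    cleaned = {k: v for k, v in record.items()
               if k not in {"participant_id", "email", "timestamp"}}
    cleaned["study_id"] = to_study_id(record["participant_id"])
    return cleaned
```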

Risk Assessment

  1. Does recoding participant identifiers to a Study ID sufficiently protect participant anonymity?
  2. Conventionally, identifying metadata about a participant (such as the street name of their residence) combined with less sensitive data (such as a diagnostic category) is considered sensitive because the likelihood of re-identification increases. Should motion tracking data in combination with other data types be treated in a similar manner?
  3. Was the informed consent as described sufficient to educate the participants prior to enrolling in the study? How?

Collect your group's thoughts on the Risk Assessment Notes

4. Conclusion

This study builds on previous research by extending the analysis to activities in VR that are not linked to a user account, highlighting the vulnerability of motion data to re-identification in typical VR use cases that generate distinct movement patterns. This is relevant to consumer privacy since technology companies have the ability to characterize and identify users by their motion data. The novelty of this research lies in the demonstration that motion data collected passively in non-sensitive contexts can still be used to accurately identify individuals.

The study underscores the need for advanced anonymization techniques—such as data perturbation or synthetic data generation—to further reduce the risk of re-identification in future research. Researchers recommend exploring these methods for subsequent studies to enhance participant privacy without compromising data quality.

Mitigation Approach

Table 1. Mitigation strategies by risk category. Source: Scott, Chaudhary, Bagade, 2024.

  • Data Privacy — Description: handling sensitive motion data. Stakeholder responsible: Researchers/PI. Mitigation strategies: anonymization, data perturbation, encrypted storage.
  • Data Use — Description: current and future use. Stakeholder responsible: PI/IRB. Mitigation strategies: report planned and potential analytics.
  • Informed Consent — Description: clearly communicating data privacy risks. Stakeholder responsible: PI/IRB. Mitigation strategies: itemized data usage descriptions.

Mitigation

  1. What did the researchers and IRB do to protect participants' anonymity? Would you recommend additional steps or a different approach?
  2. Do you think machine learning studies for unique classification (N of 1) with VR data are human subjects research? Why?

Collect your group's thoughts on the Mitigation Strategy Notes

Acknowledgments

The research team would like to acknowledge the Illinois Institute of Technology research group that shared their study materials and consulted for this project.

Appendix

Data Utilization for Motion and Behavior Analysis

Motion tracking data, specifically 6DOF data, provides rich insights into participant behaviors, capturing attributes such as speed, smoothness, gait, and even subtle breathing patterns through path tracing from positional data. These variables help researchers analyze cognitive load and task performance by observing how participants manage real and virtual objects. Such data is essential in applications like:

  • Performance Diagnostics: Data is valuable for evaluating cognitive efficiency and motor precision, particularly relevant for user experience research and VR platform improvements.
  • Health Assessments: Patterns in motion can be utilized to assess motor functions or conditions that affect movement, contributing to diagnostics for neurological or motor disorders.
  • Human Interaction and Robotics Modeling: These insights aid in modeling human interactions within VR, useful for applications in robot training, virtual human simulation, and rehabilitation scenarios.

Data Privacy Risks

  • Re-identification: Motion data inherently contains unique behavioral patterns that, when analyzed, can be linked back to individual users. Despite anonymization efforts, machine learning models have demonstrated a high probability of re-identifying users based on these motion patterns. Studies, including Nair et al., 2024, indicate that less than five minutes of motion data can re-identify a participant within a dataset with over 95% accuracy. This risk is amplified by the potential for inference, where user behavior could reveal sensitive details about cognitive states or physical characteristics. In a follow-up study, as little as 30 seconds of tracking data during a structured activity enabled re-identification with an accuracy of 98% within a dataset of 500 participants. The high accuracy stems from the distinctive, habitual patterns present in an individual’s movements, such as unique gestures, walking styles, or interaction techniques, which persist even in anonymized datasets, making re-identification a significant privacy risk in VR research.
  • Data Sharing Between Labs and Organizations: When motion data is shared between research institutions, organizations, or technology partners, the risk of privacy breaches escalates. Anonymization techniques may degrade over time or when datasets are combined, making it possible for researchers to re-identify participants, either intentionally or inadvertently. Without robust protocols to control access, maintain data integrity, and enforce strict anonymization standards, participants may face exposure to privacy violations. Collaborative research agreements should include clauses that specify acceptable data handling, secure storage requirements, and restrictions on re-identification practices.
  • Future Analysis Risks: Future analysis of VR motion data—especially as machine learning and AI capabilities evolve—poses risks beyond the initial study context. Motion data that appears harmless today could potentially be used to infer sensitive personal information, such as emotional states, health conditions, or physical disabilities. Given the rapid advancements in technology, data re-use may lead to unintended and unauthorized insights into participants’ personal lives, creating long-term privacy implications. Ethical research practices must consider mechanisms for participants to review or revoke consent for future data analysis.
  • Platform and Device-Dependent Terms of Use and Data Policies: The devices and platforms used to collect VR motion data come with terms of service that may conflict with privacy guarantees offered by researchers. For example, Meta’s data policy allows extensive use of user-generated content by both the company and third-party affiliates. Such terms may undermine participants’ privacy expectations, as platform policies often permit broad data sharing, retention, and secondary use that extend beyond the research’s initial intent. Researchers must transparently communicate these terms to participants and ensure that platform/device data policies are aligned with ethical research practices to safeguard participant privacy effectively.

Data Management

Data Storage. Due to the identifiable nature of 6DOF data, this motion data should be stored on encrypted servers with restricted access. An effective approach may involve storing positional and rotational data in separate databases, only combining them for analysis within controlled environments. Additionally, hashing data sequences can allow researchers to analyze specific patterns without exposing raw data.
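A minimal sketch of the sequence-hashing idea mentioned above, assuming 6DOF samples are held as a NumPy array; the window length and quantization precision are illustrative choices, not values from the study. Quantizing before hashing means near-identical movement patterns map to the same digest, so repeated patterns can be counted or compared without retaining the raw trajectory.

```python
import hashlib
import numpy as np

def hash_motion_window(window: np.ndarray, precision: float = 0.05) -> str:
    """Hash a quantized window of 6DOF samples (shape: frames x 6)."""
    quantized = np.round(window / precision).astype(np.int32)
    return hashlib.sha256(quantized.tobytes()).hexdigest()

# Example: 20 Hz data split into 2-second windows (fake samples for illustration).
trace = np.random.default_rng(1).normal(size=(600, 6))  # 30 s of samples
digests = [hash_motion_window(trace[i:i + 40]) for i in range(0, 560, 40)]
```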

Data Manipulation. Augmentation and noise-injection techniques help protect participant anonymity without compromising the behavioral data’s integrity. Suggested methods include the following (a minimal sketch of the first two appears after the list):

  • Noise Injection: Minor, randomized alterations to X, Y, Z positional data or yaw, pitch, roll rotational data can disrupt re-identification algorithms while maintaining study usability.
  • Data Subsampling: Recording data at a lower frequency (e.g., 10 Hz instead of 20 Hz) can reduce detailed tracking of unique movements.
  • Synthetic Data Generation: Machine-generated versions of real participant data can be used to preserve behavioral analysis insights while masking identifiable traits.
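A minimal sketch of the first two methods, assuming 6DOF samples are held as a NumPy array with columns x, y, z, yaw, pitch, roll; the noise scales and sampling rates here are illustrative defaults, not the study's parameters.

```python
import numpy as np

def inject_noise(samples: np.ndarray, sigma_pos: float = 0.01,
                 sigma_rot: float = 0.5) -> np.ndarray:
    """Add small Gaussian noise to 6DOF samples (x, y, z, yaw, pitch, roll).

    Noise scales (1 cm positional, 0.5 degrees rotational) are illustrative;
    appropriate values depend on the analyses the data must still support.
    """
    noisy = samples.copy()
    noisy[:, :3] += np.random.normal(0, sigma_pos, noisy[:, :3].shape)
    noisy[:, 3:] += np.random.normal(0, sigma_rot, noisy[:, 3:].shape)
    return noisy

def subsample(samples: np.ndarray, source_hz: int = 20, target_hz: int = 10) -> np.ndarray:
    """Keep every Nth frame to lower the effective sampling rate."""
    step = source_hz // target_hz
    return samples[::step]

# 60 seconds of fake 6DOF data at 20 Hz, subsampled to 10 Hz with noise injected.
trace = np.random.default_rng(2).normal(size=(1200, 6))
protected = subsample(inject_noise(trace))
```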

References

Miller, Mark Roman, Fernanda Herrera, Hanseul Jun, James A. Landay, and Jeremy N. Bailenson. “Personal Identifiability of User Tracking Data during Observation of 360-Degree VR Video.” Scientific Reports 10, no. 1 (2020).

Nair, Vivek, Wenbo Guo, Justus Mattern, Rui Wang, James F. O’Brien, Louis Rosenberg, and Dawn Song. “Unique Identification of 50,000+ Virtual Reality Users from Head & Hand Motion Data,” 895–910. Anaheim, CA: USENIX Association, 2023.

Nair, Vivek, Louis Rosenberg, James F. O’Brien, and Dawn Song. “Truth in Motion: The Unprecedented Risks and Opportunities of Extended Reality Motion Data.” IEEE Security & Privacy 22, no. 1 (January 2024): 24–32.

Aug 18, 2025
