Privacy Policy

Dated: November 15, 2025

Version: 1.0


Introduction & Purpose of Policy

This Privacy Policy ("Policy") describes how otoearth, Inc. ("otoearth," "we," "our," or "us") collects, uses, stores, discloses, and safeguards information obtained from individuals who access or use our conversational data collection platform, participate in research activities, submit audio recordings, or otherwise interact with our websites, applications, or services (collectively, the "Platform"). The purpose of this Policy is to provide a clear explanation of the types of information we gather, the reasons why the information is collected, how it may be processed or shared, and the rights and choices available to participants. As part of our research mission, we collect conversational speech data and related information to support scientific, academic, and commercial development in the fields of speech processing and conversational artificial intelligence. This Policy is intended to help you make informed decisions regarding your participation and to understand how your data may be used both now and in the future.

Scope & Applicability

This Policy applies to all information collected by otoearth through the Platform, including information provided directly by participants, automatically collected data, audio recordings, metadata, demographic details, survey responses, and any other information generated or transmitted during your use of the Platform. This Policy applies regardless of whether you submit recordings, whether you choose to provide optional information, or whether you access the Platform as a registered participant or as a visitor. The Policy also governs information processed by third parties acting on our behalf, including hosting providers, data processors, analytics partners, and research collaborators who assist us in collecting, storing, analyzing, or distributing research datasets. This Policy does not apply to independent third-party websites, applications, or services that are not controlled by otoearth, nor does it apply to aggregated or de-identified datasets that no longer contain personal information and may be shared without restriction. Use of the Platform signifies your acknowledgment and agreement that your information will be processed in accordance with this Policy.

Categories of Information Collected

We may collect the following categories of information when you access, use, or participate in activities through the Platform:

How We Collect Information

We collect information through a combination of methods designed to support research, ensure Platform functionality, and improve user experience. Information is gathered directly from you when you create an account, provide optional demographic data, participate in a recorded session, complete surveys, or communicate with us. We also collect information automatically through your use of the Platform, including technical data transmitted by your device, browser, and internet connection. During recorded conversations, we collect audio data, session identifiers, timestamps, topic selections, and other metadata associated with the recording process. In addition, we may generate derived data (such as transcripts, embeddings, annotations, or quality-review notes) through automated systems, human reviewers acting under our authority, or machine learning tools used to prepare datasets. We may also obtain information from third-party vendors and service providers who support Platform operations, hosting, analytics, or storage, and who process information on our behalf under written agreements.

How We Use Information

We use the information collected through the Platform to operate, maintain, and improve the functionality and security of our systems; to facilitate participant communication and conversational matching; and to support our research and development activities. Specifically, we use your information to create, process, analyze, and maintain conversational datasets; to train, evaluate, and refine speech-processing and conversational AI models; to ensure compliance with study requirements; to maintain the integrity and reliability of recording sessions; and to support scientific, academic, and commercial projects that rely on natural conversational speech data. Technical and device information is used to detect anomalies, prevent fraud or misuse, troubleshoot errors, and maintain Platform performance. Optional demographic information may be used to study patterns in speech, support fair and representative model development, and contribute to de-identified datasets. We may also use your information to comply with legal obligations, enforce our Terms, communicate updates, or respond to user inquiries. De-identified and aggregated datasets may be used and shared for any lawful purpose without further notice or restriction.

Biometric Information Disclosure & Notice

Because audio recordings inherently contain unique vocal characteristics that may be classified as biometric identifiers or biometric information under certain state or local laws, we provide the following disclosure. When you participate in a recorded session, we collect and process your voice, speech patterns, accent, and other vocal attributes that may be used to identify you or distinguish you from other individuals. By using the Platform, you acknowledge and consent to the collection, storage, analysis, and processing of biometric information contained in your recordings for research, development, and data-processing purposes as described in this Policy and in the Research Consent documents.

otoearth may analyze biometric information to create transcriptions, embeddings, annotations, acoustic features, or other machine-readable representations used to train and evaluate AI models. Biometric information may be stored for the duration of our research needs and may be shared in de-identified or transformed form with research partners, institutions, or commercial entities in accordance with applicable law. We do not sell biometric identifiers, but de-identified data derived from biometric information may be licensed or made publicly available for research.

Where required by law, we will apply additional protections to biometric information, including purpose limitation, restricted access, and defined retention periods. By participating, you acknowledge that your consent to the collection and processing of biometric information is voluntary, informed, and may not be revocable with respect to data already processed, de-identified, or incorporated into datasets or models.

Public Release of De-Identified Data

We may publicly release de-identified or aggregated data derived from information collected through the Platform for the purpose of advancing scientific research, academic study, machine-learning innovation, and the development of speech and language technologies. Public releases may include, without limitation, de-identified audio recordings, transcripts, linguistic or acoustic features, annotations, demographic groupings, derived embeddings, statistical summaries, and other research-related datasets that support open collaboration and technological progress.

Prior to public dissemination, we implement measures intended to remove or obfuscate direct personal identifiers such as your name, contact information, account credentials, and other data that could reasonably be used to directly identify you. We may also apply technical de-identification techniques such as segmentation, cropping, redaction, temporal transformations, or the removal of metadata. While these measures are designed to reduce the likelihood of re-identification, no de-identification method can guarantee absolute anonymity, particularly in the context of voice data. Voice recordings may contain characteristics (such as timbre, accent, cadence, speech patterns, mannerisms, or personal references) that could be recognized by individuals familiar with you.

By participating in the Platform, you acknowledge and agree that your submitted recordings, related information, and derivative features may be included in de-identified or aggregated datasets that may be shared publicly, published, distributed, or licensed to third parties. Once such datasets are released publicly or shared with external researchers, institutions, or commercial entities, those releases are irrevocable, and it may not be possible to retrieve or delete your contributions from copies that have been accessed, downloaded, or incorporated into downstream research or systems.

Data Sharing & Third Parties

We may share information collected through the Platform with a range of third parties who assist us in research, data processing, infrastructure management, analytics, or technology development. These third parties may include cloud service providers; hosting vendors; data storage providers; academic institutions; nonprofit research groups; commercial partners; subcontracted analysts; machine-learning engineers; independent scientific collaborators; and organizations involved in the training, evaluation, or deployment of audio-based or conversational AI models.

Where personal or sensitive data is shared with third parties, we do so under written agreements that impose confidentiality obligations, restrict use to authorized purposes, and require the implementation of reasonable administrative, technical, and physical safeguards. These agreements may take the form of data processing agreements, subcontractor agreements, research collaboration agreements, or other binding documentation depending on the nature of the relationship.

We may also share de-identified recordings and derivative datasets (such as transcripts, linguistic features, acoustic parameters, embeddings, or aggregated demographic information) with research institutions, universities, technology companies, and the broader scientific community. These datasets may be shared under open licenses, restricted licenses, or collaborative research terms. Because de-identified and aggregated information does not reasonably identify an individual, its sharing may not be subject to the same limitations that apply to personal information.

In some circumstances, we may be required to disclose information to comply with law, regulation, subpoena, court order, national security request, or other lawful process. We may also share information if necessary to protect our legal rights, investigate potential violations of Platform rules, prevent harm, or respond to claims involving misuse of the Platform.

We do not sell personal information. However, we may license or distribute de-identified datasets, derivative works, or models trained on such data for research or commercial purposes. The privacy practices of third parties are governed by their own policies, and we are not responsible for their independent data-handling actions.

International Data Transfers

Because otoearth collaborates with research institutions, technical partners, and service providers located around the world, information collected through the Platform may be transferred to, stored in, or processed in jurisdictions outside of your state or country of residence. These jurisdictions may have data protection laws that differ from, and may offer less protection than, those in your home location. By participating in the Platform, you consent to the international transfer of your information, including audio recordings and biometric-derived features, for the purposes described in this Privacy Policy and related research materials.

Where required by applicable law, we implement appropriate safeguards to protect international data transfers. These safeguards may include standard contractual clauses approved by regulatory authorities, data-processing addenda, cross-border transfer agreements, and other legally recognized mechanisms designed to ensure an adequate level of protection. We may also rely on your explicit consent where permitted.

De-identified or aggregated data, including processed recordings, derived features, and research outputs, may be shared globally without restriction, as such data does not reasonably identify an individual. You acknowledge that once data has been de-identified or incorporated into research publications, AI models, datasets, or shared repositories, it may not be possible to retract, modify, or delete that information from downstream systems.

You further acknowledge that data transmitted over the internet, including during recording sessions, may pass through servers or networks located in multiple countries. While we implement reasonable safeguards to protect your information, no transmission or storage method is completely secure, and we cannot guarantee the security of information transferred outside your jurisdiction.

Data Retention

We retain information collected through the Platform (including audio recordings, transcripts, annotations, demographic information, technical logs, and derived features) for as long as necessary to fulfill the research, development, analytical, operational, or legal purposes outlined in this Privacy Policy and the related consent documents. Because this project involves the creation of long-term research datasets and machine-learning models, certain categories of information may be retained indefinitely, particularly where they have been de-identified, aggregated, or incorporated into scientific or commercial outputs that cannot feasibly be modified or withdrawn.

Personal information, such as account details or optional demographic information, may be retained for shorter periods and deleted or anonymized once it is no longer required for account management, study oversight, legal compliance, security purposes, or quality assurance. System logs, device metadata, and usage records may be preserved for audit, fraud prevention, security monitoring, or research integrity.

Submitted audio recordings and any derivative data generated from them (including transcripts, speech features, acoustic embeddings, model weights, labeled datasets, and analytical outputs) may be stored for extended periods to support reproducibility, research verification, model retraining, and ongoing scientific analysis. Once recordings or derived data have been de-identified, aggregated, published, or shared externally, they cannot be recalled or removed from downstream repositories.

We may also retain information as required by applicable laws, regulatory requirements, subpoenas, litigation holds, or legal obligations. When information is no longer required for any legitimate purpose, we may delete it, anonymize it, or store it in an archived and access-restricted form for documentation or compliance purposes.

Data Security Measures

We implement a combination of administrative, technical, and physical safeguards designed to protect the information collected through the Platform against unauthorized access, disclosure, alteration, loss, or destruction. These safeguards may include encryption of data in transit and at rest, multi-layer authentication controls, access restrictions based on job role, network segmentation, secure cloud storage, regular vulnerability assessments, and continuous monitoring of infrastructure for anomalous activity.

Access to recordings, transcripts, and research data is limited to authorized personnel, contractors, or researchers who require such access to fulfill their responsibilities. All such individuals are subject to confidentiality obligations and are expected to adhere to data-handling protocols, security best practices, and relevant regulatory requirements.

Despite our efforts to maintain a secure environment, no method of transmission or storage, whether digital or physical, can guarantee absolute security. Voice recordings inherently contain biometric characteristics and personal traits that cannot be fully anonymized prior to processing, and the nature of internet communication means data may transit through multiple networks or jurisdictions. You acknowledge that participating in the study involves certain unavoidable security risks.

In the event of a data breach that creates a material risk to your personal information and triggers legal notification obligations, we will provide notice in accordance with applicable laws. Such notice may include details regarding the nature of the breach, categories of affected information, steps taken in response, and recommended actions you can take to protect yourself.

Cookies & Tracking Technologies

We and our service providers may use cookies, web beacons, pixels, local storage objects, session identifiers, analytics tools, and similar tracking technologies ("Cookies") to support the operation, functionality, and security of the Platform. These technologies enable us to maintain sessions, authenticate users, prevent fraud or misuse, analyze system load and performance, and understand how participants interact with the Platform's features, such as recording tools, topic selection, and account management.

Some Cookies are essential for the Platform to function and cannot be disabled without affecting your ability to use core features. These may include authentication cookies, load-balancing cookies, security tokens, and cookies that store your recording session state. Other Cookies are used for analytics, research optimization, and performance monitoring; these help us improve our systems, enhance the recording experience, and support model development and quality assurance efforts.

We may use third-party analytics services (for example, cloud-hosted diagnostics or usage telemetry) to collect aggregated or pseudonymized information about user interactions, device characteristics, browser types, IP addresses, timestamps, feature usage, and error events. These third parties may process information in accordance with their own privacy policies, and we implement contractual safeguards where required.

You may be able to manage or disable certain Cookies through your browser settings; however, disabling some Cookies may impair functionality, prevent you from accessing certain features, or disrupt recording sessions. By using the Platform, you consent to the use of Cookies and tracking technologies as described in this Privacy Policy.

Human Review & Quality Assurance

Certain aspects of the Platform's processing, including the review, labeling, evaluation, or validation of submitted audio recordings, may involve limited human review. Human review is used to ensure data quality, monitor system performance, correct transcription or processing errors, verify appropriate use of the Platform, and support the development, testing, or refinement of speech-processing technologies. Human reviewers may include employees, contractors, or third-party research partners who are subject to confidentiality obligations and who receive access only to the minimum amount of information necessary to perform their assigned tasks.

Recordings may be reviewed to identify corrupted files, evaluate audio clarity, remove background noise, flag rule violations, confirm the absence of prohibited content (such as personally identifiable information), and validate model outputs or annotations generated by automated systems. Reviewers may also generate structured labels, metadata, or annotations that form part of training or evaluation datasets used in machine-learning research.

Human review may occur even if a recording is later processed into de-identified or aggregated form, because certain quality-assurance workflows require access to the raw audio before downstream de-identification workflows are applied. All human review is conducted under strict internal protocols designed to minimize privacy risks, limit exposure of personal information, and ensure compliance with legal and ethical standards applicable to research data.

User Rights

Depending on your jurisdiction, you may have rights regarding the personal information collected about you through the Platform. These rights are subject to limitations, exemptions, and operational constraints, especially given the nature of the research, the use of de-identified datasets, and the technical impossibility of recalling data once incorporated into machine-learning models or publicly released datasets.

You may have the right to access certain information we maintain about you, request that we correct inaccuracies in your account or profile information, or request deletion of personal information that has not yet been de-identified, aggregated, or incorporated into downstream datasets. You may also have the right to request information about the categories of data collected, the purposes of processing, and the types of third parties with whom data is shared.

Please note that we cannot remove or modify recordings, features, transcripts, or datasets that have already been de-identified, released publicly, shared with research partners, or used to train machine-learning models, because these materials can no longer be linked back to an identifiable individual and the processing applied to them is often technically or practically irreversible.

To exercise your rights, you may contact us using the information provided at the end of this Policy. We may request additional information to verify your identity before responding to your request. We may deny or partially fulfill a request where permitted by law, including where fulfilling the request would interfere with research integrity, compromise security measures, or impose disproportionate burdens on the research process.

State-Specific Privacy Rights

Residents of certain U.S. states (such as California, Virginia, Colorado, Connecticut, Utah, Texas, Oregon, Montana, Delaware, Iowa, and Tennessee) may be entitled to additional privacy rights under their respective state privacy laws. These rights generally relate to accessing personal information, requesting deletion or correction of personal information, requesting information about data practices, opting out of certain forms of data sharing, and obtaining a portable copy of data. While the specific definitions and requirements vary by state, we will honor all applicable state rights to the extent they apply to the information we maintain in identifiable form. Because our Platform primarily processes conversational audio for research, de-identification, dataset development, and machine-learning training, many state privacy laws also contain explicit exemptions for research data, de-identified information, publicly available data, and information used solely for scientific or statistical purposes. As a result, certain rights may not apply to portions of the data we process, particularly after recordings have been de-identified, aggregated, or incorporated into derived outputs, datasets, or model-training workflows.

Depending on your state of residence, you may have the right to request access to certain categories of personal information we collect, request deletion of information you provided, request correction of inaccurate personal information, or obtain information about how we use and disclose certain categories of personal data. Some states provide rights to opt out of targeted advertising, certain types of data sharing, or "sales" of personal information; however, we do not engage in targeted advertising, automated profiling that produces legal or similarly significant effects, or the sale of personal information as defined by most state laws. We may, however, license or distribute de-identified datasets for research or commercial purposes, and such distribution is exempt from state opt-out requirements. You may also have the right to receive a portable copy of certain information in a structured, machine-readable format, although this right applies only to identifiable information and does not require us to reverse any de-identification processes or extract data from machine-learning systems.

Because of the nature of our Platform, certain rights cannot be honored once recordings have undergone technical transformations. After audio recordings have been de-identified, processed into embeddings or derived features, shared with research partners, incorporated into training datasets, or used to train machine-learning models, we cannot withdraw, modify, or delete that information. State privacy statutes generally recognize these limitations and exempt de-identified and research data from deletion, correction, and most opt-out requirements. These exemptions apply because de-identified information cannot reasonably be linked to an identifiable individual, and because reversing trained model outputs or recalling publicly shared research data is technically or practically impossible.

If you reside in a state that provides statutory privacy rights and wish to exercise one of those rights, you may submit a request using the contact information provided at the end of this Privacy Policy. We may need to verify your identity before processing the request, and we may deny a request where permitted by law, including where the request applies to data already de-identified, where fulfillment would compromise research integrity, where an exemption applies, or where the request is manifestly unfounded or excessive. If your state requires an appeal process and your request is denied, you may submit an appeal, and we will respond within the timeframe required by applicable law.

We will comply with all legally required response timelines, typically within 45 days with the possibility of lawful extensions. To ensure transparency, we note again that we are not required to re-identify data solely to fulfill a request, and we cannot remove or restrict de-identified information, publicly released datasets, or any materials already incorporated into trained models or research outputs. These limitations reflect both legal exemptions and the technical realities of how conversational AI datasets are developed and utilized.

Children's Data

The Platform is intended exclusively for adults and is not designed for use by children or individuals under the age of eighteen (18). We do not knowingly collect, solicit, or process personal information from minors, nor do we permit individuals under eighteen to participate in recorded conversations, create accounts, or submit any form of audio, demographic information, or survey data. If we become aware that information has been collected from a minor (whether through misrepresentation of age, unauthorized participation, or fraudulent account creation), we will take reasonable steps to delete that information from our systems to the extent technically possible and will terminate the associated account or session. Because certain portions of submitted data may be processed immediately, de-identified, aggregated, or incorporated into research workflows or machine-learning models, it may not be possible to fully remove information that has already undergone downstream transformation or has been included in datasets that cannot be linked back to a specific individual. We strongly encourage parents, guardians, and supervisors to monitor and prevent unauthorized access, and we reserve the right to implement additional age-verification or residency-verification measures as necessary to maintain compliance with applicable laws and protect the integrity of our research datasets.

No Sale of Personal Information / De-Identified Data Licensing

We do not sell personal information as that term is defined under applicable U.S. state privacy laws, including the California Consumer Privacy Act (CCPA/CPRA) and similar statutes in Virginia, Colorado, Connecticut, Utah, Texas, Oregon, Montana, and other jurisdictions. We also do not engage in targeted advertising, cross-context behavioral advertising, or automated decision-making that produces legal or similarly significant effects. We may, however, generate, use, share, publish, license, or distribute de-identified, aggregated, or anonymized data derived from submitted recordings, demographic information, or Platform interactions. Such data does not identify you and cannot reasonably be re-associated with you, and therefore is not considered personal information under most state privacy laws. De-identified datasets may be provided to research institutions, universities, developers, commercial partners, or the broader scientific community for the purpose of supporting advancements in speech processing, conversational AI, and related research fields.

The licensing or distribution of de-identified data is central to the mission of the Platform and is carried out in accordance with applicable legal standards and widely accepted research ethics. These materials may be used to develop and evaluate models, train machine-learning systems, validate research findings, or support broader scientific and commercial innovation. Once such data has been de-identified and shared, it cannot be withdrawn, limited, or altered, as it is no longer tied to an identifiable individual and may have been incorporated into derivative outputs, analyses, or published materials beyond the control of otoearth. By participating in the Platform, you acknowledge and agree that de-identified data may be used and distributed without restriction and without compensation to you.

Data De-Identification Practices

We employ a range of technical, administrative, and procedural measures to de-identify data collected through the Platform before such data is used in research, distributed to collaborators, or included in publicly accessible datasets. De-identification may involve removing direct identifiers, suppressing or transforming metadata, separating demographic information from raw recordings, applying anonymization techniques, or converting audio files into non-reversible machine-learning features such as embeddings or acoustic descriptors. Our goal is to ensure that the resulting datasets cannot reasonably be used to identify an individual participant, either alone or in combination with other information likely to be available to a third party.

Because conversational audio inherently contains characteristics such as voice timbre, accent, tone, or linguistic patterns, complete anonymity cannot be guaranteed. However, once recordings are transformed into de-identified forms or integrated into aggregated datasets or machine-learning workflows, the resulting outputs cannot feasibly be linked back to you. De-identified data may undergo further processing by research partners or authorized third parties for purposes such as annotation, augmentation, model evaluation, or data-quality assessment. We do not re-identify de-identified data and prohibit our partners from attempting to re-identify individuals from such datasets.

It is important to note that de-identification is often irreversible, especially once data is incorporated into trained models, research findings, or distributed datasets. As a result, requests to delete, modify, or withdraw data cannot apply to information that has already been de-identified or used in downstream processes. These limitations reflect both industry standards and legal exemptions that recognize the scientific and operational constraints associated with research-oriented data processing.

Law Enforcement Disclosures

We may disclose personal information, recordings, or other data collected through the Platform when required to do so by law, regulation, legal process, or governmental request. This may include responding to subpoenas, court orders, warrants, regulatory inquiries, or lawful requests from public authorities. We will evaluate each request to ensure that it is legally valid and will only provide the minimum amount of information necessary to comply with the applicable legal obligation. Where legally permitted, we may notify the affected user before producing information; however, we are not obligated to do so, and in some circumstances (such as when notification is prohibited by law or would compromise an investigation), we may be legally restricted from providing notice.

We may also disclose information if we believe in good faith that such disclosure is necessary to protect the rights, safety, or property of otoearth, our users, the public, or any third party; to prevent or investigate potential wrongdoing, fraud, abuse, or security incidents; or to enforce our Terms of Service, including actions involving misuse of the Platform or violations of legal requirements. These disclosures may involve identifiable information, but once data has been de-identified, aggregated, or incorporated into machine-learning models or research datasets, it is generally not possible to isolate or produce an individual's specific contribution. We do not voluntarily disclose de-identified research data unless legally required to do so or unless such data has already been publicly released.

Third-Party Links & Services

The Platform may include links to third-party websites, applications, tools, or services, as well as integrations with third-party technologies that support hosting, analytics, research collaboration, or data processing. These third parties operate independently from otoearth and maintain their own privacy practices, policies, and terms of use. We are not responsible for the content, security, or privacy practices of these third-party services, and your interactions with them are governed solely by their respective policies. Any personal information you choose to provide directly to a third-party service is not covered by this Privacy Policy, and we encourage you to review the applicable terms and privacy notices before engaging with those services.

Certain features of the Platform may rely on third-party tools such as cloud infrastructure providers, audio processing partners, analytics services, or collaborative research platforms. These third parties may access limited information as necessary to perform contracted services on our behalf, subject to confidentiality and data-protection obligations. While we take reasonable steps to ensure that third parties handle information in accordance with applicable laws and our own requirements, we cannot control or guarantee their independent conduct. The presence of a third-party link or integration does not imply endorsement, sponsorship, or affiliation, and otoearth is not responsible for any harm, loss, or data misuse arising from your interactions with third parties outside the scope of our controlled processing.

Changes to This Privacy Policy

We may update or modify this Privacy Policy from time to time to reflect changes in our data practices, the development of new features, evolving research activities, updates to applicable laws and regulations, or adjustments to our operational or organizational structure. When we make material changes, we will provide notice by updating the "Dated" date at the beginning of this Privacy Policy, posting the revised version on the Platform, or delivering additional notice as required by law, such as through email or in-application alerts where feasible.

Any changes to this Privacy Policy will take effect when they are posted unless a later effective date is expressly specified. Your continued use of the Platform following the posting of an updated Privacy Policy constitutes your acceptance of the revised terms. If you do not agree with any changes, you should discontinue use of the Platform and, where applicable, request deletion of your account. Updates to this Privacy Policy will not retroactively alter rights that have already vested with otoearth regarding previously submitted or de-identified data, including information incorporated into research datasets or machine-learning models, as such downstream uses are irreversible and exempt under applicable legal frameworks. We encourage you to review this Privacy Policy periodically to stay informed about how your information is handled.


Contact Information

If you have any questions about this Privacy Policy, please contact us at:

Email: consome@oto.earth