Data Handling & Governance
This page explains how Pairaphrase handles customer content across AI Sandbox and Translation Engine workflows, including what data is submitted to supported API providers, how it is protected, and what users should know about data training, ownership, encryption, retention, and AI-generated output review.
Pairaphrase uses business and enterprise API services that are designed to protect customer data. Based on current provider documentation, customer content submitted through these services is not used to train public models by default. Customer data remains private, but the exact data handling terms vary by provider, service, and configuration.
How Pairaphrase uses third-party providers
Pairaphrase uses third-party providers in two main workflows:
- AI Sandbox: Uses the ChatGPT API and Gemini API for prompt-based AI assistance.
- Translation Engines: Uses the ChatGPT API, Google Cloud Translation API, Microsoft Translator API, and DeepL API Pro for machine translation.
Some providers appear in more than one section because Pairaphrase uses them in more than one workflow. For example, ChatGPT is used in the AI Sandbox and is also available as a translation engine.
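For teams that keep a machine-readable inventory of where content is sent, the workflow-to-provider mapping above can be sketched as a small Python structure. This is an illustrative sketch only: the `WORKFLOW_PROVIDERS` name and the `providers_in_multiple_workflows` helper are hypothetical and not part of any Pairaphrase API. The check keys on the API name, so Gemini API and Google Cloud Translation API count as separate entries even though both come from Google.

```python
# Workflow-to-provider inventory, copied from the two bullets above.
# The variable and function names are hypothetical, chosen for illustration.
WORKFLOW_PROVIDERS: dict[str, set[str]] = {
    "AI Sandbox": {"ChatGPT API", "Gemini API"},
    "Translation Engines": {
        "ChatGPT API",
        "Google Cloud Translation API",
        "Microsoft Translator API",
        "DeepL API Pro",
    },
}


def providers_in_multiple_workflows(inventory: dict[str, set[str]]) -> set[str]:
    """Return the providers that are listed under more than one workflow."""
    seen: set[str] = set()
    repeated: set[str] = set()
    for providers in inventory.values():
        # Anything already seen in an earlier workflow is a repeat.
        repeated |= seen & providers
        seen |= providers
    return repeated


shared = providers_in_multiple_workflows(WORKFLOW_PROVIDERS)
print(sorted(shared))  # ['ChatGPT API']
```

A provider that shows up in the result is covered by more than one table in this document, so both sets of handling terms are worth reviewing for it.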
The tables below summarize data training, ownership, security, encryption, and retention practices by workflow and provider.
1. Pairaphrase AI Sandbox Data Handling
The Pairaphrase AI Sandbox allows users to submit prompts, instructions, text, or other content to supported AI APIs and receive AI-generated responses. This workflow is different from standard machine translation because users may submit broader instructions, questions, or context for AI-assisted work.
| Feature | ChatGPT API | Gemini API |
|---|---|---|
| Purpose in Pairaphrase | Used within the Pairaphrase AI Sandbox to support AI-assisted prompt and response workflows. | Used within the Pairaphrase AI Sandbox to support AI-assisted prompt and response workflows. |
| Data Submitted | User prompts, instructions, source text, reference content, and AI-generated responses submitted through the AI Sandbox. | User prompts, instructions, source text, reference content, files if supported, and AI-generated responses submitted through the AI Sandbox. |
| Data Training | Business/API data is not used to train models by default. API customers may have options to share data or feedback, depending on their settings and applicable terms. (OpenAI) | For paid Gemini API services, Google states that prompts, associated system instructions, cached content, files, and responses are not used to improve Google products. (Google AI for Developers) |
| Ownership & Usage | Customers retain rights to inputs and own outputs to the extent permitted by law. The provider receives rights needed to provide and maintain the service, comply with law, and enforce applicable terms. (OpenAI) | Data is processed under Google’s applicable terms and data processing terms for paid Gemini API services. (Google AI for Developers) |
| Security & Encryption | Data is encrypted at rest using AES-256 and in transit using TLS 1.2 or higher. ChatGPT API/business services also support enterprise-grade security controls and SOC 2 Type 2 compliance. (OpenAI) | Google states that paid Gemini API prompts and responses are processed under the applicable Google data processing terms. Security and encryption are governed by the applicable Google Cloud or Gemini API terms and configurations. (Google AI for Developers) |
| Data Retention | API inputs and outputs may be retained for up to 30 days, unless the customer has an applicable zero data retention or other eligible retention configuration. (OpenAI) | Retention depends on the applicable Gemini API service, paid service terms, data processing terms, and customer configuration. (Google AI for Developers) |
| User Responsibility | Users should avoid submitting content they are not authorized to process through AI-assisted tools and should validate AI-generated output before use. | Users should avoid submitting content they are not authorized to process through AI-assisted tools and should validate AI-generated output before use. |
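The Security & Encryption row above cites TLS 1.2 or higher for data in transit. For teams that call provider APIs from their own code, a minimal Python sketch of enforcing that floor on the client side is shown here. It uses only the standard library `ssl` module and does not describe how Pairaphrase or any provider configures its connections; the `build_tls_context` name is hypothetical.

```python
import ssl


def build_tls_context() -> ssl.SSLContext:
    """Return a client-side TLS context that refuses anything below TLS 1.2."""
    # create_default_context() turns on certificate and hostname verification.
    context = ssl.create_default_context()
    # Reject handshakes that would negotiate TLS 1.0 or TLS 1.1.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = build_tls_context()
```

The resulting context can be passed to standard library clients that accept one, for example `urllib.request.urlopen(url, context=context)`.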
2. Pairaphrase Translation Engine Data Handling
Pairaphrase translation engines are used to translate source text or documents into one or more target languages. This workflow is separate from the AI Sandbox because the primary purpose is machine translation, even when the selected translation engine is ChatGPT.
| Feature | OpenAI (ChatGPT API) | Google Cloud Translation | Microsoft Translator | DeepL |
|---|---|---|---|---|
| Data Training | By default, business data is not used for training unless the user explicitly opts in via feedback mechanisms. | Prompts and responses are not used to train models. Optional data sharing in Trusted Tester programs is for product improvement, not training. | Customer data (text/voice) submitted for translation is not used for training models. | Texts are not used to improve the quality of services. |
| Ownership & Usage | Customers retain all rights to inputs and own outputs to the extent permitted by law. OpenAI only receives rights necessary to provide services and comply with the law. | Data is handled in accordance with Google Cloud's terms of service and Cloud Data Processing Addendum (CDPA). | Translation or use for training does not alter the ownership of the intellectual property contained in the documents. | Texts and documents are deleted after contractually agreed services are performed. Personal data processing requires a Data Processing Agreement (DPA) and GDPR compliance. |
| Security & Encryption | Encrypts data at rest (AES-256) and in transit (TLS 1.2+). SOC 2 Type 2 compliant. Strict access controls and 24/7/365 security on-call rotation. | Data is encrypted in transit to the underlying model. Adheres to Google's AI/ML privacy commitment with robust data governance and default encryption at rest. | Meets strict security requirements. Certified for HIPAA, ISO, SOC, and FedRAMP; FedRAMP High in Azure public cloud and Azure Government. Data is not written to persistent storage. | Connections use state-of-the-art encryption. Suitable technical and organizational security measures are employed. |
| Data Retention | API data is securely retained for up to 30 days to identify abuse, then removed unless legally required. Zero Data Retention (ZDR) available for qualifying use-cases. | Text sent to the Cloud Translation API is held briefly in memory to perform the translation and return the results. | Text translation is not stored. Document translation is temporarily stored for processing and then permanently deleted. No customer data is persisted. | Texts and documents are deleted after the contractually agreed service is completed. Content is stored for the account if "saved translations" or "glossary" features are used. |
Sources:
- OpenAI: Trust Portal and Enterprise privacy
- Google: data governance, Cloud Data Processing Addendum, and Google Cloud Terms of Service
- DeepL: Privacy Policy, B2B Terms and Conditions, and Trust Center
- Microsoft: Confidentiality information
Note: Users should validate all AI output for accuracy before use. This summary is provided for informational purposes only and is based on publicly available third-party provider documentation as of the date of publication. Provider terms, data usage policies, security practices, and retention practices may change over time. Pairaphrase does not control third-party provider policies and cannot guarantee that this information will remain complete, current, or error-free. Customers should review the applicable provider terms before making compliance or contractual decisions.
